Sunday, January 21, 2024

Questioning AI Alignment

I strongly suspect that AI alignment can’t mean what a lot of us think it means, because the popular framing of the problem includes category errors. For example, here is the first sentence of OpenAI’s blog on their approach:

Our alignment research aims to make artificial general intelligence (AGI) aligned with human values and follow human intent.

This evokes, to me, the controversial idea of evolutionary group selection: that our evolutionary impulses direct us to improve species outcomes, rather than our own. Generally speaking, forces that act on individuals are much easier to characterize and have much greater predictive power than forces that act on groups. When one encounters a framing like this, where whole sets of individuals are treated as a category, think explicitly about the elements of that set, and see what happens. Let’s look again at that opening sentence, but this time highlighting cases where sets are treated as indivisible:

Our [1] alignment research aims to make artificial general intelligence (AGI) [2] aligned with human [3] values and follow human [3] intent.

1: “Our” alignment research. In this case, “our” presumably means the portion of OpenAI’s research division performing this class of research. In this case, using a set instead of individuals is very likely justified, as a blog like this is an official publication of the company, and though there are occasional exceptions, the individuals working for the company are expected to align with its stated goals.

2: “AGI”. This is implicit, but of course OpenAI can only be talking about AGI systems which OpenAI can influence. I’ll touch briefly on this, below.

3: “Human” values and “human” intents. This is the big one. For example, I have to wonder, is there such a thing as human universal “values” and “intents”? Maybe, but if you were to think of the values you hold closest to your heart, closest to your sense of personal integrity, you can probably also think of humans that don’t share those values. The question absolutely needs to be asked: which humans’ values and intents are being considered? What coherent subset of humanity can we plug in, here, that would allow readers to be comfortable with the result?

There’s an idea I first encountered in George Orwell’s 1945 essay, You and the Atom Bomb:

It is a commonplace that the history of civilisation is largely the history of weapons. … [T]hough I have no doubt exceptions can be brought forward, I think the following rule would be found generally true: that ages in which the dominant weapon is expensive or difficult to make will tend to be ages of despotism, whereas when the dominant weapon is cheap and simple, the common people have a chance.

So, perhaps the development of the atom bomb didn’t augur a new age of despotism, I don’t want to argue the point. It’s certainly the case that high capital requirements and difficult construction constitute serious barriers to entry, and thus tend to benefit incumbents and discourage disruptors. AI research has had tremendous disruption, largely due to its recent low barrier to entry. But now, to reach the threshold of AGI, it has cost hundreds of millions of dollars to train state-of-the-art AI models, a cost which is out of reach of all but a handful of entities.

Are these, then, tools to concentrate power? Is there a research program to ensure that Mark Zuckerberg will act in alignment with human interests and values? That anyone would? On the other hand, these costs decline, probably according to Wright’s Law; and perhaps more importantly, the cost to run these models once they have been trained is orders of magnitude less than the cost to train them in the first place. Do the common people have a chance?

I’ll pull another quote from Orwell’s essay:

Some months ago, when the bomb was still only a rumour, there was a widespread belief that splitting the atom was merely a problem for the physicists, and that when they had solved it a new and devastating weapon would be within reach of almost everybody. (At any moment, so the rumour went, some lonely lunatic in a laboratory might blow civilisation to smithereens, as easily as touching off a firework.)

If AGI can be as powerful as it appears to be, would we want to live in a world where it’s controlled by a few, who put limits on how others are allowed to access the fruits of these inventions? Or is it better to live in a world where excessive intellectual power is available to essentially anyone? If AGI remains extremely expensive to build and operate, then we’ll probably end up in the former. If we make it cheap, over time, to develop and operate AGIs, then the consequences could be wondrous and catastrophic. (There’s more to say, here, but I’m trying to keep this post focused.)

Coming back to a point from before: OpenAI’s research program will only affect those AI models that OpenAI can itself influence. Their famous call for government regulation of their own business can be seen as an attempt to extend their influence over other AI developers. Which could have the consequence that others are inhibited from doing AI research, furthering OpenAI’s position in the market.

I am reasonably confident that the AI genie is out of the bottle, and that AGI is somewhere between “already here” and “inevitable in the near term”. Those who are explicitly researching AI alignment presumably have or are working towards their preferred operational definitions. The rest of us should be critical and skeptical of definitions coming from stakeholding actors.

Thursday, August 4, 2022

How does Rust async work?

This may be obvious to some folks, but it’s not been clear to me: how does Rust’s async / await facility actually work, down at the machine-code level? Why does the stack look the way it does when I backtrace? How will APIs like alloca interact with async code? This always nagged at me, so I’ve dug at it, and wrote this up to benefit those of you who’ve had similar questions. Going right in:

Calling `async` code

During compilation, an async function emits three identifiers to the LLVM layer:

A data structure defining the data operated on by the asynchronous code;
A private function for the asynchronous function body;
A public (within Rust’s code visibility model) function, that returns the initialized data for the Future. This is the function that gets directly invoked by the compiler.

The first and second items above form a closure, and are implemented using a (currently experimental) coroutine facility. If you consider this example code:

async fn foo() {}

Then the control flow for running foo’s asynchronous code with the world’s second-dumbest async runtime (which just spins on a single future in a loop until it’s ready)¹ will look similar to this:

struct FooGenerator { /* not shown */ }
fn foo() -> impl Future<Output = ()> {
    std::future::from_generator(
      std::future::from_generator(FooGenerator {}))
}
fn run_foo() {
    let mut cx = _; // shown later.
    // Obtain `foo`'s state structure as a "generator".
    let foo_generator = foo();
    let foo_result = loop {
        match unsafe { Pin::new_unchecked(&mut foo_future) }.poll(&mut cx) {
            Poll::Pending => (),
            Poll::Ready(val) => break val,
        }
    }
}

As I write this, the closure is a type of coroutine called a “generator”, which is translated to a future via the from_generator function. For an async function, the generator “yields” a meaningless value of () until the future is ready (i.e. until the async body has run to completion, and returned a value). A yield () indicates that the coroutine can’t make forward progress at present: it is how async functions implement blocking. (This allows the current thread to do something else – or even pause – until it’s time to poll the future again.) When you see a let y = x.await in the body of an async function, think of that as expanding into something like:

    let y = loop {
        match unsafe { Pin::new_unchecked(&mut x) }.poll(&mut cx) {
            Poll::Pending => {
                // Give up the CPU to our caller.
                yield ();
                // When the generator is resumed, it resumes here.
            }
            Poll::Ready(val) => break val,
        }
    };

Async Data

The data operated on by the generator is a Rust enum, in which each variant corresponds to a coroutine “yield”. For example, you can imagine that this code:

async fn bar(var1: usize) -> usize {
   let var2: &usize = &var1;
   let var3 = baz().await;
   *var2 + var3
}

You can imagine the data being stored as a Rust enum, something like this:

enum BarGenerator<'a> {
    Variant0 { var1: usize },
    Variant1 {
        var1: usize,
        var2: &'a usize,
        await_point: GenFuture<BazGenerator>, // BazGenerator not shown
    },
    Variant2 {
        var1: usize,
        var2: &'a usize,
        var3: usize,
    }
}

Generator `yield` expressions

I characterized the expansion of let val = future.await; above as replacing the await statement with a loop that polls the future, and issues a yield (); statement while the future is Pending. If we think of this function again:

async fn bar(var1: usize) -> usize {
   let var2: &usize = &var1;
   let var3 = baz().await;
   *var2 + var3
}

We now have the tools we need to create a functionally equivalent version in async-free Rust:

enum BarGenerator<'a> {
    // shown above
}
unsafe fn bar_closure(vars: Pin<&mut BarGenerator>, cx: &mut Context<'_>) ->
    GeneratorState<() /* yield type */, usize /* result type */>
{
    loop {
        match vars {
            Variant1 { ref var1 } => {
                let var2 = &var1;
                let await_point = baz();
                *vars = Variant2 { var1, var2, await_point };
            }
            Variant2 { ref var1, ref var2, ref await_point } => {
                match Pin::new_unchecked(&mut await_point).poll(&mut cx) {
                    Poll::Pending => return GeneratorState::Yielded(()),
                    Poll::Ready(var3) => {
                        *vars = Variant3 { var1, var2, var3 };
                    }
                }
            }
            Variant3 { ref var1, ref var2, ref var3 } => {
                return GeneratorState::Complete(*var2 + var3);
            }
        }
    }
}

fn bar(var1: usize) -> impl Future<Output = usize> {
    std::future::GenFuture::from_generator(
        BarGenerator::Variant0 { var1 })
}

(This isn’t exactly how async compilation works, but as I write this, it doesn’t seem too far off, and my goal here is more to give an intuition for how this works than to be perfectly accurate. I haven’t tried to compile / run the above.)

Commentary

Some of my exposure to async / await in garbage-collected languages mis-prepared me for async / await in Rust. In garbage-collected languages, allocating the data for a closure is a straightforward heap allocation, and garbage collection allows the closure data to be allocated only when the async function is called. In Rust, the storage for the entire async call graph is allocated before invoking the async code; the generator state will then live at a fixed location on the stack, beneath the top of the stack, so that invoking the async code uses additional stack to manage the call graph. (This is why async code cannot recurse without boxing: the compiler needs to be able to statically determine the maximum stack size for async code execution.)

In managed languages, a yield expression can record the execution context well enough that the resume operation doesn’t need to wind up the stack again, but can just start executing at the appropriate point. In Rust’s current async / await implementation, the call graph between the executor and the future being polled is walked again until the .await that caused the async function to block is reached. This means that .await becomes more expensive the deeper your async call graph nests.

Exposure to other system languages also made me wonder about alloca, and we can see now that Rust’s async model is fundamentally incompatible with alloca: you cannot .await in code adjacent to alloca-allocated variables, as the alloca-allocated memory will be torn down and probably corrupted as soon as the generator closure issues a yield.

I now feel I have a decent idea how async / await works in Rust, I hope this was helpful to readers, as well.

The world’s dumbest async runtime just ignores the future.↩︎

Saturday, December 11, 2021

Less Painful Linear Types

I’ve advocated for Linear Types in Rust for some time, most recently, arguing that it can resolve an important foot-gun for accidental future cancellation. Beyond that, I think these can be a useful tool for many other purposes, a few of which I mention in the RFC linked above. Rust caters to those who aim to write industrial-strength low-level software… Sometimes, in this area, resource finalization requires more care than simply letting a variable fall out of scope. Linear types address these cases.

That said, linear types can also cause some pain:

Linear types are naturally viral: once a field in a structure (or variant in an enum) is linear, the containing structure will, itself, become linear.
Linear types want API changes to generic containers. These will be annoying (at best) for the ecosystem to incorporate.
Linear types don’t interact well with panic-based unwinding.
Linear types don’t interact well with ? syntax.

Whether linear types are worth it for Rust depends on whether the benefits outweigh the costs. The perception in Rust seems to have been that the costs of linear types are too high. Here, I’d like to suggest that they can be lower than we thought.

Design Approach

This linear types proposal aims to address the pain points above head-on:

We provide an escape hatch, that allows structures that contain linear types to become affine again.
We define lints and automated code refactoring tools to make updating generic container types to support linear types less painful.
We only apply linear-types constraints to linear control flow. Drop can still be invoked during panic-based unwinding.
We define library facilities, to make ? interaction easier.

Proposal

ScopeDrop OIBIT

We define a new unsafe marker trait, called ScopeDrop. This is as it sounds: when a type implements ScopeDrop, then variables of the type are allowed to be cleaned up when they fall out of scope. (We do not affect panic-based clean-up with this trait: even if your type does not implement ScopeDrop, drop-glue can still be created to support panic-based unwinding.) Much like Send and Sync, this trait is auto-derived: if all fields of a structure or variants of an enum implement ScopeDrop, then the structure itself implements ScopeDrop.

We define a single marker type, PhantomLinear, which does not implement ScopeDrop. A user makes her type linear by including a PhantomLinear field, and this now virally infects all containers that might include her type.

So this code:

#[derive(Default)]
struct MyLinear {
    _linear_marker: std::marker::PhantomLinear,
    data: (),
}
fn oops() {
    let _oops: MyLinear = Default::default();
    // ^^^ generates a compilation failure, `_oops` is not allowed
    // to simply fall out of scope.
}

One is allowed to unsafe impl ScopeDrop on a type that would otherwise be linear. With the following code, the example above would compile successfully (though why you would want to write code like this is unclear to me):

unsafe impl ScopeDrop for MyLinear {}

It can make sense to impl Drop for a !ScopeDrop type. Generally, though, the only way that code would be invoked would be by unwinding the stack.

Affine Escape Hatch

Linear types are naturally viral, and limit available API surface area (that is, APIs that assume types are affine cannot work with variables of linear type, see here for details), so there’s a risk that a crate author will label a type as linear, in a way that makes it difficult for external users to consume the type. In this proposal, we define an escape hatch, that allows variables that do not implement ScopeDrop to be safely wrapped by variables that do. We can do this with a trait and a generic type:

// wrap a !ScopeDrop type into a ScopeDrop container.
struct ReScopeDrop<T: Consume>(ManuallyDrop<T>);
unsafe impl<T: Consume> ScopeDrop for ReScopeDrop<T> {}

impl<T: Consume> ReScopeDrop {
    pub fn new(value: T) -> Self {
        ReScopeDrop(value)
    }
    // private function, to support `Drop` implementation.
    unsafe fn take(&mut self) -> T {
        ManuallyDrop::<T>::take(&mut self.0)
    }
    pub fn into_inner(self) -> T {
        self.0
    }
}
// not shown: AsRef, AsMut, etc. for `ReScopeDrop`.

trait Consume {
    fn consume(self);
}
impl<T: Consume> Drop for ReScopeDrop<T>
{
    fn drop(&mut self) {
        unsafe { self.take() }.consume()
    }
}

With these additions, one can wrap an externally-defined !ScopeDrop type in such a way that ScopeDrop works again:

// externally defined type is linear.
struct ExternalLinear {
    _linear: PhantomLinear,
}
impl ExternalLinear {
    pub fn clean_up(self) {
        let ExternalLinear { _linear } = self;
        forget(_linear);
    }
}

// internally-defined type is affine.
struct AffineWrapperImpl(ExternalLinear);
type AffineWrapper = ReScopeDrop<AffineWrapperImpl>;

impl Consume for AffineWrapperImpl {
    fn consume(self) {
        self.0.clean_up();
    }
}

It’s a bit wordy to declare the wrapper type, which is unfortunate, but once this is done, it’s basically as easy to work with AffineWrapperImpl variables as it would be for any other affine variable. We have a reasonable mitigation for the viral aspect of linear types.

Early Return Helpers

Rust’s ? facility relies on early return, which facility – by intention – doesn’t interact well with linear types: any variable of linear type introduced before a ? will require some mechanism to allow early return, while not violating the type’s contract. I think we can handle these cases reasonably ergonomically by defining early return helpers:

struct CleanupImpl<F, T>
where:
    F: FnMut(T) -> ()
{
    var: T,
    cleanup: F,
}
// where clauses not repeated
impl<F, T> CleanupImpl<F, T> {
    fn new(var: T, cleanup: F) -> Self {
        CleanupImpl { var, cleanup }
    }
}
impl<F, T> Consume for CleanupImpl<F, T>
{
    fn consume(self) {
        let { var, cleanup } = self;
        cleanup(var);
    }
}
type Cleanup<F, T> = ReScopeDrop<CleanupImpl<F, T>>;

and maybe a macro to make constructing such a variable easy:

    let myvar = MyLinearType::new();
    let myvar = cleanup!(linear, move |linear| {
        // invoke linear-type specific clean-up function.
        myvar.cleanup();
    });

Updating the ecosystem

Linear types would be a large change to the language, with large implications for the standard library, and for large parts of the ecosystem. Making this as easy as possible, and as foolproof as possible is hugely important… I’d like to hear your thoughts about everything in this proposal, but especially this part.

To assist in ecosystem updates, we can define compiler or clippy lints to detect when code already satisfies linear type constraints, but is not written to understand linear types. Applying this lint to, say, the Option type would result in warnings on functions like map:

impl<T> Option<T> {
    fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Option<U> {
        match self {
            Some(x) => Some(f(x)),
            None => None,
        }
    }
}

The lint detects that self is consumed, but the !ScopeDrop field won’t have drop glue generated in this function, so the function is linear-safe. The warning can be addressed by relaxing the T type to be ?ScopeDrop:

impl<T: ?ScopeDrop> Option<T> {
    fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Option<U> ...
}

Or, perhaps we can add other syntax to make this less disruptive to the code structure:

impl<T> Option<T> {
    impl<T: ?ScopeDrop> fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Option<U> {
        match self {
            Some(x) => Some(f(x)),
            None => None,
        }
    }
}

For this to be broadly consumed by the Rust community, we can make a tool (call it cargo-linearize) to automatically perform this refactoring.

Discussion

This proposal tries to make linear types support palatable to Rust in at least the following ways:

We emphasize an affine escape hatch.
We “punt” on the panic-unwind question.
We define a mechanism to simplify updating generic types to be linear-aware.

These changes should have the effect of making linear types easier to integrate into existing Rust code-bases. They do not eliminate the pain of linear types in Rust. On the other hand, to me, trying to eliminate the pain of linear types is like trying to eliminate the pain of the borrow-checker: if code correctness depends on the fact that resources can’t just be unintelligently dropped, then sometimes you’d rather have pain (compilation errors that can be annoying to placate) when you try to drop such a resource, than incorrect behavior.

When would such a facility be useful? Well, we’re talking about something like linear types quite a bit for async Rust, but I proposed something like this feature prior to Rust 1.0, well before async Rust was a thing. When I was an embedded systems developer, I used something akin to a “drop bomb” (i.e., a Drop implementation that causes a panic) to make sure that our resource ownership semantics were honored – these enforce linear type constraints at runtime, I would have preferred they be enforced at compile time. Browsing the set of issues that have linked to that postponed issue, others have regularly come reached a similar place. This is evidence that Rust’s core audience is interested in this feature. If we can keep the pain of linear types small enough, and this generally addresses important use cases, then perhaps it’s time to look seriously at linear types again?

Wednesday, December 1, 2021

Linear Types Can Help

There’s been a lot of discussion, recently, about how to improve async behavior in Rust. I can’t pretend to have internalized the entire discussion, but I will say that Linear Types feels like it should resolve several known foot-guns when attempting to support async Rust, while also being a general improvement to the language.

Bottom-line up front: Carl Lerche’s example (from https://carllerche.com/2021/06/17/six-ways-to-make-async-rust-easier/) of a surprisingly buggy async application looked like this:

async fn parse_line(socket: &TcpStream) -> Result<String, Error> {
    let len = socket.read_u32().await?;
    let mut line = vec![0; len];
    socket.read_exact(&mut line).await?;
    let line = str::from_utf8(line)?;
    Ok(line)
}

With linear types support, we could change async to async(!ScopeDrop) to make implicit Drop a compile-time failure, avoiding the bug; or perhaps a compiler flag could be used for the crate, to make futures !ScopeDrop by default, so that the exact same code could be run, without introducing an “accidental cancellation” foot-gun.

I’m planning to write three blogs about this, of which this is the first, where I try to indicate how this might work by going through the same examples from Carl Lerche’s great blog post on the subject, using an alternative linear-types approach for the solution. Next time, I’ll talk through the proposal; and then at the end, try to address expected objections. I’ve wanted some version of linear types in Rust for years; now with asynchronous Rust, they seem potentially more relevant than ever.

Does this meet requirements?

`select!`

Well, a linear-types future can’t be used with select!, since select! is defined to implicitly drop futures other than the first to complete. The language would push you to use a task, just as in Carl’s example.

`AsyncDrop`

Per Carl’s example, AsyncDrop is difficult in today’s Rust, because there isn’t an explicit .await point. I’d suggest a different AsyncCleanup trait, that reasonably works with linear types, to support behavior something like Python’s context managers:

trait AsyncDrop {
    async fn drop(&mut self);
}
fn with<T, F, O>(object: T, continuation: F) -> O
where
    T: AsyncDrop,
    F: Fn(&mut T) -> O
{
    let retval = continuation(&object);
    object.drop().await;
    retval
}

Then the bug that Carl pointed out here:

my_tcp_stream.read(&mut buf).await?;
async_drop(my_tcp_stream).await;

would be prevented at compile-time: the .await? on the first line would trigger a compilation failure. One would avoid this by using the context-manager approach:

with(my_tcp_stream, |my_tcp_stream| {
    my_tcp_stream.read(&mut buf).await?;
})?;

Get rid of `.await`

Not necessary with this change. I lean against removing .await, personally: my bias for systems languages is that I want to be able to predict the shape of the machine’s behavior from reading the source code, and getting rid of .await seems likely to make that harder, but I don’t really want to think about that further, others have other biases that are valid for them. More to my point here: linear types encourage a smaller change to the existing ecosystem than guaranteed-completion futures do.

Scoped tasks

Supported (I think) by having task::scope() return a linear type. On the other hand, I’m not yet comfortable with how executor runtimes handle unwinding from panics in unsafe code, so I’m likely missing something important.

Abort safety

I’m not sure this is desirable, at least at first. The #[abort_safe] lint introduces another “what color is your function” problem to the language. That said, if we did want this, we could define another trait, FutureAbort, as below:

trait FutureAbort {
    async fn abort(self);
}
impl<T: ScopeDrop> FutureAbort for T {
    // dropping a ScopeDrop future aborts it.
    async fn abort(self) {}
}

And revise items such as select! to abort() all un-completed futures. This can be made to prevent abort-safe functions from calling non-abort-safe functions relatively easily:

// because the returned future isn't ScopeDrop, it won't
// be abort_safe by default.
async(!ScopeDrop) fn foo { /* not shown */ }
#[abort_safe]
async fn bar() {
    foo().await // compiler error: abort_safe futures cannot await
                // non-abort_safe futures.
}

The default behavior will still be abort-safe, but users are allowed to opt-in to behavior where abort safety isn’t wanted.

I think this covers the bulk of Carl’s uses, and therefore suggests . Linear types are not really an async-Rust feature, but they do (in my opinion) apply nicely, here. The shape of my current thinking about how these can work is more-or-less inferrable from the above, but I wanted to keep this post relatively short, so I’ll save the actual proposal for next time.

Thanks for reading!

Friday, March 31, 2017

Documenting With Types

I've said this before: elegant code is pedagogical. That is, elegant code is designed to teach its readers about the concepts and relationships in the problem domain that the code addresses, with as little noise as possible. I think types are a fundamental tool for teaching readers about the domain that code addresses, but heavy use of types tends to introduce noise into the code base. Still, types are often misunderstood, and, as such, tend to be under-utilized, or otherwise misused.

Types are means of describing to the compiler what operations are valid against data. Different programming languages will have their own concept of data type and their own means of defining them. However, the type system in every language will have these two responsibilities: to allow the programmer to define what operations are valid against typed data, and to indicate failure when the source code contains a prohibited operation. (Of course, type definitions in most languages do more than this, such as also defining memory organization for typed data. These details are critical for translating source code into an executable form, but are not critical to the type-checker.)

The strength of a programming language's type system is related to how much it helps or hinders your efforts to discourage client error. It's my general thesis that the most important kind of help is when your types correspond to distinct domain concepts, simultaneously teaching your users about these concepts and their possible interactions, and discouraging the creation of nonsensical constructs. But this doesn't usually come for free. I won't go further into this here, but some hindrances are:

Error messages that don't help users understand their mistakes.
Excessively verbose type signatures.
Difficulty of representing correctness constraints.
More abstraction than your users will want to process.

Your types should pay for themselves, by some combination of keeping these costs low, of preventing catastrophic failure, or of being an effective teacher of your domain concepts.¹

How does one document code with types? I have a few guidelines, but this blog will run long if I go into all of them here, so I'll start with one:

Primitive types are rarely domain types.

Consider integers. A domain concept may be stored in an integer, but this integer will usually have some unit bound to it, like "10 seconds", or "148 bytes". It almost certainly does not make domain sense to treat a value you got back from time(2) like a value you got back from ftell(3), so these could be different types, and the type could be used to prevent misuse. Depending on language, I might even do so, but consider the options:

In C, you can use typedef to create a type alias, as POSIX does. This serves to document that different integers may be used differently, but does not actually prevent mis-use:

#include <inttypes.h>
typedef uint64_t my_offset_t;
typedef int64_t my_time_t;

void ex1() {
  my_offset_t a = 0;
  /* variable misuse, but not an error in C's type system. */
  my_time_t b = a;
}

You could use a struct to create a new type, but these are awkward to work with:

#include <inttypes.h>
#include <stdio.h>
typedef struct { uint64_t val; } my_offset_t;
typedef struct { int64_t val; } my_time_t;

void my_func() {
  my_offset_t offset_a = { .val=0 };
  my_offset_t offset_b = { .val=1 };
  my_time_t time_c = { .val=2 };

  /*
   * variable misuse is a compile-time error:
   *   time_c = offset_a;
   * does not compile.
   */

  /*
   * cannot directly use integer operations:
   *   if (offset_b > offset_a) { printf("offset_b > offset_a\n"); }
   * does not compile, but can use:
   */

  if (offset_b.val > offset_a.val) { printf("offset_b.val > offset_a.val\n"); }

  /*
   * but the technique of reaching directly into the structure to use
   * integer operations also allows:
   */

  if (time_c.val > offset_a.val) { printf("BAD\n"); }

  /*
   * which is a domain type error, but not a language type error
   * (though it may generate a warning for a signed / unsigned
   * comparison). One could define a suite of functions against the
   * new types, such as:
   *   int64_t compare_offsets(restrict my_offset_t *a, restrict my_offset_t *b)
   *   {
   *     return (int64_t) a->val - (int64_t) b->val;
   *   }
   * and then one could use the more type-safe code:
   *   if (compare_offsets(&offset_a, &offset_b) > 0) {
   *     printf("GOOD\n");
   *   }
   * but, in no particular order: this isn't idiomatic, so it's more
   * confusing to new maintainers; even once you're used to the new
   * functions, it's not as readable as idiomatic code; depending on
   * optimization and inlining, it's plausibly less efficient than
   * idiomatic code; and it is awkward and annoying to define new
   * functions to replace the built-in integer operations we'd like
   * to use.
   */
}

As far as I can tell, C does not provide ergonomic options for using the type system to prevent integer type confusion. That said, the likelihood of user error in this example (misusing a time as a file size, or vice versa) is pretty low, so I would probably make the same choice that POSIX did in this circumstance, and just use type aliases to document that the types are different, and give up on actually preventing mis-use.

On the other hand, at Circonus we maintain a time-series database that must deal with time at multiple resolutions, each represented as 64-bit integers: Aggregate data storage uses units of seconds-since-unix-epoch, while high-resolution data storage uses units of milliseconds-since-unix-epoch. In this case, the likelihood of user error working with these different views of time is very high (we have a number of places where we need to convert between these views, and have even needed to change some code from using time-in-seconds to using time-in-milliseconds). Furthermore, making mistakes would probably result in presenting the wrong data to the user (not something you want in a database), or possibly worse.

If we were strictly using C, I would probably want to follow the approach Joel Spolsky outlined here, and use a form of Hungarian notation to represent the different views of time. As it happens, we're using C++ in this part of the code base, so we can use the type system to enforce correctness. We have an internal proscription against using the STL (to keep our deployed code relatively trace-able with, say, dtrace), so std::chrono is out. But we can define our own types for working with these different views of time. We start by creating our own [strong_typedef][strong_typedef] facility (no, we don't use BOOST, either):

#define ALWAYS_INLINE __attribute__((always_inline))

// bare-bones newtype facility, intended to wrap primitive types (like
// `int` or `char *`), imposes no run-time overhead.
template <typename oldtype, typename uniquify>
  class primitive_newtype
{
public:
  typedef oldtype oldtype_t;
  typedef primitive_newtype<oldtype, uniquify> self_t;

  primitive_newtype(oldtype val) : m_val(val) {}
  ALWAYS_INLINE oldtype_t to_oldtype() { return m_val; }
private:
  oldtype m_val;
};

With this facility, we can define domain types that are incompatible, but which share a representation, and which should impose no overhead over using the primitive types:

class _uniquify_s_t;
typedef primitive_newtype<int64_t, _uniquify_s_t> seconds_t;
class _uniquify_ms_t;
typedef primitive_newtype<int64_t, _uniquify_ms_t> milliseconds_t;

Or, better, since time types all share similar operations, we can define the types together with their operations, and also split up the types for "time point" from "time duration", while enforcing a constraint that you can't add two time points together:

template <class uniquify>
  class my_time_t
{
private:
  class _uniquify_point_t;
  class _uniquify_diff_t;
public:
  typedef primitive_newtype<int64_t, _uniquify_point_t> point_t;
  typedef primitive_newtype<int64_t, _uniquify_diff_t> diff_t;
  static ALWAYS_INLINE point_t add(point_t a, diff_t b)
  {
    return point_t(a.to_oldtype() + b.to_oldtype());
  }
  static ALWAYS_INLINE diff_t diff(point_t a, point_t b)
  {
    return diff_t(a.to_oldtype() - b.to_oldtype());
  }
  static ALWAYS_INLINE point_t diff(point_t a, diff_t b)
  {
    return point_t(a.to_oldtype() - b.to_oldtype());
  }
  static ALWAYS_INLINE diff_t diff(diff_t a, diff_t b)
  {
    return diff_t(a.to_oldtype() - b.to_oldtype());
  }
  static ALWAYS_INLINE diff_t add(diff_t a, diff_t b)
  {
    return diff_t(a.to_oldtype() + b.to_oldtype());
  }
};

class _millisecond_uniquify_t;
typedef my_time_t<_millisecond_uniquify_t> my_millisecond_t;
class _second_uniquify_t;
typedef my_time_t<_second_uniquify_t> my_second_t;

This is just the primitive basis of our time-management types, and (to help the example fit in a blog post, and because I write the blog to a different audience than I write production code) is implemented a little differently than what we actually have in our code base. With these new types, we can perform basic operations with time in seconds or milliseconds units, while preventing incorrect mixing of types: an attempt to take a difference between a time-point based in seconds and a time-point based in milliseconds, for example, will result in a compilation error. Using these facilities made translating one of our HTTP endpoints from operating against seconds to operating against milliseconds into an entirely mechanical process of converting one code location to use the new types, starting a build, getting a compilation error from a seconds / milliseconds mismatch, changing that location to use the new types, and repeating. This process was much less likely to result in errors than it would have been had we been using bare int64_t's everywhere, relying on code audit to try and ensure that everything that worked with the old units was correctly updated to use the new. These types are more annoying to work with than bare integers, but using them helped avoid introducing a very likely and very significant system problem under maintenance. The types paid for themselves.

(Thanks to Riley Berton for reviewing this post. A form of this post also appeared on the Circonus blog, here.)

C++ is an interesting case of weighing costs and benefits, in that, while the benefits of using advanced C++ type facilities can be very high (bugs in C++ code can be catastrophic, and many of C++'s advanced facilities impose no runtime overhead), the maintenance costs can also be extremely high, especially when using advanced type facilities. I've seen thousands of characters of errors output due to a missing const in a type signature. This can be, ahem, intimidating.↩

Wednesday, December 7, 2016

Correctness and Maintainability

In my career, I've focused a lot on making my code bug-resistant, and I've spent a lot of time trying to learn or invent techniques that would systematically prevent the types of bugs I would run into the most frequently. I can think of three basic types of approach I've taken, which can be placed on a spectrum from more formal to less:

Exploiting type systems;
Defining code-based tests;
Simplifying reviewability or debugability.

The three approaches can be characterized based on what resources we use to we check that the code is correct. You can:

Use the compiler to verify correctness (via type-checking);
Use code to verify correctness (via testing assertions);
Use people to verify correctness (via code review or debugging).

And then you might observe how relatively likely it is that the verification missed a program bug:

The compiler will never pass a program that contains a type-checking error, though it's possible that the type-system is unsound (in which case a bug the type-system is supposed to prevent may still occur). In mature languages, such issues are either rare or well understood, so we can generally say that type-checking provides rigorous proof of program correctness -- inasmuch as program correctness is assessable by the type-system, but more on this later.
Testing will always detect code that fails the tests, but to be robust, tests must cover the entire interesting semantic range of the Unit Under Test, and this range can be extremely difficult to characterize. Designing for testability is largely about reducing the semantic range of our program units (such that meaningful tests can be written more easily), but even given that, it is difficult to determine the full semantic range of a system unit, and tests will never be absolute. Moreover, even if full semantic coverage were possible to determine, there will still be parts of all "interesting" systems where code-based tests are too difficult to write, too expensive to run, or where correctness is too difficult to determine (or otherwise too uncertain), to make testing worthwhile.
And, finally, when it comes to review by people, I'm not sure anything can be said with certainty about what bugs might be missed. Usability testing may require unvarnished input from neophytes, your professor may use rigorous methods to prove your program correct, you should try to know your audience.

And, by implication from the above, one can look at those three types of approach and determine how likely it is that the approach will prevent bugs during code maintenance:

The type system systematically prevents the introduction of type errors into the program.
Tests prevent bug regressions during maintenance, and following a test-oriented implementation methodology can strongly discourage the introduction of bugs into the system.
Techniques for improving readability weakly discourage the introduction of bugs into the system.

So, we should prefer to use the type system, and failing that use tests, and failing that just try to make our code clearer, right? Not exactly.

First of all, these three approaches are not exclusive: one can make advanced use of type systems, and write test code, and improve human reviewability, all at once. Indeed, using the type system, or writing unit tests can improve a human reader's ability to build a mental model of the system under analysis, improving solution elegance.

Secondly, in my experience, the human maintainer is always the most important consideration in software design. If your tests or your data types aren't maintainable, then they are technical debt of one form or another.¹

And finally, different people will have different capabilities and preferences, and more robust approaches may be less maintainable in your developer community. OK, so maybe (and I'm not sure that this is true, but it seems plausible) Idris can codify the invariants required by a Red-Black Tree at the type-level, so that compilation would fail if the implementation contains bugs. This sounds really cool, and it makes me want to learn Idris, but Idris's type system is simply more difficult to learn than C#'s, or even Haskell's. If you stick to the "safe" subset of Rust, your code should not have use-after-free bugs or buffer overruns, but people that learn Rust invariably describe their early efforts as, essentially, fighting the borrow-checker. Even writing code for testability is an acquired skill, which takes a significant investment to learn, and your market window may not afford you the time to invest.

This is to say: for the most part, trying to improve correctness will also result in a more maintainable system. But not always. And the more advanced techniques you use in exploiting the type system, or even in designing your tests, the fewer developers will be able to work with your code. In the extreme case, you may get code that is proven correct by the compiler, but that is unmaintainable by humans. Know your audience, and be judicious in the techniques you apply.

I use the term "technical debt" in the sense of something that will require extra work, later, to maintain. By this definition, paying down technical debt does not necessarily require code changes: the payment may be the time and effort taken training developers to work with the code in its existing form.↩

Wednesday, December 10, 2014

Bootstrapping Rust

There are two stories I particularly like about the problem of bootstrapping self-referential tools. This one:

Another way to show that LISP was neater than Turing machines was to write a universal LISP function and show that it is briefer and more comprehensible than the description of a universal Turing machine. This was the LISP function eval[e,a], which computes the value of a LISP expression e - the second argument a being a list of assignments of values to variables. (a is needed to make the recursion work). Writing eval required inventing a notation representing LISP functions as LISP data, and such a notation was devised for the purposes of the paper with no thought that it would be used to express LISP programs in practice. Logical completeness required that the notation used to express functions used as functional arguments be extended to provide for recursive functions, and the LABEL notation was invented by Nathaniel Rochester for that purpose. D.M.R. Park pointed out that LABEL was logically unnecessary since the result could be achieved using only LAMBDA - by a construction analogous to Church's Y-operator, albeit in a more complicated way.

S.R. Russell noticed that eval could serve as an interpreter for LISP, promptly hand coded it, and we now had a programming language with an interpreter.

And this one:

Then, one day, a student who had been left to sweep up after a particularly unsuccessful party found himself reasoning in this way: If, he thought to himself, such a machine is a virtual impossibility, it must have finite improbability. So all I have to do in order to make one is to work out how exactly improbable it is, feed that figure into the finite improbability generator, give it a fresh cup of really hot tea... and turn it on!

Rust is, I think, the programming language I've been waiting for: it takes a large proportion of the ideas I like from functional languages, applies them in a systems programming language context with 0-overhead abstractions, can run without a supporting runtime, and makes "ownership" and "lifetime" into first-class concepts. In my work trying to build highly complex and robust embedded systems, these are exactly the features I've been waiting for, and I'm thrilled to see a modern programming language take this application space seriously.

Building the language on a second-class platform, though, was intensely frustrating. The frustration came from several places, including that my Unix systems-administration skills have gotten (*ahem*) rusty, but I'd say the bulk of my difficulty came from a single factor: circular dependencies in the tooling.

Building the Rust compiler requires a working Rust compiler.
Building the Cargo package-manager requires a working Cargo package-manager.

These problems turned out to be ultimately surmountable: The rustc compiler had been previously built for a different version of FreeBSD, but once I re-compiled the required libraries on my version, I managed to get the pre-built compiler running. For the Cargo package manager, I ended up writing a hacky version of Cargo in Ruby, and using that to bootstrap the "true" build. I'm glossing over it here, but this turned out to be a lot of effort to create: Rust is evolving rapidly, and it was difficult to track new versions of Cargo (and its dependencies) that depended on new versions of the Rust compiler, while the Rust compiler itself sometimes did not work on my platform.

This was, necessarily, my first exposure to the platform in general, and it was, unfortunately, not a positive experience for me. Circular dependencies in developer tools are completely understandable, but there should be a way to break the cycle. Making bootstrap a first-order concern does not seem that difficult to me, in these cases, and would greatly enhance the portability of the tools (maybe even getting Rust to work on NetBSD, on GNU/Hurd, or on other neglected platforms):

The dependency of the compiler on itself can be broken by distributing the bootstrap compiler as LLVM IR code. Then use LLVM's IR assembler, and other tools, to re-link the stage0 Rust compiler on the target platform. The stage0 compiler could then be used to build the stage1 compiler, and the rest of the build could proceed as it does today. (Rust issue #19706)
The dependency of the package manager on itself can be broken by adding a package manager target to, say, tar up the source code of the package manager and its dependencies, along with a shell script of build commands necessary to link all that source code together to create the package manager executable. (Cargo issue #1026)

I am not suggesting that we should avoid circular dependencies in developer tools: Eating our own dogfood is an extremely important way in which our tools improve over time, and tools like this should be capable of self-hosting. But the more difficult it is to disentangle the circular dependency that results, the more difficult it becomes to bootstrap our tools, which will ultimately mean less people adopting our tools in new environments. Most of us aren't building Infinite Improbability Drives. Bootstrapping should be a first-order concern.