Saturday, December 11, 2021

Less Painful Linear Types

I’ve advocated for Linear Types in Rust for some time, most recently arguing that they can resolve an important foot-gun around accidental future cancellation. Beyond that, I think they can be a useful tool for many other purposes, a few of which I mention in the RFC linked above. Rust caters to those who aim to write industrial-strength low-level software… Sometimes, in this area, resource finalization requires more care than simply letting a variable fall out of scope. Linear types address these cases.

That said, linear types can also cause some pain:

  1. Linear types are naturally viral: once a field in a structure (or variant in an enum) is linear, the containing structure will, itself, become linear.
  2. Linear types want API changes to generic containers. These will be annoying (at best) for the ecosystem to incorporate.
  3. Linear types don’t interact well with panic-based unwinding.
  4. Linear types don’t interact well with ? syntax.

Whether linear types are worth it for Rust depends on whether the benefits outweigh the costs. The perception in Rust seems to have been that the costs of linear types are too high. Here, I’d like to suggest that they can be lower than we thought.

Design Approach

This linear types proposal aims to address the pain points above head-on:

  1. We provide an escape hatch, that allows structures that contain linear types to become affine again.
  2. We define lints and automated code refactoring tools to make updating generic container types to support linear types less painful.
  3. We only apply linear-types constraints to linear control flow. Drop can still be invoked during panic-based unwinding.
  4. We define library facilities, to make ? interaction easier.

Proposal

ScopeDrop OIBIT

We define a new unsafe marker trait, called ScopeDrop. This is as it sounds: when a type implements ScopeDrop, then variables of the type are allowed to be cleaned up when they fall out of scope. (We do not affect panic-based clean-up with this trait: even if your type does not implement ScopeDrop, drop-glue can still be created to support panic-based unwinding.) Much like Send and Sync, this trait is auto-derived: if all fields of a structure or variants of an enum implement ScopeDrop, then the structure itself implements ScopeDrop.

We define a single marker type, PhantomLinear, which does not implement ScopeDrop. A user makes her type linear by including a PhantomLinear field, and this now virally infects all containers that might include her type.

So this code fails to compile:

#[derive(Default)]
struct MyLinear {
    _linear_marker: std::marker::PhantomLinear,
    data: (),
}
fn oops() {
    let _oops: MyLinear = Default::default();
    // ^^^ generates a compilation failure, `_oops` is not allowed
    // to simply fall out of scope.
}

One is allowed to unsafe impl ScopeDrop on a type that would otherwise be linear. With the following code, the example above would compile successfully (though why you would want to write code like this is unclear to me):

unsafe impl ScopeDrop for MyLinear {}

It can make sense to impl Drop for a !ScopeDrop type. Generally, though, the only way that code would be invoked would be by unwinding the stack.

Affine Escape Hatch

Linear types are naturally viral, and limit available API surface area (that is, APIs that assume types are affine cannot work with variables of linear type, see here for details), so there’s a risk that a crate author will label a type as linear, in a way that makes it difficult for external users to consume the type. In this proposal, we define an escape hatch, that allows variables that do not implement ScopeDrop to be safely wrapped by variables that do. We can do this with a trait and a generic type:

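// wrap a !ScopeDrop type into a ScopeDrop container.
struct ReScopeDrop<T: Consume>(ManuallyDrop<T>);
unsafe impl<T: Consume> ScopeDrop for ReScopeDrop<T> {}

impl<T: Consume> ReScopeDrop<T> {
    pub fn new(value: T) -> Self {
        ReScopeDrop(ManuallyDrop::new(value))
    }
    // private function, to support `Drop` implementation.
    // Safety: the caller must not use `self.0` again afterwards.
    unsafe fn take(&mut self) -> T {
        ManuallyDrop::<T>::take(&mut self.0)
    }
    pub fn into_inner(self) -> T {
        // suppress our own `Drop`, so that `consume` is not also run
        // on the value we are handing back.
        let mut this = ManuallyDrop::new(self);
        unsafe { ManuallyDrop::take(&mut this.0) }
    }
}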
// not shown: AsRef, AsMut, etc. for `ReScopeDrop`.

trait Consume {
    fn consume(self);
}
impl<T: Consume> Drop for ReScopeDrop<T>
{
    fn drop(&mut self) {
        unsafe { self.take() }.consume()
    }
}

With these additions, one can wrap an externally-defined !ScopeDrop type in such a way that ScopeDrop works again:

// externally defined type is linear.
struct ExternalLinear {
    _linear: PhantomLinear,
}
impl ExternalLinear {
    pub fn clean_up(self) {
        let ExternalLinear { _linear } = self;
        std::mem::forget(_linear);
    }
}

// internally-defined type is affine.
struct AffineWrapperImpl(ExternalLinear);
type AffineWrapper = ReScopeDrop<AffineWrapperImpl>;

impl Consume for AffineWrapperImpl {
    fn consume(self) {
        self.0.clean_up();
    }
}

It’s a bit wordy to declare the wrapper type, which is unfortunate, but once this is done, it’s basically as easy to work with AffineWrapper variables as with any other affine variable. This gives us a reasonable mitigation for the viral aspect of linear types.
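ScopeDrop doesn't exist yet, but the mechanics of the wrapper can be tried in current Rust. Here is a minimal sketch, assuming a hypothetical Ticket resource that counts its clean-ups (the name and the counter are mine, purely for illustration). It only shows that falling out of scope runs consume, and that into_inner hands the value back without running it:

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;
use std::rc::Rc;

/// Explicit clean-up, standing in for the proposal's `Consume` trait.
trait Consume {
    fn consume(self);
}

/// Wrapper whose `Drop` forwards to `Consume::consume`.
struct ReScopeDrop<T: Consume>(ManuallyDrop<T>);

impl<T: Consume> ReScopeDrop<T> {
    fn new(value: T) -> Self {
        ReScopeDrop(ManuallyDrop::new(value))
    }

    /// Hand the inner value back without running `consume`.
    fn into_inner(self) -> T {
        // Suppress our own `Drop`, then move the value out exactly once.
        let mut this = ManuallyDrop::new(self);
        unsafe { ManuallyDrop::take(&mut this.0) }
    }
}

impl<T: Consume> Drop for ReScopeDrop<T> {
    fn drop(&mut self) {
        // SAFETY: `drop` runs at most once, and `into_inner` bypasses it.
        unsafe { ManuallyDrop::take(&mut self.0) }.consume()
    }
}

/// Hypothetical resource that counts how often it was cleaned up.
struct Ticket(Rc<Cell<u32>>);

impl Consume for Ticket {
    fn consume(self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns (clean-ups after a scope exit, clean-ups after `into_inner`).
fn rescope_demo() -> (u32, u32) {
    let count = Rc::new(Cell::new(0));
    {
        let _wrapped = ReScopeDrop::new(Ticket(count.clone()));
    } // `_wrapped` falls out of scope here, so `consume` runs.
    let after_scope = count.get();

    let ticket = ReScopeDrop::new(Ticket(count.clone())).into_inner();
    let after_into_inner = count.get(); // unchanged: the caller owns `ticket`
    drop(ticket); // `Ticket` has no `Drop` impl, so nothing is counted here
    (after_scope, after_into_inner)
}
```

Without the proposed trait, nothing stops the bare Ticket from being dropped silently at the end; that last step is exactly the part that needs compiler support.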

Early Return Helpers

Rust’s ? facility relies on early return, which (by design) doesn’t interact well with linear types: any variable of linear type introduced before a ? needs some mechanism that allows the early return without violating the type’s contract. I think we can handle these cases reasonably ergonomically by defining early return helpers:

struct CleanupImpl<F, T>
where
    // FnOnce: the clean-up closure runs exactly once.
    F: FnOnce(T),
{
    var: T,
    cleanup: F,
}
impl<F, T> CleanupImpl<F, T>
where
    F: FnOnce(T),
{
    fn new(var: T, cleanup: F) -> Self {
        CleanupImpl { var, cleanup }
    }
}
impl<F, T> Consume for CleanupImpl<F, T>
where
    F: FnOnce(T),
{
    fn consume(self) {
        let CleanupImpl { var, cleanup } = self;
        cleanup(var);
    }
}
type Cleanup<F, T> = ReScopeDrop<CleanupImpl<F, T>>;

and maybe a macro to make constructing such a variable easy:

    let myvar = MyLinearType::new();
    let myvar = cleanup!(myvar, move |linear| {
        // invoke linear-type specific clean-up function.
        linear.cleanup();
    });
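To get a feel for how such a helper behaves around ?, here is a rough analog that compiles in today's Rust. It is a sketch under assumptions: the guard and the closure are folded into a single type that uses an ordinary Drop (there is no ScopeDrop to opt out of), and Connection and parse are hypothetical names invented for the example.

```rust
use std::cell::RefCell;
use std::num::ParseIntError;

/// Guard that runs a clean-up closure on its value when dropped.
struct Cleanup<T, F: FnOnce(T)> {
    // `Option` lets `drop` move both parts out through `&mut self`.
    parts: Option<(T, F)>,
}

impl<T, F: FnOnce(T)> Cleanup<T, F> {
    fn new(value: T, cleanup: F) -> Self {
        Cleanup { parts: Some((value, cleanup)) }
    }
}

impl<T, F: FnOnce(T)> Drop for Cleanup<T, F> {
    fn drop(&mut self) {
        if let Some((value, cleanup)) = self.parts.take() {
            cleanup(value);
        }
    }
}

/// Hypothetical resource that must be closed explicitly.
struct Connection<'a> {
    log: &'a RefCell<Vec<&'static str>>,
}

impl Connection<'_> {
    fn close(self) {
        self.log.borrow_mut().push("closed");
    }
}

/// Parses `input`, recording in `log` what happened to the connection.
fn parse(input: &str, log: &RefCell<Vec<&'static str>>) -> Result<i32, ParseIntError> {
    let _conn = Cleanup::new(Connection { log }, |conn: Connection<'_>| conn.close());
    let n = input.parse::<i32>()?; // an early return still closes the connection
    log.borrow_mut().push("parsed");
    Ok(n)
}
```

On both the error path and the success path, close is the last thing recorded, which is the behavior the cleanup! macro is meant to make convenient.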

Updating the ecosystem

Linear types would be a large change to the language, with large implications for the standard library, and for large parts of the ecosystem. Making this as easy as possible, and as foolproof as possible is hugely important… I’d like to hear your thoughts about everything in this proposal, but especially this part.

To assist in ecosystem updates, we can define compiler or clippy lints to detect when code already satisfies linear type constraints, but is not written to understand linear types. Applying this lint to, say, the Option type would result in warnings on functions like map:

impl<T> Option<T> {
    fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Option<U> {
        match self {
            Some(x) => Some(f(x)),
            None => None,
        }
    }
}

The lint detects that self is consumed, but the !ScopeDrop field won’t have drop glue generated in this function, so the function is linear-safe. The warning can be addressed by relaxing the T type to be ?ScopeDrop:

impl<T: ?ScopeDrop> Option<T> {
    fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Option<U> ...
}

Or, perhaps we can add other syntax to make this less disruptive to the code structure:

impl<T> Option<T> {
    impl<T: ?ScopeDrop> fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Option<U> {
        match self {
            Some(x) => Some(f(x)),
            None => None,
        }
    }
}

For this to be broadly consumed by the Rust community, we can make a tool (call it cargo-linearize) to automatically perform this refactoring.
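The distinction the lint would draw can already be observed at runtime. The sketch below is only an illustration (Probe is a hypothetical drop-counting type, and no such lint exists yet): map merely moves its payload into the closure, while and discards self, and so drops the payload inside the method.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;

/// Hypothetical probe type: counts how many times a value is dropped.
struct Probe<'a>(&'a Cell<u32>);

impl Drop for Probe<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns (drops caused by `map`, drops caused by `and`).
fn map_vs_and_demo() -> (u32, u32) {
    let map_drops = Cell::new(0);
    // `map` only moves the payload into the closure; it never drops a `T`.
    let kept = Some(Probe(&map_drops)).map(ManuallyDrop::new);
    let seen_by_map = map_drops.get();

    let and_drops = Cell::new(0);
    // `and` discards `self`, so the payload is dropped inside the method.
    let _ = Some(Probe(&and_drops)).and(Some(()));
    let seen_by_and = and_drops.get();

    drop(kept); // `ManuallyDrop` suppresses the probe's destructor
    (seen_by_map, seen_by_and)
}
```

Under this proposal, I would expect a method like map to be a candidate for the relaxed ?ScopeDrop bound, while a method like and would keep requiring ScopeDrop.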

Discussion

This proposal tries to make linear types support palatable to Rust in at least the following ways:

  1. We emphasize an affine escape hatch.
  2. We “punt” on the panic-unwind question.
  3. We define a mechanism to simplify updating generic types to be linear-aware.

These changes should have the effect of making linear types easier to integrate into existing Rust code-bases. They do not eliminate the pain of linear types in Rust. On the other hand, to me, trying to eliminate the pain of linear types is like trying to eliminate the pain of the borrow-checker: if code correctness depends on the fact that resources can’t just be unintelligently dropped, then sometimes you’d rather have pain (compilation errors that can be annoying to placate) when you try to drop such a resource, than incorrect behavior.

When would such a facility be useful? Well, we’re talking about something like linear types quite a bit for async Rust, but I proposed something like this feature prior to Rust 1.0, well before async Rust was a thing. When I was an embedded systems developer, I used something akin to a “drop bomb” (i.e., a Drop implementation that causes a panic) to make sure that our resource ownership semantics were honored. Drop bombs enforce linear type constraints at runtime; I would have preferred that they be enforced at compile time. Browsing the set of issues that have linked to that postponed issue, others have regularly reached a similar place. This is evidence that Rust’s core audience is interested in this feature. If we can keep the pain of linear types small enough, and this generally addresses important use cases, then perhaps it’s time to look seriously at linear types again?
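For readers who haven't met the drop-bomb pattern, here is a minimal sketch of it in current Rust. MustFinish and its finish method are hypothetical names; the point is only that forgetting the explicit clean-up is caught, but at runtime rather than at compile time.

```rust
/// Hypothetical resource that must be released with `finish`.
struct MustFinish {
    finished: bool,
}

impl MustFinish {
    fn new() -> Self {
        MustFinish { finished: false }
    }

    /// The only sanctioned way to dispose of the value.
    fn finish(mut self) {
        self.finished = true;
        // `self` is dropped here, with the bomb defused.
    }
}

impl Drop for MustFinish {
    fn drop(&mut self) {
        // Skip the check while already unwinding: a second panic would abort.
        if !self.finished && !std::thread::panicking() {
            panic!("MustFinish dropped without calling finish()");
        }
    }
}

/// Returns true if letting the value fall out of scope set off the bomb.
fn forgot_to_finish() -> bool {
    std::panic::catch_unwind(|| {
        let _leak = MustFinish::new();
    })
    .is_err()
}
```

A value that is finished properly drops quietly; one that simply falls out of scope panics. Linear types would turn that panic into a compilation error.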

Wednesday, December 1, 2021

Linear Types Can Help

There’s been a lot of discussion, recently, about how to improve async behavior in Rust. I can’t pretend to have internalized the entire discussion, but I will say that Linear Types feels like it should resolve several known foot-guns when attempting to support async Rust, while also being a general improvement to the language.

Bottom-line up front: Carl Lerche’s example (from https://carllerche.com/2021/06/17/six-ways-to-make-async-rust-easier/) of a surprisingly buggy async application looked like this:

async fn parse_line(socket: &TcpStream) -> Result<String, Error> {
    let len = socket.read_u32().await?;
    let mut line = vec![0; len];
    socket.read_exact(&mut line).await?;
    let line = str::from_utf8(line)?;
    Ok(line)
}

With linear types support, we could change async to async(!ScopeDrop) to make implicit Drop a compile-time failure, avoiding the bug; or perhaps a compiler flag could be used for the crate, to make futures !ScopeDrop by default, so that the exact same code could be run, without introducing an “accidental cancellation” foot-gun.
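The underlying hazard is easy to reproduce without any runtime. The following sketch uses only the standard library; YieldOnce, NoopWake and two_steps are hypothetical stand-ins I made up for "read the length, then read the body". It polls a future once by hand and then drops it, which silently abandons the second half of the work:

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Future that is pending on the first poll and ready on the second.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Waker that does nothing; enough for polling by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Two-step operation with a suspension point in the middle.
async fn two_steps(progress: &Cell<u32>) {
    progress.set(1); // first half done
    YieldOnce(false).await; // suspension point
    progress.set(2); // second half done
}

/// Polls `two_steps` once, then drops it; returns the progress reached.
fn poll_once_then_drop() -> u32 {
    let progress = Cell::new(0);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    {
        let mut fut = pin!(two_steps(&progress));
        assert!(fut.as_mut().poll(&mut cx).is_pending());
    } // the future is dropped here, cancelling the second half
    progress.get()
}
```

Nothing in the types warns about the abandoned work. With a !ScopeDrop future, the implicit drop at the end of the inner block is what the compiler would reject.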

I’m planning to write three blog posts about this, of which this is the first. Here, I try to indicate how this might work by going through the same examples from Carl Lerche’s great blog post on the subject, using an alternative linear-types approach for the solution. Next time, I’ll talk through the proposal; and then at the end, try to address expected objections. I’ve wanted some version of linear types in Rust for years; now with asynchronous Rust, they seem potentially more relevant than ever.

Does this meet requirements?

select!

Well, a linear-types future can’t be used with select!, since select! is defined to implicitly drop futures other than the first to complete. The language would push you to use a task, just as in Carl’s example.

AsyncDrop

Per Carl’s example, AsyncDrop is difficult in today’s Rust, because there isn’t an explicit .await point. I’d suggest a different AsyncCleanup trait, that reasonably works with linear types, to support behavior something like Python’s context managers:

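trait AsyncCleanup {
    async fn cleanup(&mut self);
}
// `with` is itself async, since it awaits both the continuation
// and the clean-up.
async fn with<T, F, O>(mut object: T, continuation: F) -> O
where
    T: AsyncCleanup,
    F: AsyncFnOnce(&mut T) -> O,
{
    let retval = continuation(&mut object).await;
    object.cleanup().await;
    retval
}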

Then the bug that Carl pointed out here:

my_tcp_stream.read(&mut buf).await?;
async_drop(my_tcp_stream).await;

would be prevented at compile-time: the .await? on the first line would trigger a compilation failure. One would avoid this by using the context-manager approach:

with(my_tcp_stream, async |my_tcp_stream| {
    my_tcp_stream.read(&mut buf).await?;
    Ok(())
}).await?;

Get rid of .await

Not necessary with this change. I lean against removing .await, personally: my bias for systems languages is that I want to be able to predict the shape of the machine’s behavior from reading the source code, and getting rid of .await seems likely to make that harder. I don’t really want to dwell on that, though; others have other biases that are valid for them. More to my point here: linear types encourage a smaller change to the existing ecosystem than guaranteed-completion futures do.

Scoped tasks

Supported (I think) by having task::scope() return a linear type. On the other hand, I’m not yet comfortable with how executor runtimes handle unwinding from panics in unsafe code, so I’m likely missing something important.

Abort safety

I’m not sure this is desirable, at least at first. The #[abort_safe] lint introduces another “what color is your function” problem to the language. That said, if we did want this, we could define another trait, FutureAbort, as below:

trait FutureAbort {
    async fn abort(self);
}
impl<T: ScopeDrop> FutureAbort for T {
    // dropping a ScopeDrop future aborts it.
    async fn abort(self) {}
}

And revise items such as select! to abort() all un-completed futures. This can be made to prevent abort-safe functions from calling non-abort-safe functions relatively easily:

// because the returned future isn't ScopeDrop, it won't
// be abort_safe by default.
async(!ScopeDrop) fn foo() { /* not shown */ }
#[abort_safe]
async fn bar() {
    foo().await // compiler error: abort_safe futures cannot await
                // non-abort_safe futures.
}

The default behavior will still be abort-safe, but users are allowed to opt-in to behavior where abort safety isn’t wanted.

I think this covers the bulk of Carl’s use cases. Linear types are not really an async-Rust feature, but they do (in my opinion) apply nicely here. The shape of my current thinking about how these can work is more-or-less inferrable from the above, but I wanted to keep this post relatively short, so I’ll save the actual proposal for next time.

Thanks for reading!