Baby Steps

A blog about programming, and tiny ways to improve it.

Vectors, Slices, and Functions, Oh My!

I wanted to bring together the various ideas around vectors and function types into one post. The goals of these changes are

  1. to achieve orthogonality of the pointer types, so that leading &, @, and ~ sigils are the only way to indicate the kind of pointer that is in use;
  2. to help pare down on the proliferation of subtle variantions on types, such as the 5 different function types currently available.

The proposal

The Rust type system would be described by the following grammar. In this grammar, I have included all optional portions except for region bounds. I indicated those types which could have a lifetime bound associated with them by writing (/&r) in the description (a lifetime bound indicates the lifetime of any pointers embedded within the type itself; this is not related to the changes I am discussing here so I won’t go into detail):

M = mut | const | ""                  // immutable by default

K = send | copy | move | ""           // move by default

T = () | int | uint | float | ...     // scalar types
  | @MT | ~MT | &r.MT | *MT           // pointer types
  | id<T*>                            // enum, resource, class (/&r)
  | id                                // type variable
  | [T]                               // slice type (/&r)
  | substr                            // string slice type (/&r)
  | (T*)                              // tuple type
  | (T "*" N)                         // fixed-size array
  | {(M id: T)*}                      // anonymous record types
  | U                                 // dynamically sized types

U = fn:K(T*) -> T                     // closure types (/&r)
  | id:K<T*>                          // iface instances (/&r)
  | vec<MT>                           // vector type
  | str                               // string type

Dynamically sized types

The types described by U are separated out because, unlike the other types listed, they have “dynamic size”—that is, the size of an instance of U will vary from instance to instance. As a result, the U types are somewhat “second-class” when compared to the other types:

  • Type variables cannot be bound to dynamically sized types.
  • Expressions whose type is a dynamically sized type are generally prohibited.

There is one exception to these rules. Literal expressions of dynamically sized types are permitted, as the compiler can readily compute their size. The literal forms of the various types U are:

Type            Literal form
----            ------------
fn:K(T*) -> T   fn:K(x, y) -> T { ... }
fn:K(T*) -> T   id (where id is a fn item)
id:K<T*>        iface(v)
vec<MT>         [M ...]
str             "..."

If it seems useful, we could lift the restriction that type variables cannot be bound to dynamically sized types and instead use some sort of kind to mark variables that may accept dynamically sized types (or to mark those that may not, depending on what we feel the defaults ought to be).

Vectors and slices

These basically work in the same way as currently proposed, but the syntax has changed. A vector is written vec<T>; note that unlike other types, vector type parameters have a mutability, so you might have vec<mut u8>, for example.

Slices can be created by using the [:] operator, which works just as in Python. So [1, 2, 3][-1:] returns a slice containing [3], [1, 2, 3][1:-1] returns [2] and [1, 2, 3][2:1] returns an empty slice. The slice operator can be applied to both vectors and slices. We could conceivably allow it to be overloaded.

Vectors may be added to slices. The type of the resulting vector is taken from the left-hand side. So adding a @vec<mut u8> to a [u8] yields a (longer) @vec<mut u8>.

Fixed-length arrays

The type (T * N) represents a fixed-length array. Here T is another type and N is a constant expression. This is primarily intended for C compatibility: a fixed-length array has no length field and is simply represented by N instances of the type T laid out one after the other. In most ways it is precisely equivalent to a tuple. There is no literal form for such arrays: they are in fact supertypes of tuples of equivalent size, and so share the tuple syntax (v1, ..., vN). We can introduce a macro for repeating the same element N times to avoid repetition.

Fixed-length arrays are indexable and sliceable but their contents are not modifiable. If modification is desired one can create a simple one-entry record.

Note: I think this idea of having fixed-length arrays and tuples be closely related makes sense. I’m mostly trying to keep things simple and not introduce too much machinery for an edge case. But maybe there is a problem with it.

Closure and iface instance bounds

Both the closure and iface instance types feature a bound K called a “kind bound”. These types are unlike the other types because they are “opaque” to the compiler: that is, the compiler does not know the types of the data that is contained within. The bound K puts a restriction on those types so that additional operations can be permitted.

For example, if you have a closure x of type fn:send(), then the compiler knows that whatever data is closed over by x is sendable. The compiler can therefore permit x to be sent between tasks. Similarly the iface type to_str:send describes an instance of some type which is sendable and implements the to_str interface.

If no bound is specified, the default is move (which is the most general). This simply states that the closure may close over arbitrary data.

As today, the “sugared closure” form {|x| ...} would be inferred to some form of “pointer to closure”. That is, it could result in @fn, ~fn or &fn, depending on the expected type.

There is no type that represents a “closure that accesses variables by reference and not by copy”. Sugared closures become the only way to construct such a closure: if the expected type is &fn() they will construct a “access-by-reference” closure and the if the expected type is @fn() or ~fn() they will not. This seems non-ideal but equivalent to the situation today.

Bare functions

The type of “bare functions” (that is, function items which do not close over anything) is simply fn:send(T*) -> T. To use a bare function as a closure, you must prefix the bare function with an appropriate sigil (&, @, or ~).

For example, the following snippet uses a function inc() as the argument to vec::map:

fn inc(x: int) -> int { x + 1 }

fn inc_all(vs: [int]) -> [int] {
     vec::map(vs, &inc)
}

Here the expression &inc has the (fully elaborated) type &static.fn:send(int) -> int. This is a subtype of the expected type that vec::map requires: &fn:move(int) -> int.

To send a bare function between tasks you might write:

fn task_body() { ... }

fn spawn_task() {
    task::spawn(~task_body)
}

Representing closures

The representation of closures will change somewhat. Before a closure was the pair of a function pointer with an environment pointer. Now a closure will be a pointer to a structure like:

struct closure {
    void *fptr,
    type_desc *td,
    ... // (closed over data)
};

I thought at first that LLVM might not be able to track a function pointer in this case, but experiments suggest that it can. In general, LLVM does a good job of tracking and constant propagating alloca’d memory with precision.

Conceivably, it might be slower to perform an extra load before the call (that is, to call c->fptr and not c.fptr) if the closure pointer is not in cache. But reasoning about the cache without experimenting is always risky, and this particular load seems unlikely to matter, as the closure will soon be accessing its environment, and in that case you’d have to bring the environment into cache anyhow.

There is one unambiguous, if minor, downside. In the old scheme, bare functions used as a closure of type fn@ or fn~ could pair the function pointer with a NULL environment. But in this new scheme a @fn or ~fn will require allocation, because the runtime will expect to be able to free such pointers like any other pointer. Such pointers are quite rare though compared to &fn types, and something like &inc can be pre-allocated statically (actually we could use tagged pointer tricks, I suppose, but it doens’t seem worthwhile).

Pros and cons

To me the real question is whether the system feels simpler on net given the introduction of dynamic size types. I think it does, but obviously this is a subjective question. To me, the benefits of the following are pretty substantial:

  • one function type instead of five;
  • types like @fn(int) -> uint and @vec<int> seem to have a clear meaning once you are accustomed to @ meaning pointer, vs fn@(int) -> uint and [int]/@;
  • the difference between vectors and slices (and strings and substrings) is clear, currently I think there is plenty of room for confusion.

Iface Types

Yesterday I wrote about my scheme for paring down our set of function types to one type, fn:kind(S) -> T. When I finished writing the post, I was feeling somewhat uncertain about the merits of the idea, but I’m feeling somewhat better about it today. I really like the idea that top-level items have the type fn:kind(S) -> T and that you therefore give them an explicit sigil to use them in an expression; this allows us to remove the “bare function” type altogether without any complex hacks in the inference scheme.

Anyway, I didn’t talk at all about iface types yesterday, but they have a place in this scheme too. An iface type, also called a boxed iface, is basically the pair of a vtable with a self pointer. Today this is hard-coded to be a GC’d ptr (@), but I want to change this as it is very limited: iface types are relatively expensive to construct (requiring allocation, RC overhead, etc) and they cannot be sent between tasks.

Under my proposal, an iface type would be written id:kind where id is the name of the interface and kind is an optional kind bound that applies to the receiver. The type is dynamically sized, because the value that is represented is something like:

struct iface_instance {
    void *vtable;
    type_desc *td;
    ... // self data is represented inline
}

This proposal therefore allows you to construct things like:

@id      (today's "boxed iface")
&id      (an iface instance allocated on the stack)
~id:send (a sendable iface instance)

New interface instance construction syntax

There is one other change I’d like to make, which is independent but seems to fit. Today, iface types are constructed using as. I am not crazy about this because as is normally our type cast operator, but iface type construction is not a type cast. It may perform allocation etc. The as construction is also very wordy, requiring one to specify the desired iface rather than having it inferred, and it has an awkward requirement that :: be used for any type parameters on the type.

As a replacement I propose we make use of the iface keyword in expressions, so that iface::<T>(v) would construct an instance of the iface type T for the value v. Like all type parameters, T may be left off and inferred from context. So typically you would just write iface(v), as in this example (here I assume the current iface types, rather than the ones I will describe shortly):

iface an_iface<T> { ... }
impl of an_iface<int> for int { ... }
fn foo(i: an_iface<int>>) { ... }
fn bar(i: int) { foo(iface(i)) {

In contrast, the fn bar in the old syntax looks like:

fn bar(i: int) { foo(i as an_iface::<int>) }

At first I wanted to make ifaces into constructor functions with a signature like:

fn an_iface<I:an_iface>(i: I) -> an_iface

but this doesn’t fit with my proposal above, as if the type an_iface is a type of dynamic size, as it cannot be returned (also, how does one specify the sendability bounds? They would have to be added as bounds to the type I, etc)

Fn Types

As you loyal readers know, I am on a quest to make the Rust type system more orthogonal with respect to the kind of pointer in use, by which I mean that I want to have the three pointer sigils (@, &, and ~) indicate where memory is located and the other types indicate what value is to be found at that memory. Right now there are a few cases where we conflate the two things into one type. The first, vectors and slices, I discused in a recent post. This post discusses the second case: function and interface types.

I’ll sketch out the idea; there are however some details I have yet to work out. Actually the plan imposes some downsides and I’m not 100% sure if I’m in favor yet. Though I think the simplicity of the type system is a win (simplicity for people using it, that is).

Background

We currently have five function types:

  • fn@(S) -> T, or “boxed closure”;
  • fn~(S) -> T, or “unique closure”;
  • fn&(S) -> T, or “stack closure”;
  • fn(S) -> T, or “any closure”;
  • native fn(S) -> T, or “bare function”.

What distinguishes these closures is both the kind of pointer in which their environment is stored as well as the kind of data which can be stored in the closure itself:

  • The boxed closure (fn@) uses a normal GC’d pointer (oft called a boxed pointer, but I am trying to move to more descriptive terminology) to store its environment. The environment may contain arbitrary values but it does not contain any references to the stack. Copying a boxed closure is cheap because the environment can be aliased (currently, this means that the environment is ref counted).
  • The unique closure (fn~) uses a unique pointer to store its environment. All data must be sendable, which basically means tree-shaped. Copying a unique closure is expensive, because the environment cannot be aliased, and so copying the closure results in a deep copy of all the closed-over data. This has the upside that unique closures can be sent between tasks.
  • The stack closure (fn&) uses a reference to store its environment. Unlike the other closures, no data is stored in the environment itself. Rather, the environment consists of pointers to an outer stack frame. Copying a stack closure is very cheap. Because fn& contains by-ref pointers into its parent stack frame, it is restricted in where it can appear (though these restrictions can probably be loosened some the work on lifetimes is complete).
  • The any closure (fn) is just the supertype of the other three. It commonly appears in function signatures where any sort of closure would be acceptable (vec::each(), for example).
  • A bare function (native fn) is just a function pointer with no environment at all (it is therefore not a closure at all). It is never used in practice, as it is in fact a subtype of the various closure types.

One important detail is that closures are represented as the pair of a function pointer along with a pointer to the environment. In the case of a bare function, the pointer to the environment is always NULL.

The proposal

I want to have just one function type. In practice, as today, this would most commonly be written as fn(S) -> T, but the type in its fully explicit glory would be:

fn:kind(S) -> T

Like my proposed vector type vec<T>, this function type has an unknown static size. At runtime, it would be represented by a structure like:

struct fn {
    void *code_ptr;
    ... // environment data is stored inline
};

As with vec<T>, the type fn(S) -> T has an unknown size and therefore must basically always be referenced by pointer (@fn(S) -> T, ~fn(S) -> T, &fn(S) -> T).

The kind portion of the function type indicates a bound on the closed over data. It can by copy or send. If it is omitted, then there is no bound. In practice, I imagine that send is the only kind that would ever be useful.

The mapping between the current function types and my proposal would be:

fn@(S) -> T         becomes        @fn(S) -> T
fn~(S) -> T         becomes        ~fn:send(S) -> T
fn&(S) -> T         becomes        &fn(S) -> T
fn(S) -> T          becomes        &fn(S) -> T
native fn(S) -> T   just goes away

Details

Literal syntax

I can imagine a couple of alternatives here. The basic issue is distinguish between “closures that reference things on the stack frame” and “closures that copy things out of the stack frame”.

I think my preferred solution is to say that the explicit fn form always copies out of the stack frame. So something like:

let foo = fn@(x: int, y: int) -> int { x + y };

would become

let foo = @fn(x: int, y: int) -> int { x + y };

Note that there is no bound specified (indicating no bound on the closed-over data). Of course, any data that gets copied into the closure must be copyable; but if data is moved into the closure (for example, if it is the last use of the data, or an explicit capture clause is used), then the data can have any kind. This is the same as today.

If we wanted a unique closure, which today is written:

let foo = fn~(x: int, y: int) -> int { x + y };

you would write

let foo = ~fn:send(x: int, y: int) -> int { x + y };

This is somewhat wordier than today, but the truth is that we rarely (if ever) write unique closures by hand. Instead, you employ the sugared closure form.

Sugared closures

Sugared closures are written using a Ruby-like notation:

for vec.each { |item| ... }    // inferred to stack closure

task::spawn {|| ... }          // inferred to unique closure

They would continue to operate as today. This means that we’ll infer the kind of closure pointer and other facts based on the expected type. I am still not crazy about this inference (although I put it in) but the last time I proposed taking it out this was unpopular.

Bare functions

One advantage of the current approach is that a bare function (which is just a function pointer) can be converted into a closure by pairing it with a null pointer. This no longer works under this system except for region pointers, so when bare functions are converted to @ or ~ functions we’d have to allocate a little stub to convert the call. Maybe the sigil should be written explicitly, so that function items have a type of fn(S) -> T. You would then write vec::iter(v, &foo) to apply the top-level function foo to each item in the vector, for example. Hmm.

Summary

So yeah, that’s the rough idea. I feel like the current system does work more smoothly in some regards, so I’m not yet sure if the idea is overall a win, but I wanted to note it down.

Borrowing Errors

I implemented a simple, non-flow-sensitive version of the reference checker which I described in my previous post. Of course it does not accept the Rust codebase; however, the lack of flow-sensitivity is not the problem, but rather our extensive use of unique vectors. I thought I’d write a post first showing the problem that you run into and then the various options for solving it.

Errors

The single most common error involves vec:len(). There are many variations, but mostly it boils down to code code like this, taken from the io package:

type mem_buffer = @{mut buf: [mut u8],
                    mut pos: uint};

impl of writer for mem_buffer {
    fn write(v: [const u8]/&) {
        if self.pos == vec::len(self.buf) { ... }
        ...
    }
}

The problem lies in vec::len(self.buf). This is considered illegal because the vector self.buf resides in a mutable field of a task-local box. Therefore, the algorithm assumes that vec::len() may have access to it and could, potentially, mutate it, which would cause the vector to be freed. Bad. This call would be fine if the field buf were not mutable. In that case, even if vec::len() had access to the mem_buffer, it could not be able to overwrite the field.

In fact, all of the errors I see right now (about 46 of them across the standard library and rustc) are calls to vec::len() or vec::each() with the vector in question living in mutable, aliasable memory. It is, currently, the only way to accumulate items in a vector, after all. However, I haven’t implemented the full check—in particular, I didn’t implement the check that pattern matching a variant or through a box requires immutable memory, and so I imagine there will be some more errors related to that once I do that.

Solution #1: Swapping

Of course, this problem is not really a surprise. The solution I had in mind for handling unique data that is located in mutable, aliasable memory is to swap that unique data into your stack frame, where the compiler can track it (inspired by Haller and Odersky’s work on uniqueness, though I’m sure the technique predates them). So the code from the io package could be rewritten as:

type mem_buffer = @{mut buf: [mut u8],
                    mut pos: uint};

impl of writer for mem_buffer {
    fn write(v: [const u8]/&) {
        let mut buf = [];
        buf <-> self.buf;
        if self.pos == vec::len(buf) { ... }
        ...
        self.buf <- buf;
    }
}

This makes use of the little known swap (<->) and move (<-) assignment forms. Now the buffer being passed to vec::len() is in the local variable buf, not the contents of some @ box; this means that vec::len() could not possily reassign it because there are no aliases to the local variable buf.

It’s a bit of a pain to write this swapping code each time. It could of course be packaged up in a library (here, I’ve included various mode declarations, though these would be unnecessary in a purely region-ified world, as ownership would be done by default):

type swappable<T> = {mut val: option<T>};
impl methods<T> for swappable<T> {
    fn swap(f: fn(+T) -> T) {
        let mut v = none;
        v <-> self.buf;
        if v.is_none() { fail "already swapped"; }
        self.val <- some(f(option::unwrap(v)));
    }
}

Swappable could then be used to build up a dynamically growable vector library:

type dvec<T> = {buf: swappable<T>};
impl methods<T> for dvec<T> {
    fn add(+e: T) {
        self.buf.swap { |v| v + [e] }
    }

    fn add_all(v2: [T]) {
        self.buf.swap { |v| v + v2 }
    }

    fn each(f: fn(T) -> bool) {
        self.buf.swap { |v| vec::each(v, f); v }
    }
}

Attempts to add to a vector that is being iterated over would fail dynamically (basically a more reliable version of Java’s “fail-fast iterators”).

Solution #2: Pure functions…?

Still, it’d be nice if one could invoke vec::len() and vec::each() even when the data is in a mutable location. After all, neither of those functions make any changes, and we know that. One solution I considered was that we could make use of the pure annotation in a kind of lightweight effect system.

The basic idea would be that pure functions are functions which do not modify any aliasable state (today pure functions disallow mutation of any state, including data interior to the stack frame; we should fix this regardless). However, drawing on more work by the Scala folks, we can actually generalize pure functions somewhat farther: we could allow them to invoke closures so long as those closures are given in the arguments. The idea is basically that a pure function is one which does not make any modifications to aliasable state except possibly through closures which the caller itself provided.

These changes would allow us to declare vec::len() and vec::each() as pure. In the case of vec::len(), that would be sufficient to ensure safety without any form of alias check. Horray!

But don’t get too excited: even if vec::each() is declared pure, we still cannot accept calls like the ones we saw before:

vec::each(self.buf) { |e|
    ...
}

The reason is that buf is still stored in aliasable, mutable state, and so we have to be sure that the loop body is safe. This can be achieved when the vector is stored in a local variable, as we can monitor for writes to that variable. But if the vector is in an @ box, we have to consider any possible alias of that box. And this leads us to our next possible solution, alias analysis.

Solution #3: Alias analysis

As I said in my previous post, I am not 100% sure of what analysis we are doing today. But if I were to design an alias-based analysis to address this shortcoming, I imagine if would work something like this:

  • Each callee is guaranteed that every reference is stable (points at memory which will not be freed) no matter what actions is takes. This means that the callee is free to call any functions it likes, including closures, because the caller has guaranteed that all functions which the callee has access to are harmless.

In particular, vec::each(v, f) could safely invoke the f() on each item in v without fear of v being freed. It’s up to the caller to guarantee that f will not have any harmful effects.

But how can the caller do this? There are two basic techniques. The first is to rely on the guarantees it gets from the outside. So, if you have a function like:

fn map<T,U>(v: [T]/&, m: fn(T) -> U) -> [U] {
    let mut r = [];
    for vec::each(v) { |e|
        r += m(e);
    }
    ret r;
}

Here, the call to m() is known to be safe because both v and m were given as parameters, so it actually the job of the caller of map() to ensure that they do not conflict.

If we can’t rely on a guarantee from the outside, then, we have to look at the types. For example, going back to our example of the buffer, if we had a loop like:

for self.buf.each { |e|
    some_ptr.buf = [];
}

Here the assignment to some_ptr.buf would be disallowed if some_ptr had the same type as self: after all, maybe it is an alias of self.

We can apply similar reasoning to functions that are invoked:

for self.buf.each { |e|
    clear_buf(some_ptr);
}

Without knowing what set_buf() does, we’d have to reject this because it has access to data of the same type as self (and hence, potentially to self itself).

The cool thing about an analysis like this is that it would allow most of the examples in the standard library to compile mostly as is. But there are some downsides.

First, it’s not clear to me that an analysis like this “scales well”. By scales well I do not mean performance but rather that, while library code tends to pass, I am not sure that uses of library code will pass. For example, suppose I have a shared, growable vector that encapsulates a unique pointer, rather like Java’s ArrayBuffer. And now I have some library code that does:

 my_vec.each { |e| do_some_processing(e); }

where my_vec is one of these array buffers. Using an alias check, it is possible to define the ArrayBuffer.each() method, but that essentially pushes the requirement to the caller to validate that the body of the each() loop will not modify my_vec. Since my_vec is aliasable, this means that do_some_processing() must not use any array buffers of its own.

Admittedly, we haven’t run into these scaling problems so much, but I am not sure how much to draw from that. For one thing, the analysis is buggy today, so it may be that we should be seeing more errors than we are. For another, all vectors are unique now, but this is causing us scaling problems, and we are starting to move away from that.

A second concern about the analysis is that it is anti-encapsulation. It requires the compiler to have full details about the types of all data that may be accessed. When you have types like closures or interfaces types, this information is not available, and so the more we use these abstractions, the worse the analysis performs. Furthermore, it becomes impossible for modules to “hide” the implementation of a type—whenever any type definition anywhere changes, all downstream code must be recompiled or else the memory safety can no longer be guaranteed. Admittedly, due to Rust’s support for interior types (not everything is a pointer) and inlining, this is already often the case, but it should still be possible to define modules that make use of opaque pointer types in the future, allowing for changes to the implementation where no recompile is necessary.

UPDATE: A further thought on this matter. This is a bit different from requiring recompilation as a matter of course (e.g., because the size of a record changed)—that is, there is no guarantee that the downstream compilation will succeed. Now, if I add a use of some vector library in the upstream code, downstream code may fail to compile, even if the use of the vector library is purely internal and not exposed through the interface.

Summary

I am still leaning towards solution #1, though I appreciate our alias analysis more and more. Actually, the fact that I only encountered 46 errors seems pretty decent, especially since most of them are clustered together. However, I do expect more such errors when I implement the pattern matching safety checks, but there we can make better use of fine-grained copies and so I expect that to be less of a problem.

Oh, and a final note regarding flow sensitivity: I think I will implement a flow-sensitive variant of the checker (it’s a small change from what I have today), but since we never take the address of locals today, it’s a moot point anyhow.

Borrowing

I’ve been working for the last few days on the proper safety conditions for borrowing. I am coming into a situation where I am not sure what would be the best approach. The question boils down to how coarse-grained and approximate our algorithm ought to be: in particular, ought it to be flow sensitive? But let me back up a bit, first, and provide a bit of background.

Background

Rust bucks the “new language” trend by not having a purely garbage-collected model. We feature things like interior and unique types which can be eagerly overwritten. This means that we have to be very careful when we create temporary references to those kinds of values that these references remain valid.

Here is an example of unsafe code:

fn main() {
    let mut v = [1, 2, 3];
    for vec::each(v) { |i|
        v = [];
    }
}

What is happening here is that we are iterating over the vector v but, during the iteration, setting the local variable to be the empty vector. Because vectors are unique, this will cause the original vector to be immediately freed—while the iteration is occurring!

Now, Rust today has an alias checker which is supposed to prevent these sorts of errors. However, it has some flaws: for one thing, it admits that erroneous program I just showed you (that’s just a bug). For another, the check is rather complex. Sufficiently complex that I don’t really understand the conditions that it is enforcing. The core is a type-based alias analysis that tries to figure out what kinds of data a function could possibly reach. But when it finds potentially dangerous aliasing going on, the algorithm will sometimes silently copy the data in question, if that seems harmless enough, or other times issue warnings or report errors.

In defense of the current algorithm, however, the fact is that if you want to remain flexible and not force programmers into too many contortions, it’s hard to come up with a simple set of rules. We’ll see that as I go on.

An alternative

I have been working on a simpler alternative which is based more on types and less on alias analysis. The original idea was to base this analysis purely on the declared mutability of local variables, fields, and so forth. We would then conservatively reject programs where memory safety relied on a potentially mutable location not being mutated.

I think, however, that I’ve decided this is so conservative as to be unusable. Consider this harmless program, for example:

let mut v = [1, 2, 3];
let l = vec::len(v)

Under the rules I just gave you, this program would in fact be illegal. The reason is that vectors and unique and vec::len() is created a transient reference to the vector it takes as argument. The safety analysis however sees that the local variable v is declared as mutable, and thus consider this unsafe to be unsafe: what if v were somehow changed by vec::len()?

Of course, we know that this is impossible. vec::len() cannot just gin up a pointer into the caller’s stack frame (well, not without an unsafe block, anyway). In this case, that’s pretty obvious: we never even took the address of v. The question is where you draw the line. How intelligent should the compiler get? In general, smarter seems better, but there are two countervailing forces: (1) if the analysis is too complex, it’s hard to tell why it’s giving you an error; (2) the more complex the analysis, the greater the chance of bugs in the safety checker itself. Let me tell you, it’s not fun to spend a day tracking down a memory bug that the language supposedly guarantees to be impossible.

I’ve been working on a compromise analysis which does not attempt to do alias analysis but which does track which parts of the stack frame are aliased or may be borrowed. The analysis makes very conservative assumptions about what a function can reach: it assumes that if a non-unique pointer exists to a given memory location, the function can access it. Using this analysis, we can allow functions to borrow data that is stored in mutable variables that are not modified, so long as the address of those mutable variables was not taken.

Under these rules, the vec::len() example is fine. An example like this is also fine:

let mut v = [1, 2, 3];
for vec::each(v) { |i|
    io::print(#fmt("%d", i));
}
v = [];

Here, we iterate over the vector v and then, after the iteration, clear v to the empty vector. This mutation is fine because the reference to the vector created in vec::each() only had a lifetime equal to the for loop itself.

The example we saw before, where the vector is assigned not after the loop but rather in the middle, would of course still fail to compile:

let mut v = [1, 2, 3];
for vec::each(v) { |_|
    v = [];
}

Specifically, the analysis would report that the assignment to v inside the loop conflicts with the borrow of v on the line before. In effect, although the variable v is declared as mutable, it becomes temporarily immutable during the loop.

The algorithm at a high-level

The key ideas that the algorithm tries to enforce is this:

  • borrowing an @T pointer is safe regardless of where it is stored, because we can just temporarily increase the ref count of the pointer for the duration of the loan;
  • borrowing a ~T pointer (or a unique vector) is safe if it is stored in a location that can be considerd immutable for the duration of the loan.

We can consider a location be immutable under two conditions:

  • the type system guarantees it to be immutable. For example, the contents of a box of type @T are immutable, as is the value of a local variable that is not declared as mutable (with one exception, see below);
  • the location is uniquely tied to the stack frame itself and the compiler does not observe any assignments to that location, nor are there any mutable aliases in scope.

The first case is the simple set of rules I wanted to enforce earlier. The second case is the more complex set I described later, where can say that a local variable is “temporarily immutable”. In fact, we can go a bit further than just local variables, and also talk about the contents of records stored in the stack, or the contents of unique pointers found in the stack, or sequences of such things. All of those cases share the property that, unless the user takes their address with the & operator, they cannot be aliased outside the function itself.

That means that we would accept any of the following equivalent programs:

let mut v = [1, 2, 3];
for vec::each(v) { |i|
    io::print(#fmt("%d", i));
}
v = [];

let r = {mut v: [1, 2, 3]};
for vec::each(r.v) { |i|
    io::print(#fmt("%d", i));
}
r.v = [];

let u = ~mut [1, 2, 3];
for vec::each(*u) { |i|
    io::print(#fmt("%d", i));
}
*u = [];

However, we would reject this similar-looking program:

let b = @mut [1, 2, 3];
for vec::each(*b) { |i|
    io::print(#fmt("%d", i));
}

The reason that this program is rejected is that we assume that vec::each() has access to every @T value (to put it another way, we assume that every aliasable value escapes). So that means we cannot prove that vec::each() will not overwrite the contents of *b (a more involved analysis might be able to see that b itself is never leaked out of the stack frame).

If you want to have unique pointers within mutable boxes, you have to bring them into your stack frame to work with them, generally making use of the little known swap operator <->. For example, the prior program might be written:

let b = @mut [1, 2, 3];
let v = [];
*b <-> v; // bring [1, 2, 3] into our stack frame
for vec::each(v) { |i|
    io::print(#fmt("%d", i));
}
*b <-> v; // replace it

This limitation is basically the same as that taken by other unique pointer systems.

This is one corner case I haven’t discussed yet. Even immutable local variables can be moved (sent, for example, to another thread). So we have to remember which immutable local variables are in use and prevent moves from occurring.

And now… the question I have been leading up to.

Until now I haven’t talked at all about the & operator. One of the nice features enabled by the work on references and pointer lifetimes is to allow the user to take the address of a variable on the stack (previously, this could only be done implicitly through reference-mode arguments). This feature is handy, and I’m a big fan of making modifications to the local stack frame explicit, but it also introduces complications. Consider this variant of our usual example:

let mut v = [1, 2, 3];
let w = &mut v;
for vec::each(v) { |i| ... }

My check would actually reject this program. The reason is that it assumes vec::each() has access to all aliased data—including w. Therefore, as in the case of @mut T types, we cannot prevent vec::each() from overwriting v indirectly by modifying *w. Here you would get an error which points out that the existence of an in-scope alias for v means that it cannot be borrowed by vec::each(). This seems reasonable to me.

But how smart is the compiler? For example, is this program allowed?

let mut v = [1, 2, 3];
let mut x = [4, 5, 6];
let mut w = &mut x;
for vec::each(v) { |i|
    w = &mut v;
}

Now, on the first iteration of the loop, v is unaliased, but it becomes aliased during the loop. Under our normal assumptions, then, this program must be rejected, for fear of vec::each() assigning to *w sometime after the first iteration.

Ok, what about this program:

let mut v = [1, 2, 3];
for vec::each(v) { |i| ... }
let mut w = &mut v;

Here I have moved the alias so it comes after vec::each(). This should presumably be ok.

But there is one wrinkle. Sometimes it is hard to say when code will execute, particularly around closures. For example:

let mut x = [4, 5, 6];
let mut v = [1, 2, 3];
let mut w = &x;
debug::indent({||
    for vec::each(v) { |i| ... }
    w = &mut v;
})

Here, the function debug::indent presumably does something like cause all debug messages that occur during its argument to be indented. So it probably only runs the argument closure once. But we don’t know that. So we’d have to reject this program, just in case debug::indent() called its closure argument twice.

A similar problem crops up if we allow stack closures (as opposed to the various kinds of copying closures) to be assigned to variables. This is currently illegal but which I wouldn’t mind making it legal someday. But then a flow-sensitive analysis would have to understand (and reject, in this case) code like this:

let mut x = [4, 5, 6];
let mut v = [1, 2, 3];
let mut w = &x;
let foo = fn&() {
    for vec::each(v) { |i| ... }
    w = &mut v;
};
foo();
foo();

Now on the second call to foo() it’s possible that vec::each() might have access to w.

So the options:

So here are various options from dumb to smart:

  1. the compiler does a flow-insensitive analysis with respect to which references exist. All of the examples in the previous section are illegal. To make them safe, you have to explicitly introduce a block, like:

     let mut v = [1, 2, 3];
     for vec::each(v) { |i|
         ...
     }
    
     {
         let w = &mut v; // limit the scope of `&`
     }
    
  2. the flow-sensitive analysis rules I described in the previous section are ok, when things are unclear just do your best;

  3. this whole analysis is too dumb. It should try to determine which references actually escape the stack frame and track what they point at and so forth. Anything it can possibly figure out it should figure out.

I am torn at the moment. I started with #2 but I am tempted to try #1 and just see how painful it really is.

In Favor of Types of Unknown Size

I’m still thinking about vector and string types in Rust and I think I’ve decided what I feel is the best approach. I thought I’d summarize it here and make the case for it. If you don’t know what I’m talking about, see this post for more background. I’ll forward this to the mailing list as well; I’m sorry if it seems like I’m harping on this issue. I just think vectors and strings are kind of central data structures so we want them to be as nice as possible, both in terms of what you can do with them and in terms of the notations we use to work with them.

Summary

First, The Grand ASCII Art Table, summarizing everything (sad fact: M-x picture-mode is way more convenient than making an HTML table). Blank spaces indicate things that are inexpressible in one proposal or the other (for better or worse).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
+---------------------++---------------------+
| This proposal:      || Original proposal:  |
|--------+------------||-------+-------------|
| Type   | Literal    || Type  | Literal     |
|--------+------------||-------+-------------|
| [:]T   |            || [T]   | [1, 2, 3]   |
| []T    | [1, 2, 3]  ||       |             |
| &[]T   | &[1, 2, 3] ||       |             |
| @[]T   | @[1, 2, 3] || [T]/@ | [1, 2, 3]/@ |
| ~[]T   | ~[1, 2, 3] || [T]/~ | [1, 2, 3]/~ |
| [3]T   | [|1, 2, 3] || [T]/3 | [1, 2, 3]/_ |
|        |            ||       |             |
| substr |            || str   | "abc"       |
| str    | "abc"      ||       |             |
| &str   | &"abc"     ||       |             |
| @str   | @"abc"     || str/@ | "abc"/@     |
| ~str   | ~"abc"     || str/~ | "abc"/~     |
|        |            || str/3 | "abc"/_     |
+---------------------++---------------------+

The types []T and str would represent vectors and strings, respectively. These types have the C representation rust_vec<T> and rust_vec<char>. They are of dynamic size, meaning that their size depends on their length. The literal form for vectors and strings are [a, b, c] and "foo", just as normal.

The types [:]T and substr represent slices of vectors and strings. Their representation is the pair of a pointer and a length. They are each associated with a lifetime that specifies how long the slice is valid, and thus can be more fully notated as [:]/&r T and substr/&r, but users will not have to write this very often, if ever.

Vectors, strings, and fixed-length vectors are implicitly coercable to slices just as today. Furthermore, one can explicitly take a slice using a Python like slice notation: v[3:-5] or v[:] to take a slice of the entire vector. It is also allowed to take a slice of a slice. This is where the : in the slice type comes from: it’s supposed to echo this syntactic form.

Fixed-length vectors are written [N]T. They are represented just like a C vector T[N]. The literal form is [| v1, ..., vN]. The leading | serves to distinguish a fixed-length vector. It is random but whatever, this is a specialized use case for C compatibility. The length of the literal form is always derived from the number of items. I opted not to include a way to represent fixed-length strings for the same reasons I previously stated.

Advantages

The big advantage is that everything is written the way that seems to me to be most natural. For example, a vector on the stack is &[1, 2, 3]. A task-local vector is written: @[1, 2, 3]. unique vector is written ~[1, 2, 3]. Same with strings.

I also like the indication of where memory is allocated is orthogonal to what is stored in the memory. The type and unary operators &, @ and ~ tell you where the memory is allocated, and the types which follow tell you what you will find at that memory. If we have types like [1, 2, 3]/@, they combine where the memory is allocated with what you will find there (to be clear, that is by design, so as to avoid the disadvantages in the next section).

There is no need for a literal form for slices. If you create a vector and then use it where a slice is expected, the type will be coercable, so no error will result.

Disadvantages

The primary disadvantage is that the types []T and str are of dynamic length. This implies a kind distinction that does not exist today. I’d be inlined to just make a rule that types of dynamic length cannot be used as the types of local variables, fields, vector contents, nor the values of generic type parameters (and maybe a few other places). Later we could add an explicit kind if that seems necessary. It basically means you would get an error message like “the type [T] has unknown size cannot be used as the type of a local variable, use a pointer like @[T] or &[T]”.

Having types of unknown size are a complication, to be sure, but I feel it is a lesser complication than having special types, expression forms, and rules for vectors and strings. Furthermore, this same case (types of unknown size) has come up from time to time when thinking about other possible future designs, so I am not sure that it can be avoided.

A second, more subtle point is that slices are no longer the shortest type in terms of how they are written, although they are probably the most common thing you will want to use. I am not too worried about this either: [:]T is still fairly short and we will use it ubiquitously. One thing I don’t like is that I find [:] somewhat hard to type. Maybe that will get easier, or maybe something else (e.g, [.] and a slice notation of v[1..3]) would be better.

Other kinds of variably sized types…?

Records of dynamic size are common in C, and we may ultimately have to be able to model that (though we could admittedly use the C trick, where it pretends all types have fixed size when in fact the memory allocated may be greater, combined with unsafe pointers). Still, there is a legitimate use case for allocating a variably-sized vector interior to a record even in Rust code, and we could support that (it’s the same trick that we in fact use to implement vectors themselves—if it’s important enough for us, maybe it’s important enough for our users).

Another example would be base types. We may sometime want to allow records or classes that can be extended with subtypes. In that case, we could say that the base types have variable size, since the number of fields they possess are unknown—this would mean that you only refer to them by pointer, preventing the common C++ problems of slicing and unsafe array arithmetic.

I’m not sure where else this comes up. Perhaps that’s it.

Permission Regions for Race-free Parallelism

I’ve been making a point of reading academic papers on the train as I ride home. It’s so easy to get behind with the sheer quantity of work that is being produced. Anyway, it occurred to me that I ought to try and summarize the papers I read on this blog so that I can I remember my reactions to them.

I’ll start with “Permission Regions for Race-Free Parallelism”, by Westbrook, Zhao, Budimilic, and Sarkar. The basic idea builds off of Habanero Java, which is a kind of fork of the X10 language that Sarkar and his group work on. The basic idea of the paper is to add a language construct permit which looks like:

permit read(x1,...,xn) write(y1,...,yn) {
    /* this code may read fields of x1...xn and write
       fields of y1...yn */
}

For example, imagine a method pop() that removes an item from the front of a linked list:

Node pop() {
    Node tmp = this.next;
    if (tmp != null)
        this.next = tmp.next;
    return tmp;
}

This would be annotated like so:

Node pop() {
    permit write(this) {
        Node tmp = this.next;
        if (tmp != null)
            permit read(tmp)
                next = tmp.next;
        return tmp;
    }
}

A dynamic monitoring system will then guarantee check for races at the granularity of the permit blocks. An effect system also allows a method to be called under the stipulation that reads/writes are permitted of its parameters. Permission regions are not required for final fields, naturally.

Finally, they support a view construct for arrays which allows you to declare permission to access a portion of an array:

region r = ...;
int[.] subA = A.subView(r);
permit write(subA) { ... }

This is reminiscent of my own divide() method in PJs.

Interestingly, they allow the local variables within a permit section to be modified. Presumably each such assignment will lead to a new dynamic conflict check.

To reduce the annotation burden, the compiler will automatically insert permission regions. Basically they find the highest point in the AST that includes all accesses within a given method, but they do not cross async or isolated (the HJ keywords for spawning tasks and for creating transactions). They find this is usually right.

The whole point of this exercise is to reduce the overhead and (I believe) improve the accuracy of dynamic checks. Naturally, the slowdown for monitoring for data races varied dramatically, but it was generally around 1.5 to 2x. There are some exceptions, such as raytracer, which went as high as 22x.

My reaction:

Summary: interested but mildly skeptical.

Their performance numbers seem pretty decent for dynamic monitoring, but I’m not sure it meets their goal of “always on”. HJ’s target audience after all is scientific computing, and 2x slowdown in that field seems like a big deal to me. Still, a lot of people are using R and Python etc so maybe it is “fast enough”. And of course they can optimize further, I’m sure.

The actual semantics of their race check are sort of interesting. A narrow focus on data-races over other kinds of races can lead to programs that do the wrong thing even though they never have any races. The classic example is Java code like this:

void addIfEmpty(/*shared*/ Vector v, Object o) {
    if (v.isEmpty()) v.add(o);
}

These two statements were presumably intended to be atomic, but of course they may not execute atomically. Nonetheless, since Vector in Java is a fully synchronized class, there will be no data races under the technical definition of data races.

In any case, declaring permission regions seems to suggest a sensitivity to this issue, however the use of compiler inference kind of works against this intutition, since the compiler may not know the proper places to insert the checks. (Here, for example, if fully automated the compiler would still insert the permission regions within the vector calls themselves)

But I think, in the end, races vs data races is besides the point. That is, the permission regions are not intended as a kind of “declaration of things that go together” but rather as a practical means of reducing overhead and controlling the granularity of checks for detecting data races. Basically—I gather that they assume the system will be always on and, generally, ignored by programmers. But if they come up against an issue where performance is a problem, they will start using the effect system and explicit permission regions to push these checks to a higher-level.

I’m not sure how effective this will be, however. A lot of the overhead seems to derive from the array view checks, which cannot be automated. Furthermore, when the number of items to be accessed is unbounded, such as when walking a linked list, you cannot push the permission regions out any bigger. It would be nice to see numbers that compare the overhead before they tweaked it and after to get an idea of how much reduction is possible.

It brings to mind my own efforts with PJs etc: I am excited to see that bear fruit, because I think the overhead of such dynamic checks can be made extremely low. Of course my system would not be nearly as flexible as theirs, which is why the checks are so cheap. But basically I like the idea of dynamic checking for races, but I think it is not necessary something you want to just layer on top of a rather broken “everything shared and mutable all the time” system, because the overheads are just too high. Rather, you start with a sane foundation, and you should be able to monitor for violations relatively cheaply and locally.

References

I want to do an introduction to the regions system I’ve been working on. This is work-in-progress, so some of the details are likely to change. Also, I’m going to try some new terminology on for size: although it has a long history in the literature, I think the term “region” is not particularly accurate, so I am going to use the term “lifetime” or “pointer lifetime” and see how it fits.

In this post I’m just going to show some examples of how the new features can be used. In the next post, I’ll lift the curtain a bit and explain how the checks work.

Introduction

Rust has always (at least, as long as I’ve been around) had three sorts of pointers: @T, which is a task-local pointer into the heap; ~T, a unique pointer into the heap, generally (but not exclusively) used for sending data between tasks; and reference mode arguments, used to give a function a temporary pointer.

The goal of this work is to replace reference mode arguments with something more flexible. Reference mode arguments work quite well for many purposes, but they have one primary limitation: they cannot be stored into data structures.

So, in this branch, we (conceptually at least) remove reference mode arguments from the Rust Pointer Pantheon and replace them with reference types, written &T (this is actually a shorthand, as we will see later). I will refer to a variable of reference type as a reference.

References are basically generic pointers. They can point anywhere: into the stack, into the @ heap, into the ~ heap, even into the inside of a record or vector. They can point at anything that a C pointer could point at and can be used in many of the same ways; however, they are free from the errors that C permits. The type checker guarantees that references are always valid, so you can’t have a reference into freed memory, or into a stack frame that has been popped, and so forth.

We’ll get to the full details of how the safety check works later (probably in a separate post). First, I want to give some examples of using references.

Using references

Simple references and borrowing

Let’s create a record type point for use in our examples:

type point = { x: uint, y: uint };

Now, imagine that we have a function which wants to compute the slope of two points. It doesn’t particularly care where those points are allocated. You could write it like so:

fn slope(p1: &point, p2: &point) -> float {
    let y = (p2.y - p1.y) as float;
    let x = (p2.x - p1.x) as float;
    ret y / x;
}

OK, that was fairly straightforward. Now let’s look at how slope() might be called. First, assume that we have some routine which takes a vector of pairs of points allocated on the heap and computes the maximum slope of any of those pairs. Why you would want such a function, I don’t know, but this is how you would write it:

fn max_slope(ps: [(@point, @point)]) -> float {
    ps.max { |(p1, p2)| slope(p1, p2) }
}

You’ll notice that slope() is called with p1 and p2, which have type @point, not &point. The type checker happily accepts this, however, because a reference can point anywhere, including into the heap. This process of converting of kind of pointer to another is called borrowing.

The reason it’s called borrowing is that, in effect, the callee (slope()) borrows a reference from the caller (max_slope()). It is the caller’s job to ensure that this reference remains valid for the duration of the callee. In this case, there is no extra work required to make that true, but in some cases the compiler may be required to increment a ref count or maintain a GC root (this is highly dependent on how the @ heap is managed, naturally).

You can also borrow ~ pointers. This basically works the same as with @ pointers, except that the unique value cannot be moved away (for example, sent to another task) while it is borrowed. The reason for that is that, for the duration of the borrowing, the unique pointer is no longer unique. So if you sent it to another task, for example, then two tasks would have access to it. Even within a single task, if you gave the pointer away, then there would be multiple copies each claiming to be unique, which would lead to double frees and other badness. The key invariant that borrowing maintains is that, while a ~T may be temporarily aliased, all of the aliases are references, not other ~T pointers. So we can always identify the true owner once the borrowing expires.

Right now, borrowing can only occur in method calls. The borrowing lasts for the duration of the method call. In the future, borrowing will also be possible in alt expressions and when assigning a local variable with let. In the former case, the borrowing will last for the duration of the alt expression. In the latter case, the borrow will last until the local variable goes out of scope (until the end of the enclosing block, in other words).

Taking the address of local variables

Sometimes we wish to give away pointers into our local stack. For example, there is a routine today called vec::push(x, y) which has the effect of appending the value y onto the vector x (in place). This can be implemented using references like so:

fn push<T:copy>(v: &mut [T], elt: T) {
    *v = *v + [T];
}

Here the argument &mut [T] indicates a mutable reference: that is, a reference which can be used to modify the data it points at. The requirement to explicitly declare which pointers may be used for modification stems from Rust’s desire to make mutation explicit, and is analogous to the existing @mut T and ~mut T types.

To call push, we might write code like this:

fn accum() {
    let mut v = [1, 2, 3];
    vec::push(&v, 4);
    vec::push(&v, 5);
}

Here we used the & operator to take the address of a local variable so that we could pass it into the push() routine.

An aside: I believe that in the current implementation of the compiler you would have to write vec::push(&mut v, 4)—that is, you would have to declare when taking the address of v that you intend to mutate through this pointer. I believe there is no reason we can’t lift this restriction, however, and allow the compiler to figure it out for itself. (I rather prefer the explicit form in theory, because I like to make it clear when things are being modified, but I suspect it will be annoying in practice)

Copying into the stack

Right now, if you wish to create a record literal on the stack, you have to manipulate it by value. So you might write code like:

fn create_point() {
    let p1 = { x: 3u, y: 4u };
    let p2 = { x: 5u, y: 10u };
    let p3 = if cond {p1} else {p2};
    ...
}

Here the type of p{1,2,3} is point. But often we wish to manipulate values by pointer. In this case, that would make p3 a cheaper copy, for example. Using references, we can write something like this:

fn create_point() {
    let p1 = &{ x: 3u, y: 4u };
    let p2 = &{ x: 5u, y: 10u };
    let p3 = if cond {p1} else {p2};
    ...
}

Here we used the same & operator, but with an rvalue (an expression that is not assignable). This simply allocates space on the stack and copies the value into it. The corresponding type of p{1,2,3} would then be &point, where & is a reference into the stack of create_point().

Placing references into structures

Next let’s look at a case where we wish to store a reference into a structure. This example comes out of the Rust compiler, but it’s a common pattern in practice.

In the Rust compiler, there is a phase of processing called encode in which we generate the metadata for a compiled crate. During this encoding, we have a struct encode_ctxt that stores the various context which is required. Because this structure is only needed during this one phase, it is allocated on the stack, and we pass it from function to function using references (today, using a reference mode argument).

The code to create this encode context looks something like the following:

type encode_ctxt = { /* contents are not important */ };

fn begin_encoding(...) {
    let ecx = &{ /* allocate an encode context */ };
    for items_to_encode.each { |item|
        encode_item(ecx, item);
    }
}

Here you see that begin_encoding() creates a variable ecx, storing the data onto the stack. This context is then passed to each call to encode_item().

What can happen then is that some subpart of the encoding requires its own context. For example, in our metadata encoding, we sometimes have to serialize the AST for an inlinable function. This requires quite a bit more state, but it’s state that is specific to the inlining itself. So we can define a type inline_ctxt that will include both the encoding context ecx along with some other fields:

type inline_ctxt/& = {
    ecx: &encode_ctxt,
    ...
};

What you see here is that the type inline_ctxt is declared like any other record, but it has this /& following the name. This is a declaration that the type will contain references. The record itself then simply embeds the &encode_ctxt as any other field. Note: It’s possible that the /& might become inferred in the future rather than being explicit.

Now I can write functions that create and use the inlined context as follows:

 fn encode_inlined_item(ecx: &encode_ctxt, ...) {
     let icx = &{ecx: ecx, ...};
     ...
     some_helper_func(icx, ...);
     ...
 }

 fn some_helper_func(icx: &inline_ctxt, ...) {
     // ... can use icx, icx.ecx, etc ...
 }

References in boxes

In the previous example, we create a structure on the stack which contained a reference to some data living in an activation somewhere up the stack. It is also possible to place references into heap objects. For example, I could have allocated the inline_ctxt on the heap like so:

 fn encode_inlined_item(ecx: &encode_ctxt, ...) {
     let icx = @{ecx: ecx, ...};
     ...
     some_helper_func(icx, ...);
     ...
 }

 fn some_helper_func(icx: @inline_ctxt, ...) {
     // ... can use icx, icx.ecx, etc ...
 }

In this case, there is not really much reason to do this, as the lifetime of the inline_ctxt is bound to the stack frame that created it. But it can be convenient in a number of scenarios:

  • a long computation might make use of internal data that can be collected before the computation itself completes, and this internal data may need to contain references;
  • allocating values that you plan to return to your caller is most conveniently done with an @ pointer.

This last point is interesting. Basically, in most of our examples we’ve been allocating things on the stack—but you can’t return stuff that’s on your stack up to your caller, clearly (and if you try, in Rust at least, you’ll find that a type error results).

Arenas

One very common C trick for speeding up allocation is to make use of memory pools, also called arenas. If you happen to have a lot of allocations which you plan to do but which will all get freed at one point, then you can allocate a big block of memory and just hand it out piece by piece. Once the pass is done, you free the memory all at once. The key is that you never track whether an individual allocation has completed or not, so you avoid a lot of overhead. The problem with arenas is that, as typically implemented, they are unsafe, because you might free the arena but still hold on to pointers that point into the arena. This is where lifetimes come in.

Using a reference, we can allocate memory in arenas and be sure that the reference will not outlive the arena itself. For example, this function will allocate a new point in an arena and return it:

fn alloc_point(pool: &arena) -> &point {
    ret new (pool) { x: 3u, y: 4u };
}

In this case, the type checker will assign the allocated point the same lifetime as the arena itself. So the point can be used so long as the arena is valid.

Lifetimes

At this point, I’ve shown you a lot of examples of how references can be used, but I have given basically no intution for how it is that the compiler can prevent a reference from being used when it is no longer valid.

The basic idea is that every reference type &T is in fact shorthand for a type written &a.T, where a is some kind of lifetime. The lifetime of a reference defines when it is valid. These lifetimes correspond to the dynamic execution of some function, block, expression, whatever.

To make this clearer, let’s look at an example. Suppose I have this simple function. I have also shown the various lifetimes (named ac) graphically along the right-hand side.

fn scoped_lifetimes(x: @uint) { // a
    let y = 3u;                 // |
                                // |
    if cond {                   // | b
        let z = 4u;             // | |
                                // | | c
        borrow(x) /* 1 */       // | | |
                                // | | -
    }                           // | -
}                               // -

fn borrow(x: &uint) {...}

There are three distinct lifetimes in the function scoped_lifetimes(), each nested within one another. The outermost one is a, which corresponds to the entire function activation. The expression &y, which takes the address of the local variable y, would have type &a.uint.

The next lifetime is b, which corresponds to the “then-block” of the if statement. The expression &z would have the type &b.uint, because after the if statement concludes the variable z is no longer in scope.

Finally, the lifetime c corresponds just to the call to borrow(x). Here, the variable x is coerced into a region pointer with lifetime &c.uint.

Now let’s examine borrow() a bit more closely. The definition of borrow is in fact shorthand for something like the following:

                         //  d
                         //  .
                         //  .
fn borrow(x: &d.uint) {  //  |
    ...                  //  |
}                        //  |
                         //  .
                         //  -

In other words, the &uint type we saw before in fact expands to a lifetime with a unique name; we’ll call this name d (in fact, all uses of & within the types of a function’s parameters or its return type are references to a special region called the anonymous region—it acts just like a named region, except that it doesn’t have a name).

The lifetime d is a bit different from the other lifetimes we’ve seen, as it appears within the function declaration itself: it is in fact a lifetime parameter. That is, it corresponds to some lifetime which the caller will specify—the callee, borrow() in this case, doesn’t know precisely how long the lifetime d lasts, it only knows that d includes the entire execution of the callee. I’ve tried to depict this in my ASCII art diagram using dots to represent the unknown duration, with the pipes | representing what is known for certain. In the call to borrow(x) which we saw before, the lifetime parameter d would be mapped to the lifetime c from scoped_lifetimes().

Detecting errors

The compiler uses these symbolic lifetimes to prevent problems. Consider something simple like this:

fn give_away() -> &uint {
    let y = 3u;
    ret &y;
}

Here there is an error because the function is attempted to return a pointer into its own stack frame. To see how the compiler detects this, consider the lifetimes involved:

                            // a
                            // .
                            // .
fn give_away() -> &a.uint { // | b
    let y = 3u;             // | |
    ret &y;                 // | |
}                           // | -
                            // .
                            // -

Here I have called the anonymous lifetime parameter a. The expression &y has type &b.uint, which does not match the expected type &a.uint, and so we get a type error. This type error is warning us that the lifetime of the pointer we are trying to return (b) is shorter than the lifetime which was declared (a).

Ta ta for now

There’s more to tell, but I’ll stop here, as this post is already plenty long.

Vectors, Strings, and Slices

We’ve been discussing a lot about how to manage vectors and strings in Rust. Graydon sent out an excellent proposal which allows for a great number of use cases to be elegant handled. However, I find the syntax somewhat misleading. I’ve proposed one alternative on the mailing list, but I now find I don’t like it, so I thought I’d brainstorm a bit and try to find something better.

There are really three use cases:

  • vectors and strings, which are either allocated on the task heap (@) or exchange heap (~);
  • slices, which are a cheap, stack-bound way to represent subvecs and substrings;
  • fixed-length vectors, which are mainly for C compatibility.

In this post I’m going to focus on the first two cases. The last use case (fixed-length vectors) is, I think, quite distinct from the first two, and we should separate it out. I have also omitted one use case which Graydon’s proposal included: fixed-length strings. I don’t think that the type str/10 is of much use, as it refers to a by-value string that is always 10 characters long. This is not like a fixed-length buffer that can hold strings up to 10 characters long. Rather, as I understand it, it can only store strings of exactly 10 characters. How often are we likely to want that?

Representation

The representation of a vector or string is something like:

struct rust_vec<T> {
    int fill;    // How many bytes are used
    int alloc;   // How many bytes are allocated
    T   data[0]; // Inline data
};

Note in particular that this structure does not have a fixed size. Rather, it will vary depending on how many items are present. This indirection is efficient but causes us a bit of trouble.

The representation of a slice which Graydon proposed is something like this:

struct slice<T> {
    T *data;
    int length;
};

Basically, the pair of a pointer and a length of memory.

As an aside, Fixed-length vectors, have yet a third representation: T[N]. In other words, just a C-like vector. So you can see that slices, “vectors as a whole”, and fixed-length vectors are quite different things to the compiler.

Proposal the first

One idea might be something like this:

Proposed   Graydon   Representation
[T]        [T]       slice<T>
vec<T>               rust_vec<T>
@vec<T>    [T]/@     rust_box<rust_vec<T>>*
~vec<T>    [T]/~     rust_vec<T>*

substr     str       slice<char>
str                  rust_vec<char>
@str       str/@     rust_box<rust_vec<char>>*
~str       str/~     rust_vec<char>*

The literal forms would basically stay the same as they are today. So [x1, x2] has the type vec<T> and "foo" has the type str.

I have intentionally drawn a big distinction between the type of a slice ([T]) and the type of a vector (vec<T>). I think these things are similar but different and people might be easily confused if the notation is too similar.

There are two types here that cannot be expressed in the original system: vec<T> and str. There is a good reason that these types are inexpressible: they do not have a fixed size. So, allowing them as types introduces a certain danger. For example, a function like the following could not be compiled:

 fn foo(x: @vec<T>) {
     let y = *x;
     ...
 }

After all, the size of the stack frame could not be correctly calculated, it would depend on how much data was in x. It is particularly annoying to deal with this situation due to the possibility of writing generic functions like

 fn gen_foo<U>(x: @U) {
     let y = *x;
     ...
 }

There are two solutions to this, which are really the same solution in different guises. The simplest solution is to say that type variables cannot be bound to the types vec<T> and str. This would prevent us from calling gen_foo() with a vector or a string. We’d also have some kind of special treatment around assignments so that let x = [1, 2, 3] ends up with a slice, I guess. Have to think a bit about that.

Alternatively, one could have a bound that indicates data of a known size. This kind would be required to manipulate instances of T by value. But this could rapidly become annoying. You might prefer to have the default be that types do have a known type and you have to say when the type variable might not. This is also ok although generally type bounds enable operators, not disable them.

Both solutions are somewhat annoying and I know that Graydon was trying to avoid them in his design.

Proposal the second

If we wanted to avoid the possibility of types whose size is not known, then we have to take a different tack. We can’t have the @ be a prefix anymore. I’d still rather it come near the front of the type, and not tacked on the end. So far, my preferred notation for this is something like the following:

Proposed   Graydon   Representation
[]T        [T]       slice<T>
[@]T       [T]/@     rust_box<rust_vec<T>>*
[~]T       [T]/~     rust_vec<T>*

""         str       slice<char>
"@"        str/@     rust_box<rust_vec<char>>*
"~"        str/~     rust_vec<char>*

Yes, that’s right, I just proposed using "" as the way to write the type for strings. Pretty wacky, I know. But it seems like we need a type name that has two parts, a begin and an end, so that we can stick the @ and ~ inside of them. An alternative might be to use more words (str vs tstr vs ustr or something).

The literal forms for vectors could be something like [@|...] and [~|...]. I don’t know about strings.

What about fixed-length vectors?

I don’t know. We could do [N]T, but then it kind of looks like a slice. I personally lean towards something like T * N. We also need an expression form. To be honest, I don’t care about this that much, it seems like macros could solve it too.

Summary…

None of these ideas seem perfect. I’m mostly tossing them out there to ensure we keep talking about it.

p.s., I just realized as I read over the post that I forgot about mutability. Something like [mut T] cannot be written as vec<mut T>, at least not today. Sigh. Well, I’ll post this blog post anyway. As I said, nothing here is perfect, just wanted to capture some of the things I’ve been thinking about.

On Types and Type Schemes

After my recent dalliance in Matters of a Truly Trivial Nature, I’d like to return to Matters Most Deep and Profound. I’m running up against an interesting question with regions that has to do with the nature of function types like fn(&int): up until now, I’ve assumed that this refers to a function that takes an integer pointer in some region that is specified by the caller. That is, it is a kind of shorthand for a type that might be written like fn<r>(&r.int), where the <r> indicates that the function type is parameterized by the region r.

But first, a digression on types and type schemes…

This notation is analogous to a generic function, like:

fn identity<T>(t: T) -> T { ret t; }

However, there is an important distinction. In Rust, as in ML, parameterization only occurs on named items. So, you can have a named function identity defined generically, but you cannot have a type fn<T>(T) -> T.

This is an interesting and subtle point. In fact, all Rust types are monotypes, meaning types that refer to exactly one thing. Now, it may not be known precisely what that thing is, but there must be a name for it. So, the type of the t parameter is T, which is a type variable. This is a monotype, it can only refer to one thing: the type T. It just so happens, however, that we do not know when typechecking identity what that type T is.

The type of the identity function itself, however, cannot be represented as a monotype. We cannot name a specific type for its parameter and return value, it could safely be used with any type. To accommodate this concept, ML introduced the idea of a type scheme, also called a polytype. (at least by Wikipedia, I’ve never heard the term before. but it seems logical.)

A type scheme is basically a type along with a set of bound type variables. A bound type variable is one that is defined within the scheme itself. So, if you have the type scheme fn<T>(T) -> T, then the variable T is said to be bound in this scheme. In a scheme like fn<T>(T, U) -> T, the variable T is bound, but the variable U is called free, as it is not defined in the scheme. Note that in a monotype all variables are free, as monotypes do not define any type variables.

So, in a way, a like fn<r>(&r.T) is really a monotype, although it does not bind type variables. But it can still refer to many concrete types. This entails complexity.

…and now back to regions.

So, the question is, should region variables be bound or free within a function type? It certainly makes life simpler if they are always free, and it still results in a fairly expressive system.

But first let’s examine why bound regions make life complex. To help keep things clear, I will use the explicit “bound region” notation I introduced earlier, even though it’s not an actual Rust type, and I will eschew anonymous regions. This means that the notation for writing function types and so forth will be a bit heavier than it would be in “real life”.

I will use a few conventions: lowercase letters early in the alphabet like a, b, and c refer to bound regions. Lowercase letters late in the alphabet (r, s) refer to free regions. Plus, I generally drop the types of a region pointer if they are not important, so let &r be shorthand for something like &r.int.

Subtyping of bound regions

Renaming of bound regions. Imagine a type A=fn<a>(&a) and a type B=fn<b>(&b). Is A a subtype of B? Clearly, the answer should be yes: they are basically the same type, as you could rename the bound variable a in A to b, and they would be precisely the same. So here we find that we have to consider possible renamings of bound regions when considering subtyping.

Instantiating bound regions. OK, well, now imagine the type C=fn(&r). Here, the region variable r is not bound but rather free. So in this case, is A a subtype of C? I would argue that the answer should be yes: after all, if you instantiate A with the value a for r, you get fn(&r.T), the same as C. So we ought to consider possible instantiations of bound variables as well.

Coallescing bound regions. Finally, one more example:

fn<a,b>(&a, &b)     <:    fn<c>(&c, &c)

Here the subtype is more flexible than the supertype. The subtype accepts two region pointers in any two regions, but the supertype requires that they be in the same region.

…with type variables, too

Now, just to make things more fun, imagine we throw in a type variable X into the mix. Here we play the role of the inference engine, which is trying to find a value for X. So the question becomes, is there any type that I can assign to X which would make the subtype relation true?

Referring to free regions. Let’s start with a simple example. Consider:

fn(&r)    <:    fn(X)

In this case, r is free, so we can assign X the value of &r and everything should be fine.

Bound regions. Ok, what if the subtype refers to a bound region?

fn<a>(&a)    <:    fn(X)

We can still handle this case, but it requires a region variable as well. In other words, if we create a region variable R, then we can substitute that region variable for a and obtain:

fn(&R) <: fn(X)

Now we can assign X to &R and then assume that the inference engine will find a suitable region for R. This will be based on constraints from the rest of the program.

Multiple parameters. Of course, if there are multiple parameters, there may be interactions between them:

fn<a>(&a, &a)    <:    fn(X, &r)

In this case, because r appears free in the supertype, X can be assigned &r. That would mean that a can be instantiated with r and the subtyping relation holds.

Bound regions within the supertype. What if the region in the supertype is bound, not free?

fn<a>(&a, &a)    <:    fn<c>(X, &r)

In this case, there is no value of X which is suitable. This is perhaps not obvious: you might think that &c would be a fine value for X. But that means that the value of X would refer to the region c, which is bound within the type. It’s a scoping violation. The name c has no meaning outside of the supertype, whereas the type X (which appears free) does have meaning. So X cannot refer to regions bound within the supertype.

Woah.

Yeah, it’s complex. I haven’t come up with an elegant implementation for the inference engine that accommodates all of these scenarios. One option is to not handle all of these cases. I also ought to read up in The Literature as well as the implementations of other languages (e.g., Haskell, Scala) to see what they do in similar scenarios. Still, I dislike the idea of having things in our type system that require citations to explain.

So, can we just drop bound type variables in function types?

Actually, I think we definitely could (not yet sure if we should). Most things I’ve thought of will “just work”, and when they won’t, there is workaround via interfaces (more on that later).

First, an example of something that works:

fn iter(v: [T], f: fn(&T)) { 
    uint::range(0, v.len()) { |i|
        f(&v[i]);
    }
}

This function iterates over each item in the slice v and invokes the function f (I am assuming Graydon’s work on slices and vectors is complete). If we fully expand this type to see all the regions involved, you end up with:

fn iter(v: [T]/&a, f: fn/&a(&a.T)) { ... }

This signature is probably a bit confusing. As usual, I find the best way to think about regions is as lifetimes (in fact, I am considering changing my terminology over to use the word lifetime exclusively). So what this notation means is that there is some span of time a in which the vector data and the function closure is valid. The function itself expects a pointer which is also valid for this same span of time (in this case, that pointer will be a pointer into the vector contents, so its lifetime comes from there). This span of time a will generally be the call to iter() itself.

What doesn’t work?

Basically, what doesn’t work is when you want to have a function whose arguments can have lifetimes whose lifetime is not yet known. This most commonly occurs when functions are stored into records. One example that comes to mind is the hash and eq functions that we use to implement hashtables right now.

Currently, our hashtables are defined with a structure something like:

type hash<K,V> = {
    hashfn: fn(&K) -> uint,
    eqfn: fn(&K, &K) -> bool,
    ...
};

Here you see that the hashfn takes a pointer to the key K and returns the hash (a uint). The eqfn takes two keys and returns a boolean if they are equal.

The key point here is that the lifetimes for the key arguments are not known and cannot be known in advance. The data for the hashtable is stored in a structure on the heap and so its lifetime is not stack-based and hence has no region; for any given hashtable operation, the current array will be borrowed and thus tied to the stack of that particular operation, but these future operations cannot be given a name.

Did you say something about a workaround?

Yes, we can use ifaces to work around this problem. Imagine an iface:

iface hash_key_ops<K> {
    fn hash(k: &K) -> uint;
    fn eq(k1: &K, k2: &K) -> bool;
}

I am mildly abusing ifaces here because the “self” would not be a particular key but rather some singleton object representing the hash function itself. For example, I might define:

enum murmur_hash { murmur_hash };
impl of hash_key_ops<str> for murmur_hash {
    fn hash(k: &str) -> uint { ... }
    fn eq(k1: &str, k2: &str) -> bool { k1 == k2 }
}

Now we could define the hashtable like:

type hash<K,V> = {
    ops: hash_key_ops/@<K>,
    ...
};

fn new_hash<K,V>(ops: hash_key_ops/@<K>) { ... }

Now whenever we want to hash a key we can invoke tbl.ops.hash(key). The key point is that the named functions in an iface, just like function items, can have polytypes even though normal function types are monotypes. Then each time we invoke hash() we would instantiate the bound regions with fresh region variables.

Of course, if we were going to use ifaces with hashtables, we might rather define the iface over the key type itself. That raises some interesting issues about instance coherence which I plan to discuss in a blog post Real Soon Now, but if you’re curious about that you may also want to read my mailing list post on the topic as it is still my preferred solution.