This post introduces an approach to memory safety that I believe is more practical and more ergonomic than the available alternatives.
It all started way back when, and was inspired by things I read and wrote:
Three years of development later, I believe I finally got it. The proposal is complete.
Moreover, I have implemented it in my own programming language I intend to release soon-ish, and I want to share the design decisions and the entire path from “huh, why not” to “omg it’s live”.
So, TL;DR: linear types (which are dropped exactly once) + abstract interpretation + a bunch of tricks allows us to eliminate the same classes of bugs as Rust does (at least in non-concurrent environments) plus memory leaks, and we can extend the approach to also cover concurrent environments, all the while being more ergonomic and less restrictive.
Sounds fun? Let’s dig in.
It is safe - it completely eliminates entire classes of bugs, such as:
Single ownership enables linearity - each value is dropped exactly once - and prohibits ownership cycles. Together with the flow-sensitive type system built to enforce it, these eliminate most of the above. Buffer overflows and OOB accesses are covered separately, but the mechanics of the rest of the system make dealing with these easy and efficient.
It is sound - I will demonstrate over the course of this series that the claims hold for arbitrary inputs. There are no holes that can be used to break the guarantees provided from inside the system.
It is NOT simple - there is a fairly large number of primitives working together so that the whole system can uphold the safety guarantees promised.
It is NOT concerned with concurrency - though the “fearless concurrency” guarantees are a natural extension to the proposed system, it has not been implemented in a complete enough way to demonstrate the viability of the approach. I will expand on this in a future post once I get it up and running.
It is NOT claiming to be “zero cost”, though it keeps runtime overhead to the minimum - it introduces runtime checks (a single branch per indeterminate access) if the compiler cannot statically prove availability.
Consider this pseudocode:
var x: T = new T;
if random() > 0.5 {
drop x;
}
print(x);What this code does is it conditionally consumes a value.
There are two ways this could go in a real language. C++ doesn’t particularly care and will happily compile this code:
#include <cstdlib>
#include <cstdio>
int main() {
int *i = new int(42);
if ((double)rand() / RAND_MAX > 0.5) {
delete i;
}
printf("i=%d\n", *i);
return 0;
}Which will then proceed to invoke UB in about 50% of runs. A modern C++ developer would reach for std::unique_ptr and std::optional here - and they would help, partially. RAII via smart pointers eliminates the manual delete, and optional gives you a way to represent “maybe moved.” But unique_ptr only manages heap-allocated objects, and the type system does not enforce the optional check - operator* on an empty optional is undefined behavior, and even .value() only gives you a runtime exception instead of a compile-time error. It is still on you to remember.
In Rust, though, this code does not compile at all:
fn main() {
let x = Box::new(42);
if rand::random::<f64>() > 0.5 {
drop(x);
}
println!("{}", x); // error[E0382]: borrow of moved value: `x`
}Rust takes a very different approach. The compiler tracks moves through control flow - it sees that x might have been moved in the if branch, and rejects the program outright. Rust’s ownership model requires that every variable’s move state is statically known at every point in the program - a conditionally-moved value violates that requirement, so the program is rejected. You can wrap the value in Option<T> yourself and .take() it manually, but Rust won’t do that for you - the burden is on the developer to restructure the code upfront.
So, what if there was a third way between these two?
The proposed solution is straightforward:
var x: T = new T;
if rand() > 0.5f {
drop x;
}
// <- At this point, typeof(x) is Option<T>The type of the value is now control-flow-dependent - the compiler evaluates it as it goes through the program, widening it each time control flow diverges to accommodate for both possibilities. Then it becomes the developer responsibility to narrow it down when they want to use it:
var x: T = new T;
if rand() > 0.5f {
drop x;
} // x is _widened_ to an Option<T>
if x {
// x is definitely available in this branch and can be used
} else {
// x is definitely not available
} // x is _widened_ to an Option<T> againOne way to view this is to consider which information is available at the compiler at various points:
x, which is expressed by the type system as widening type of x to Option<T>x has a definite availabilityCompared to C++ approach, we now force the developer to consider the state space explicitly and avoid the crash, because the typechecker will catch all attempts to use an Option<T> where a T should be used, or to use a definitely non-available value.
Compared to Rust approach, we gain flexibility at a cost of a runtime check - a single null/tag comparison at the point of refinement.
It should be noted that this is not a new idea. If anything, one of the most popular languages in the world, TypeScript, does exactly that. However, TypeScript compiles to JavaScript - a language with garbage collection and shared ownership, which does not concern itself with lifetimes, memory or resource management issues, or concurrency, which are all something I need to cover.
This is just the tip of the iceberg, the very beginning of the system. When we cover more ground - aggregates, references, function calls, dynamic dispatch, lambdas and closures - it will grow to accommodate new requirements.
One more thing that I also wanted is that each value is guaranteed to be dropped exactly once. This is known as linear typing. When linear typing is discussed, the guarantee is usually formulated as “used exactly once”, but what constitutes a use can vary. In my case, use == drop.
With the proposed approach it becomes trivially simple:
T, it is being dropped when it is either going out of scope or its owner is going out of scope, or manually - which transitions its type to NoneOption<T> where T is an owned type - the compiler inserts a runtime check and a conditional drop instead at the end of the scope, but these can also be dropped manuallyThis raises a question about the refined branch:
var x: T = new T;
if rand() > 0.5f {
drop x;
} // x is _widened_ to an Option<T>
if x {
// x is definitely available in this branch and can be used,
// but what type is it?
} else {
// x is None
} // x is _widened_ to an Option<T> againWhich type, exactly, will x have when it is refined to something available in the second conditional statement?
If it’s refined to T, it will be dropped at the end of the scope of the if branch. That would be silly - a given value could only ever be refined once, used, then immediately dropped when the branch ends. Safe, but draconian.
Instead, we define a new type - Some<T> - whose only purpose is to avoid being dropped, or to serve as proof of availability without taking ownership.
This is one of the many types that are handled by the type checker in a special way; namely:
Option<T> - constructing it from a T would move ownership in, and since Some<T> is not auto-dropped, the value would leakOption<T> - if the value contained in the unwrapped Option<T> is somehow dropped, the refined view of that option should not be usable. I will cover dependencies and how they help us in a later postOption<T> invalidates the Some<T> - the dependency is bidirectionalT. The developer can explicitly drop it - this consumes the underlying value and sets the original Option<T> to NoneAs I said - definitely not simple. But not that complicated either.
The approach described above has roots in several well-established areas of programming language theory.
Flow-sensitive typing allows the type of a variable to change as the program executes. Most type systems are flow-insensitive - a variable declared as T stays T for its entire scope. Flow-sensitive systems, such as TypeScript’s control flow narrowing, track how types evolve along different execution paths. What we add is applying this to ownership - the availability of a value is part of its type, and that availability changes as the program flows through moves, drops, and conditional branches.
Refinement types allow types to be narrowed by predicates. When we write if x { ... }, we are refining the type of x from Option<T> to Some<T> (or to None in the else branch). This is a direct application of refinement typing - the conditional acts as a proof that the value is available, and the type system reflects that proof.
There are several parts of the system that link values and their types during compilation - Some<T> depends on the Option<T> it was refined from, and as we will see in later posts, references depend on the values they point to. This is related to dependent typing in the limited sense that type validity depends on specific program values and ownership relationships. The system does not attempt full dependent typing in the Idris or Agda sense, but it does track value-to-type dependencies across function boundaries and through control flow.
Abstract interpretation provides the unifying framework. What the compiler does is interpret the program abstractly in the availability domain - instead of computing actual values, it computes whether each variable is definitely available, definitely unavailable, or indeterminate. Branch joins widen the state, and conditionals narrow it. This is a standard abstract interpretation over a simple lattice: T (available) and None (unavailable) are the precise states, Option<T> is their join.
It is worth noting that availability is not the only domain the compiler interprets in. Later posts will introduce additional domains - each with its own lattice - interpreted over the same control flow structure. Dependency tracking, reference validity, and ownership of aggregates all follow the same abstract-interpretation approach.
I will maintain a running list of the rules, invariants, and behaviors of the system as we go. Each post will add to it. This list may seem ad-hoc and chaotic because there is no single syntactic trick that everything falls out of - there are several underlying principles playing together, which generate many small rules.
I warned you at the beginning that this is not simple.
Here’s the thing, though: you have to uphold these rules for safety anyway - in C++ you do it in your head, in Rust you do it by fighting the borrow checker. All we do here is mechanically shift the responsibility from the developer to the compiler. Each rule is grounded in well-established theory - we just apply it in a way that doesn’t require the developer to think about it.
Option<T>Option<T> is itself an owned type when T is ownedOption<T> can be refined to Some<T> or None by a conditional checkSome<T> is non-owning - refinement does not transfer ownershipSome<T> obtained by refining Option<T> and the source Option<T> depend on each otherSo this is the end of the beginning, and there’s much, much more left to cover.
In the next post (UPD: available here) I will generalize this approach to aggregates such as records and arrays, and introduce references to allow sharing values without transferring ownership.