Rust Borrow and Lifetimes
Rust is a new programming language under active development toward 1.0. I might write another blog about Rust and why I think it’s great, but today I’ll just focus on its borrow and lifetimes system, which has stumped many Rust newcomers including myself. This post assumes you have some basic understanding of Rust. If not yet, you may want to read its Guide and Pointer Guide first.
EDIT (Dec-2): Added 8 scope charts for code samples.
Resource ownership and borrow
Rust achieves memory safety without GC by using a sophiscated borrow system. For any resource (stack memory, heap memory, file handle and so on), there is exactly one owner which takes care of its resource deallocation, if needed. You may create new bindings to refer to the resource using &
or &mut
, which is called a borrow or mutable borrow. The compiler ensures all owners and borrowers behave correctly.
Copy and move
Before we jump into the borrow system, we should know how Rust handles copy and move. This SO answer is a great read. Basically, in assignments and function calls:
- If the value is copyable (only involving primitive types, no resources e.g. memory or file handle involved), the compiler defaults to copy.
- Otherwise, the compiler moves (transfers) the ownership and invalidates the original binding.
In short, pod (plain old data) => copy, non-pod (linear types) => move.
Here are a few additional notes for your reference:
- Rust copy is like C. Every by-value use of a value is a byte copy (shallow memcpy copy) instead of a semantic copy or clone.
- To make a pod struct non-copyable, you may use a NoCopy marker field, or implement Drop trait.
After a move, the ownership is transferred to the next owner.
Resource deallocation
Rust frees any resource as soon as its ownership disappears, that is, when:
- the owner goes out of scope, or
- the owning binding changes (thus the original binding becomes void).
Owner’s and borrower’s privileges and restrictions
This section is based on the Rust Guide, with a mention of copy and move in the privileges part.
An owner has some privileges. It may:
- control resource deallocation,
- lend the resource, immutably (multiple borrows) or mutably (exclusive), and
- hand over the ownership (with a move).
An owner has some restrictions, too:
- During a borrow, the owner may NOT (a) mutate the resource, or (b) mutably lend the resource.
- During a mutable borrow, the owner may NOT (a) access the resource, or (b) lend the resource.
A borrower has some privileges, too. In addition to accessing or mutating the borrowed resource, a borrower may also share the borrow further:
- A borrower may share (copy) an immutable borrow.
- A mutable borrower may hand over (move) the mutable borrow. (Note that mutable reference is moved by default.)
Code samples
Enough talk. Let’s see some code. (You may run Rust code at play.rust-lang.org.) In all examples below, we’ll use struct Foo
which is non-copyable because it contains a boxed (heap-allocated) value. Using non-copyable resources makes the operations more restrictive, which is a good thing for learning.
For every code sample, we also provide a “scope chart” to illustrate the scopes of owner, borrowers etc. The curly braces in the header line match the curly braces in the code.
Owner cannot access resource during a mutable borrow
This code wouldn’t compile if we uncomment the last println!
line:
struct Foo {
f: Box<int>,
}
fn main() {
let mut a = Foo { f: box 0 };
// mutable borrow
let x = &mut a;
// error: cannot borrow `a.f` as immutable because `a` is also borrowed as mutable
// println!("{}", a.f);
}
{ a x * }
owner a |_____|
borrower x |___| x = &mut a
access a.f | error
It violates owner’s restriction #2(a). If we put let x = &mut a;
in a nested block, the borrow ends before the println!
line and this would work:
fn main() {
let mut a = Foo { f: box 0 };
{
// mutable borrow
let x = &mut a;
// mutable borrow ends here
}
println!("{}", a.f);
}
{ a { x } * }
owner a |_________|
borrower x |_| x = &mut a
access a.f | OK
Borrower can move the mutable borrow to a new borrower
This code shows the borrower’s privilege #2: mutable borrower x
can hand over (move) the mutable borrow to a new borrower y
.
fn main() {
let mut a = Foo { f: box 0 };
// mutable borrow
let x = &mut a;
// move the mutable borrow to new borrower y
let y = x;
// error: use of moved value: `x.f`
// println!("{}", x.f);
}
{ a x y * }
owner a |_______|
borrower x |_| x = &mut a
borrower y |___| y = x
access x.f | error
After the move, the original borrower x
can no longer access the borrowed resource.
Borrow scope
Things start getting interesting if we pass the references (&
and &mut
) around, and that’s where many Rust newcomers’ confusions begin.
Lifetime
In the whole borrow story, it’s really important to know where a borrower’s borrow starts and ends. In the Lifetimes Guide, it’s called a lifetime:
A lifetime is a static approximation of the span of execution during which the pointer is valid: it always corresponds to some expression or block within the program.
However, I would like to use the term borrow scope to describe the scope where the borrow is effective. Note that it actually differs from the lifetime definition above. (I first saw that term in a Rust RFC discussion, though my definition may differ.) I will give reasons why I avoid using lifetimes later. For now, let’s just put lifetimes aside.
& = borrow
A few things about borrow:
Firstly, just remember &
= borrow and &mut
= mutable borrow. Wherever you see an &
, there is a borrow.
Secondly, when an &
shows up in any struct (in its field) or function/closure (in its return type or captured references), the struct/function/closure is a borrower, and all borrow rules apply.
Thirdly, for every borrow, there is exactly an owner and a single or multiple borrowers.
Borrow scope extension
A few things about borrow scope:
Firstly, a borrow scope:
- is the scope where the borrow is effective, and
- is not necessarily the lexical scope of the initial borrower, because the borrower can extend the borrow scope (see below).
Secondly, a borrower can extend the borrow scope through a copy (immutable borrow) or move (mutable borrow) that takes place in assignments or function calls. The receiver (can be a new binding, struct, function or closure) then becomes a new borrower.
Thirdly, a borrow scope is the union of all borrowers’ scopes, and the borrowed resource must be valid through the whole borrow scope.
Borrow formula
From the last point, we have this borrow formula:
resource scope >= borrow scope = union of all borrowers’ scopes
Code sample
Let’s see some example of borrow scope extension. The struct Foo
is the same as before:
fn main() {
let mut a = Foo { f: box 0 };
let y: &Foo;
if false {
// borrow
let x = &a;
// share the borrow with new borrower y, hence extend the borrow scope
y = x;
}
// error: cannot assign to `a.f` because it is borrowed
// a.f = box 1;
}
{ a { x y } * }
resource a |___________|
borrower x |___| x = &a
borrower y |_____| y = x
borrow scope |=======|
mutate a.f | error
Even though the borrow happens inside the if
block and the borrower x
goes out of scope after the if
block, it has extended the borrow scope through an assignment y = x;
, so there are two borrowers: x
and y
. According to the borrow formula, the borrow scope is the union of borrower x
and borrower y
’s scopes, which ranges from the first borrow let x = &a;
through the end of the main
block. (Note that the binding y
is not a borrower before the y = x;
line.)
You might have noticed that the if
block will never get executed since the condition is always false
, but the compiler still forbids the resource owner a
to access its resource. This is because all the borrow checking happens at compile-time, nothing to do with the program runtime execution.
Borrowing multiple resources
So far, we only focus on the borrow of a single resource. Can a borrower borrow multiple resources? Of course! For example, a function may take two references and returning one of them depending on certain criteria, e.g. which one has a larger value in its field:
fn max(x: &Foo, y: &Foo) -> &Foo
The max
function returns an &
pointer, hence it is a borrower. The return result can be from either input parameter, so it is borrowing two resources.
Named borrow scope
When there are multiple &
pointers as inputs, we need to specify their relationship using named lifetimes as defined in the Lifetimes Guide. But for now, let’s just call them named borrow scopes.
The above code wouldn’t be accepted by the compiler without specifying the relationship between borrowers, i.e. which borrowers are grouped in which borrow scope. The following implementation is valid:
fn max<'a>(x: &'a Foo, y: &'a Foo) -> &'a Foo {
if x.f > y.f { x } else { y }
}
(All resources and borrowers are grouped in borrow scope 'a.)
max( { } )
resource *x <-------------->
resource *y <-------------->
borrow scope 'a <==============>
borrower x |___|
borrower y |___|
return value |___| pass to the caller
In this function, we have one borrow scope 'a
and three borrowers: the two input parameters, and the function return result. The aforementioned borrow formula still applies, but now every borrowed resource must satisfy the formula. See the example below.
Code sample
In the following code, let’s use the above max
function to pick up the bigger Foo
between a
and b
:
fn main() {
let a = Foo { f: box 1 };
let y: &Foo;
if false {
let b = Foo { f: box 0 };
let x = max(&a, &b);
// error: `b` does not live long enough
// y = x;
}
}
{ a { b x ( ) y } }
resource a |________________| pass
resource b |__________| fail
borrow scope |==========|
temp borrower |_| &a
temp borrower |_| &b
borrower x |________| x = max(&a, &b)
borrower y |___| y = x
Until let x = max(&a, &b);
, things are fine because &a
and &b
are temporary references which are valid only in the expression, and the third borrower x
borrows the two resources (either a
or b
but to the borrow checker, it borrows both) till the end of the if
block, so the borrow scope is from let x = max(&a, &b);
to the end of the if
block. Both resources a
and b
are valid through the whole borrow scope, hence satisfying the borrow formula.
Now if we uncomment the last assignment y = x;
, y
becomes the fourth borrower, and the borrow scope is extended to the end of the main
block, causing resource b
to fail the test of the formula.
Struct as a borrower
In addition to functions and closures, a struct can also borrow multiple resources by storing multiple references in its field(s). We’ll see some examples below and how the borrow formula applies. Let’s use this Link
struct to store a reference (an immutable borrow):
struct Link<'a> {
link: &'a Foo,
}
Struct to borrow multiple resources
Even with only one field, struct Link
can borrow multiple resources:
fn main() {
let a = Foo { f: box 0 };
let mut x = Link { link: &a };
if false {
let b = Foo { f: box 1 };
// error: `b` does not live long enough
// x.link = &b;
}
}
{ a x { b * } }
resource a |___________| pass
resource b |___| fail
borrow scope |=========|
borrower x |_________| x.link = &a
borrower x |___| x.link = &b
In the above example, borrower x
is borrowing resource from owner a
, and the borrow scope is till the end of the main
block. So far so good. If we uncomment the last assignment x.link = &b;
, x
is also trying to borrow resource from owner b
, which would make resource b
to fail the test of the borrow formula.
Function to extend borrow scope without a return value
A function without a return value can also extend the borrow scope through its input parameters. For example, this function store_foo
takes a mutable reference of Link
, and stores a reference (immutable borrow) of Foo
in it:
fn store_foo<'a>(x: &mut Link<'a>, y: &'a Foo) {
x.link = y;
}
In the following code, the resource owned by a
is the borrowed resource; the Link
struct mutably referenced by x
is the borrower (i.e. *x
is the borrower); the borrow scope is till the end of the main
block.
fn main() {
let a = Foo { f: box 0 };
let x = &mut Link { link: &a };
if false {
let b = Foo { f: box 1 };
// store_foo(x, &b);
}
}
{ a x { b * } }
resource a |___________| pass
resource b |___| fail
borrow scope |=========|
borrower *x |_________| x.link = &a
borrower *x |___| x.link = &b
If we uncomment the last function call store_foo(x, &b);
, the function will try to store &b
to x.link
, making resource b
another borrowed resource and failing the test of the borrow formula, since resource b
’s scope does not cover the whole borrow scope.
Multiple borrow scopes
It is possible to have multiple named borrow scopes in a function. For example:
fn superstore_foo<'a, 'b>(x: &mut Link<'a>, y: &'a Foo,
x2: &mut Link<'b>, y2: &'b Foo) {
x.link = y;
x2.link = y2;
}
In this (probably not very useful) function, two disjointed borrow scopes are involved. Each borrow scope would have its own borrow formula to satisfy.
Why lifetime is confusing
Lastly, I want to explain why I think the term lifetime used by Rust’s borrow system is confusing (and I thus avoid using it in this blog post).
When we talk about borrow, there are three different kinds of “lifetime” involved:
- A: the lifetime of the resource owner (or the owned/borrowed resource)
- B: the “lifetime” of the whole borrow, i.e. from the first borrow to the last return
- C: the lifetime of an individual borrower or borrowed pointer
When one says “lifetime”, it can refer to any of the above. If multiple resources and borrowers are involved, things get even more confusing. For example, what does a “named lifetime” refer to in the declaration of a function or struct? Does it mean A, B or C?
In our previous max
function:
fn max<'a>(x: &'a Foo, y: &'a Foo) -> &'a Foo {
if x.f > y.f { x } else { y }
}
What does lifetime 'a
mean here? It shouldn’t be A, because two resources are involved and they have different lifetimes. It cannot be C, because there are three borrowers: x
, y
and the function return value, and they all have different lifetimes, too. Does it mean B? Probably. But the whole borrow scope is not a concrete object, how can it have a “lifetime”? Calling it lifetime is just confusing.
Some may say it means the minimal lifetime requirements to the borrowed resources’ lifetimes. That makes sense in some way, but how can we call the minimal lifetime requirements “a lifetime”?
The ownership/borrow concept itself is already complicated. The confusion that the term “lifetime” brings makes learning the concept even more baffling, I would say.
P.S. Using the A, B and C defined above, the borrow formula becomes:
A >= B = C1 U C2 U … U Cn
Learning Rust is worth your time!
Although the borrow and ownership thing may take you a while to grok, it’s an interesting learn. Rust tries to achieve memory safety without GC, and it’s doing pretty well so far. Some people say learning Haskell changes the way you program. I think learning Rust is worth your time, too.
Hope this blog post provides a little help.