Monday, May 15, 2017

Emulating the Rust borrow checker with C++ part II: the borrowining

The most perceptive among you might have noticed that the last blog post did not actually do any borrowing, only single owner semantics with moves. Let's fix that. But first a caveat.

As far as I can tell, it is not possible to emulate Rust's borrow checker fully in compile time with C++. You can get pretty close (for some features at least) but there is some runtime overhead and violations are detected only at runtime, not compile time. See the end of this post for a description why that is. Still, it's better than nothing.

At the core is a resource whose ownership we want to track. We either want to allow either one accessor that can alter the object or many read-only accessors. We put the item inside a class that looks like this:

template<typename T>
class Owner final {
 private:
    T i;
    int roBorrows = 0;
    bool rwBorrow = false;
<rest of class omitted for brevity>

The holder object does not give out pointers or references to the underlying object. Instead you can only request either a read only or read-write (or mutable) reference to it. The read only function looks like this (the rw version is almost identical):

template<typename T>
RoBorrow<T> Owner<T>::borrowReadOnly() {
    if(rwBorrow) {
        throw std::runtime_error("Tried to create read only borrow when a rw borrow exists.");
    } else {
        return ::RoBorrow<T>(this); // increments borrow count
    }
}

Creating read only borrows only succeeds if there are no rw borrows. This code throws an exception but it could also call std::terminate or something else. A rw borrow would fail in a similar way if there are any ro borrows.

The interesting piece is the implementation of the proxy object RoBorrow<T>. It is the kind of move-only type that was described in part I. When it goes out of scope its destructor decrements the owner's borrow count:

~RoBorrow() { if (o) { o->decrementRoBorrows();} }

The magic lies in the conversion operator that is slightly different than the original version:

operator const T&() const { return owner->i; }

The conversion operator only gives out a const reference to the underlying object, which means only const operations can be invoked on it. Obviously this does not work for e.g. raw file descriptors because a "const int" does not protect the stat of the kernel fd object.  In order to call non-const functions you first need to get an RwBorrow whose conversion operator gives a constless reference, and Owner will only provide one if there are no other borrows outstanding.

When using this mechanism the following code fails (at runtime):

 auto b1 = owner.borrowReadOnly();
 auto b2 = owner.borrowReadWrite();
 
But this code works:

{
    auto b1 = owner.borrowReadOnly();
    auto b2 = owner.borrowReadOnly();
}
{
    auto b3 = owner.borrowReadWrite();
}
{
    auto b4 = owner.borrowReadOnly();
}

because the borrows go out of scope and are destroyed decrementing borrow counts to zero.

Why can't this be done at compile time?

I'm not a TMP specialist so maybe it can but as far as I know it is not possible. I'd love to be proven wrong on this one. The issue is due to limitations of constexpr which can be distilled into this dummy function:

template<typename T>
void constexpr foo(boolean b) {
    T x;
    constexpr int refcount = 1;
    if constexpr(b) {
        // do something
        --refcount;
    } else {
        // do something else
        --refcount;
    }
    static_assert(refcount == 0);
}

The static assert is obviously true no matter which code path is executed but the compiler can't prove that. First of all the if clause does not compile because its argument is not a compile-time constant. Second of all, the refcount decrements do not compile because constexpr variables can not be mutated during evaluation. You can create a new variable with the value x-1 but it would go out of scope when the subblocks end and there is no phi-node -like concept to get the value out to the outer scope. Finally, destructors can not be constexpr so even if we could mutate the count variable, it would not be done automatically (even though in theory the compiler has all the information it needs to do that).

2 comments:

  1. In Rust that's a RefCell. What you would use if you can't statically prove to the compiler that what you're doing is correct

    ReplyDelete
  2. Check https://www.reddit.com/r/rust/comments/6btzio/an_implementation_of_borrowck_in_c_by_foonathan/ for something that seems to be a compile-time implementation.

    ReplyDelete