Sunday, March 4, 2018

Compiling Cargo crates natively with Meson

Recently we have been having discussions about how Rust and Meson should work together, especially for mixed language projects. One thing that multiple people have told me (over a time span of several years, actually) is that Rust is Special in that everyone uses crates for everything. Thus there is no point in having any sort of Rust support; the only true way is to blindly call Cargo and have it do everything exactly the way it wants to.

This seems like a reasonable recommendation so I did what every reasonable person would do and accepted this as is.

David Addison wearing cardboard x-ray goggles with caption "This is me being completely reasonable".

But then curiosity takes hold of you and you start to wonder. Is that really the case?

Converting Cargo manifests to Meson projects

The basic setup of a Cargo project is fairly straightforward. The file format is mostly declarative and most crates are simple libraries that have a few source files and a bunch of other crates they link against. Meson has the same primitives, so could they be automatically converted?

It turns out that for simple examples they can. Here is a sample repo that downloads the itoa crate from GitHub, converts the Cargo build definition into a Meson project and builds it as a subproject (with Meson, not Cargo) of the main project that uses it. This prototype turned out to require 71 lines of Python.

What about dependencies other than itoa?

The script currently only works for itoa, because crates.io does not seem to provide a web API for queries and the entire site is created with JavaScript, so you can't even do web scraping easily. To get this working properly, the only thing you'd need is a function to get from (crate name, version) to the git repo.
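Commenters below point to crates.io endpoints of the form `https://crates.io/api/v1/crates/{name}` (JSON metadata) and `https://crates.io/api/v1/crates/{name}/{version}/download` (the crate archive). Assuming those URL patterns are accurate, the lookup could be sketched roughly as follows. Note that this maps to the published archive rather than the git repo, and that it only builds URLs without making any network request. The function names are hypothetical.

```python
from urllib.parse import quote

# Base URL taken from the endpoints mentioned in the comments (an assumption).
CRATES_IO_API = "https://crates.io/api/v1/crates"


def crate_metadata_url(name: str) -> str:
    """URL that is expected to return JSON metadata for a crate."""
    return f"{CRATES_IO_API}/{quote(name, safe='')}"


def crate_download_url(name: str, version: str) -> str:
    """URL that is expected to return the crate archive for one exact version."""
    return f"{CRATES_IO_API}/{quote(name, safe='')}/{quote(version, safe='')}/download"


if __name__ == "__main__":
    print(crate_metadata_url("itoa"))
    # The version string here is only an illustration.
    print(crate_download_url("itoa", "0.3.4"))
```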

What about dependencies of dependencies?

They can be easily grabbed from the Cargo.toml file. Combined with the above, they could be downloaded and converted in the same fashion.

What doesn't work?

A lot. Unit tests are neither built nor run, but they could be added fairly easily. This would require adding compile options so the actual source could be built with the unittest flags. Meson already supports a similar feature for D, so adding it should not be a huge amount of work. Similarly, docs are not generated.

What about build.rs?

Cargo provides a fairly simple project model and everything more complex should be handled by writing a build.rs program that does everything else necessary. This suffers from the same disadvantages as every Turing complete build system ever has, and these scripts are not in general possible to convert automatically.

However, based on documentation, the common case seems to be to call into external build tools to build dependency libraries in other languages. In a build system that builds both parts at the same time it would be possible to create a better UX for this (but again, this would obviously not be something you can convert automatically).

Could this actually work in practice with real world projects?

It might. It might not. Ignoring the previous segment, no immediate showstopper has presented itself thus far. It might in the future. But you never know until you try.

5 comments:

  1. I'm not sure, but for crates.io API did you mean this https://crates.io/api/v1/crates ?

  2. After checking, it's https://crates.io/api/v1/crates/{name}/{version}/download

  3. Thanks, but that seems to download some gzipped binary blob. You can use it to extract the needed information but an API that just gives out the metadata as a json object would be more useful.

  4. If you're looking for raw JSON data about a package, you want the endpoint https://crates.io/api/v1/crates/iota. Then I think the "dl_path" variable in there gives you a way to download the actual .crate sources.

    If you do this instead of the repository, this looks like it could be a pretty cool integration! I'd download the sources from crates.io instead because the github repository likely contains additional updates, and it isn't required to be specified.

    re: Rust is Special: I definitely think that we could benefit from more integration and not just blindly calling cargo! However there is a lot of stuff cargo lets us do, and duplicating that is a lot of effort. If you're willing to take that on, awesome!

    I'd definitely recommend running build scripts if you can as part of the build step. They're allowed to do a lot of different stuff, and most of the time they're essential to the library. If a library uses build.rs for code generation, then that code just won't be there to compile if you don't run build.rs.

    Interpreting / translating build.rs would mean building a full rust interpreter, which is... hard. I mean build.rs scripts can also depend on full crates from crates.io, which can depend on other crates, etc. It seems much less effort to actually run it than to try to find some way to *not* run it but still get the same result.

    Replies
    1. > If you're looking for raw JSON data about a package, you want the endpoint https://crates.io/api/v1/crates/iota.

      That seems to have all the info I need, thanks. But for some reason it does not have the data for the "itoa" crate even though it does for a bunch of others I tried.

      > re: Rust is Special: I definitely think that we could benefit from more integration and not just blindly calling cargo! However there is a lot of stuff cargo lets us do, and duplicating that is a lot of effort. If you're willing to take that on, awesome!

      I am a build system developer. This is the sort of thing I'm _supposed_ to do.

      > I'd definitely recommend running build scripts if you can as part of the build step. They're allowed to do a lot of different stuff, and most of the time they're essential to the library. If a library uses build.rs for code generation, then that code just won't be there to compile if you don't run build.rs.

      I don't have first hand usage experience with crates, but based on documentation and a few projects I skimmed, the most common reasons for using build.rs are compiling C libraries and source generation. Cargo's design seems to have been to not cater for these use cases but instead have each project reimplement them on their own.

      Meson's design is the exact opposite. Building dependencies in any supported language and source generation are both operations that we support fully as first class citizens.

      > It seems much less effort to actually run it than to try to find some way to *not* run it but still get the same result.

      Well yes and no. If the assumption above is true (which, again, I can't say from personal experience) then in Meson the entire script would be unnecessary because we provide builtin functionality for the most common operations.

      Because of Turing completeness and all that, converting build.rs scripts into Meson form is not automatically and reliably doable. However, if Cargo provided primitives that behave the same, then the conversion would be straightforward.
