Gabriel Dos Reis gave a presentation on C++ modules at CppCon. The video is available
here. I highly recommend that you watch it; it's good stuff.
However, as a build system developer, one thing caught my eye. The way modules are used (at around 40 minutes into the presentation) has a nasty quirk. The basic approach is that you have a source file foo.cxx, which defines a module Foobar. To compile it you would say this:
cl -c /module foo.cxx
This causes the compiler to output foo.obj as well as Foobar.ifc, which contains the binary definition of the module. To use the module you would compile a second source file like this:
cl -c baz.cpp /module:reference Foobar.ifc
This is basically the same way that Fortran does its modules, and it suffers from the same problem, one that makes life miserable for build system developers.
There are two main reasons. The first is that the name of the ifc file cannot be known beforehand without scanning the contents of the source files. The second is that you can't know what filename to give to the second command line without scanning baz.cpp to see which modules it imports _and_ scanning potentially every source file in your project to find out which file actually provides them.
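To make the scanning requirement concrete, here is a rough sketch of what a build system would have to do before it could even write the second command line. It assumes a simplified syntax where "module Name;" defines a module and "import Name;" uses one, and it uses regular expressions where a real tool would need a proper C++ parser (comments, macros and the preprocessor all defeat this approach). The function names are made up for illustration.

```python
import re

# Assumed, simplified syntax: "module Name;" defines a module, "import Name;" uses one.
MODULE_RE = re.compile(r"^\s*module\s+([\w.]+)\s*;", re.MULTILINE)
IMPORT_RE = re.compile(r"^\s*import\s+([\w.]+)\s*;", re.MULTILINE)

def find_providers(sources):
    """Map each module name to the file defining it. sources: {filename: text}."""
    providers = {}
    for filename, text in sources.items():
        for name in MODULE_RE.findall(text):
            providers[name] = filename
    return providers

def compile_args(filename, sources):
    """Return (files that must be compiled first, extra compiler arguments)."""
    providers = find_providers(sources)  # has to look at every file in the project
    needs, args = [], []
    for name in IMPORT_RE.findall(sources[filename]):
        needs.append(providers[name])    # raises KeyError if nothing provides the module
        args += ["/module:reference", name + ".ifc"]
    return needs, args
```

Note that compile_args cannot answer anything about baz.cpp without being handed the text of every other source file as well.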
Most modern build systems work in two phases. In the first phase you parse the build definition and determine which individual build steps to run and in what order. Basically this phase just serialises the dependency DAG to disk. The second phase loads the DAG, checks its status and takes all the steps necessary to bring the build up to date.
The first phase takes a lot more effort and is usually much slower than the second. A typical ratio for a medium-sized project is that the first phase takes roughly ten seconds of CPU time and the second takes a fraction of a second. In contemporary C++ almost all code changes only require rerunning the second phase, whereas changing the build configuration (adding new targets etc) requires running the first phase as well.
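The two-phase split can be sketched roughly as follows. This is a toy model, not how any particular build system is implemented: the JSON layout, the function names and the use of a plain dictionary of modification times are all assumptions made for illustration.

```python
import json
from graphlib import TopologicalSorter

def configure(targets, path):
    """Phase one (slow): resolve the build definition into a DAG and serialise it.

    targets: {output: {"inputs": [filenames], "command": [argv]}}
    """
    with open(path, "w") as f:
        json.dump(targets, f)

def build(path, mtimes):
    """Phase two (fast): load the DAG and return the commands to rerun, in order.

    mtimes: {filename: modification time}; a missing output counts as stale.
    """
    with open(path) as f:
        targets = json.load(f)
    # An output must be built after those of its inputs that are outputs themselves.
    graph = {out: [i for i in t["inputs"] if i in targets] for out, t in targets.items()}
    dirty, commands = set(), []
    for out in TopologicalSorter(graph).static_order():
        t = targets[out]
        out_time = mtimes.get(out)
        stale = out_time is None or any(
            i in dirty or mtimes.get(i, 0) > out_time for i in t["inputs"]
        )
        if stale:
            dirty.add(out)
            commands.append(t["command"])
    return commands
```

The point to notice is that build only compares timestamps. It never opens a source file, which is what keeps the second phase fast.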
This split works because output files and their names are fully knowable without looking at the contents of the source files. With the proposed scheme this is no longer the case. A simple (if slightly pathological) example should clarify the issue.
Suppose you have file A that defines a module and file B that uses it. You compile A first and then B. Now change the source code so that the module definition goes to B and A uses it. How would a build system know that it needs to compile B first and only then A?
The answer is that without scanning the contents of A and B before running the compiler this is impossible. This means that to get reliable builds either all build systems need to grow a full C++ parser or all C++ compilers must grow a full build system. Neither of these is particularly desirable. Even if build systems got these parsers, they would need to reparse the source of every changed file before starting the compiler, and they might then need to change the compiler arguments as well. This makes every rebuild take the slow path of phase one instead of the fast path of phase two.
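The A and B example can be played out in a few lines. The sketch below again assumes the simplified "module Name;" and "import Name;" syntax and a regex in place of a real parser; the file names A.cpp and B.cpp and the module name M are placeholders. It derives the compile order purely from file contents.

```python
import re
from graphlib import TopologicalSorter

# Assumed, simplified syntax; a real tool would need an actual C++ parser.
MODULE_RE = re.compile(r"^\s*module\s+([\w.]+)\s*;", re.MULTILINE)
IMPORT_RE = re.compile(r"^\s*import\s+([\w.]+)\s*;", re.MULTILINE)

def compile_order(sources):
    """Return filenames in an order where module providers come before their users."""
    providers = {}
    for filename, text in sources.items():
        for name in MODULE_RE.findall(text):
            providers[name] = filename
    # Each file depends on the files providing the modules it imports.
    graph = {
        filename: {providers[n] for n in IMPORT_RE.findall(text) if n in providers}
        for filename, text in sources.items()
    }
    return list(TopologicalSorter(graph).static_order())

before = {"A.cpp": "module M;\n", "B.cpp": "import M;\n"}
after = {"A.cpp": "import M;\n", "B.cpp": "module M;\n"}
print(compile_order(before))  # A.cpp first, then B.cpp
print(compile_order(after))   # B.cpp first, then A.cpp
```

The set of files and the build definition are identical in both cases. Only the contents changed, yet the required order flipped.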
Potential solutions
There are a few possible solutions to this problem, none of which is perfect. The first is to require that module Foo be defined in a source file Foo.cpp. This makes everything deterministic again but has that iffy Java feeling about it. The second option is to define the module in a "header-like" file rather than in source code. Thus a foo.h file would become foo.ifc and the compiler could pick that up automatically instead of the .h file.
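What both options buy you is that file names can be computed from other names alone, without opening any file. A minimal sketch of the two mappings, with hypothetical function names:

```python
from pathlib import PurePath

def names_by_module_convention(module):
    """Option one: module Foo must live in Foo.cpp, so both names follow from the module name."""
    return module + ".cpp", module + ".ifc"

def interface_for_header(header):
    """Option two: foo.h is compiled to foo.ifc, so the name follows from the file name."""
    return str(PurePath(header).with_suffix(".ifc"))
```

Neither function reads a file's contents, so the names can be resolved once in the first phase and stay valid until the build definition itself changes.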