This blog post talked about the "self written C++ standard library" I wrote for the fun of it (code here).
The post got linked by Hackernews and Reddit. As is usual the majority of comments did not talk about the actual content but instead were focused on two tangential things. The first one being "this is not a full implementation of the C++ standard library as specified by the ISO standard, therefore the author is an idiot". I am, in actual fact, an idiot, but not due to project scope but because I assumed people on the Internet to have elementary reading comprehension skills. To make things clear: no, this is not an implementation of the ISO standard library. At no point was such a thing claimed. There is little point in writing one of those, there are several high quality implementations available. "Standard library" in this context means "a collection of low level functions and types that would be needed by most applications".
The second discussion was around the fact that calling the C++ standard library "STL" was both wrong and proof that the person saying that does not understand the C++ language. This was followed by several "I am a C++ standard library implementer and everyone I know calls it the STL". Things deteriorated from there.
The complexity question
Existing container implementations are complex by necessity. They need to handle things like types that can not be noexcept moved or copied. The amount of rollback code needed explodes very quickly and needs to be processed every time the header is included (because of templates). A reasonable point against writing your own containers is that eventually you need to support all the same complexity as existing ones because people will insert "bad" types into your container and complain when things fail. Thus you need to have all the same nasty, complex and brittle code as an STL implementation, only lacking decades of hardening to shake out all the bugs.
That is true, but the beauty of having zero existing users is that you can do something like this:
This is basically a concept that requires the type given to be noexcept-movable (and some other things that could arguably be removed). Now, instead of allowing any type under the sun, all containers require all types they get instantiated with to be WellBehaved. This means that the library does not have to care at all about types behaving badly because trying to use those results in a compiler error. A massive amount of complexity is thus eliminated in a single stroke.
There are of course cases when you need to deal with types that are not well behaved. If you can tolerate a memory allocation per object, the simple solution is to use unique_ptr. OTOH if you have types that can't be reliably moved and which are so performance critical that you can't afford memory allocations, then you are probably going to write your own container code anyway. If you are in the position that you have badly behaved types that you can't update, can't tolerate an allocation and can't write your own containers, then that is your problem.
Iterating things the native way
One of the most common things I do with strings in Python is to split them on whitespace. In Python this is simple as there is only one string type, but in native code things are more complicated. What should the return type of mystring.split() be?
- vector<string>
- vector<string_view>
- A coroutine handle
- The method should be a full template so you can customize it to output any custom string type (as every C++ code base has at least three of them)
- Something ranges something something
So given that you have a callback function like this:
the split function becomes:
This is as fully general as you can get without needing to muck about with templates, coroutines or monads (I don't actually know what monads are, I'm just counting on the fact that mentioning them makes you look cool). The cost is that the end user needs to write a small adapter lambda to use it.
Iterating things the Python way
The Python iteration protocol is surprisingly simple. The object being iterated needs to provide a .next() method that either returns the next value or throws a StopIteration exception. Obviously exceptions can't be used in native code because they are too slow, but the same effect can be achieved by returning an optional<T> instead. Here's a unit test for an implementation of Python's range function.
Sadly there is not a simple way to integrate this to native loop constructs without macros and even then it is a bit ugly.
Current status
The repo has basic functionality for strings (regular and enforced UTF-8), regexes and basic containers.
Building the entire project has 8 individual compilation and linking steps and takes 0.8 seconds when using a single core on this laptop computer. Thus compiling a single source file with optimizations enabled takes ~ 0.1 seconds. Which is nice.
The only cheat is that pystd uses ctre for regexes and it is precompiled.