In one of the previous posts I introduced C++ Format, a new formatting library for C++, and briefly described its API with a few examples. In this post I’ll compare its performance with that of other libraries and discuss some design aspects that make it fast.
To measure performance I used a benchmark by Chris Foster, the author of the tinyformat library. I changed the benchmark to include my new formatting library and ran it three times, taking the best time for each method:
The instructions on running the tests are available here so you can easily reproduce the results.
As you can see, the C++ Format library is only slightly slower than printf, and
both methods are considerably faster than the alternatives. Boost Format
is exceptionally slow, almost 8 times slower than the C++ Format library.
Poor performance of Boost Format is also confirmed
here, see section
Now that we see that the C++ Format library is almost as fast as plain old
printf, let’s see how it is achieved without compromising safety
One of the main things that affects performance is dynamic memory allocation. As shown in this article, Boost Format makes lots of allocations (from 16 to 41 depending on the compiler) for a simple formatting operation. The FastFormat library does from 1 to 3 allocations for the same example. The C++ Format library can avoid any allocation if the number of format arguments is less than or equal to 10 and the output fits into a 500-character buffer. These limits are tentative and can be easily adjusted if necessary.
To avoid memory allocation, all the formatting arguments must be provided at once. Consider the following example:
As you can see, Boost Format allows passing arguments in different statements.
I think that the combination of this feature with positional arguments support
is the main reason why Boost Format is so slow. Notice that we pass a
std::string as the first argument. This temporary is destroyed
at the end of the full expression in which it appears. Boost Format has two
options: either save the argument and format it later or format it immediately
according to all the specifications for this argument in the format string
and store the output somewhere. Both alternatives are very inefficient.
This particular example may look artificial because there is no need to construct a temporary string here, but a temporary string may be returned by some function instead.
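To make the cost concrete, here is a minimal sketch of a formatter that accepts arguments in separate statements. `DeferredFormatter` is a hypothetical class written for illustration, not Boost Format's implementation; it assumes string arguments only and single-digit positions such as `%1%`. Because an argument may be a temporary that is gone by the next statement, the sketch has to keep its own copy of every argument, and each copy may allocate.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical class for illustration; not Boost Format's implementation.
class DeferredFormatter {
 public:
  explicit DeferredFormatter(const std::string &format) : format_(format) {}

  // The argument may be a temporary that dies at the end of the calling
  // statement, so a copy is saved. Every copy may allocate memory.
  DeferredFormatter &operator%(const std::string &arg) {
    saved_.push_back(arg);
    return *this;
  }

  // Replaces %1% ... %9% with the saved copies; other text is copied as is.
  std::string str() const {
    std::string out;
    for (std::size_t i = 0; i < format_.size(); ++i) {
      if (format_[i] == '%' && i + 2 < format_.size() &&
          format_[i + 1] >= '1' && format_[i + 1] <= '9' &&
          format_[i + 2] == '%') {
        std::size_t index = static_cast<std::size_t>(format_[i + 1] - '1');
        if (index < saved_.size()) {
          out += saved_[index];
          i += 2;  // skip the digit and the closing '%'
          continue;
        }
      }
      out += format_[i];
    }
    return out;
  }

 private:
  std::string format_;
  std::vector<std::string> saved_;
};
```

With this sketch, `DeferredFormatter f("%2% and %1%"); f % std::string("first"); f % std::string("second");` works even though both temporaries are destroyed before `f.str()` is called, but only because each one was copied.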
Printf, on the other hand, takes all formatting arguments at once, but it
only supports built-in types:
Printf has the advantage that it doesn’t have to worry about the lifetime
of the arguments, so there is no need to copy them.
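As a small illustration of both points, the following sketch passes every argument in a single `snprintf` call. The function name `DescribeItem` and the buffer size are assumptions made for this example. Note that the `std::string` cannot be passed directly and has to be converted to a C string first.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper for illustration.
inline std::string DescribeItem(const std::string &name, int count,
                                double price) {
  char buffer[128];
  // All arguments are supplied in one call, so nothing has to be saved.
  // std::string is not a built-in type, hence the c_str() conversion.
  int n = std::snprintf(buffer, sizeof(buffer), "%s: %d at %.2f",
                        name.c_str(), count, price);
  if (n < 0) return std::string();  // formatting error
  std::size_t length = static_cast<std::size_t>(n);
  if (length >= sizeof(buffer)) length = sizeof(buffer) - 1;  // truncated
  return std::string(buffer, length);
}
```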
So it is clear that for the formatting to be efficient all the arguments
should be supplied at once. But how can it be done if the arguments are
passed using an overloaded operator? I’ve found the solution to this problem
in Clang’s source code, namely
The main idea is that a formatting function returns a temporary object that
accepts arguments through an overloaded operator<<, and the formatting is
performed in the destructor of this temporary object.
Here is an annotated example that uses the C++ Format library:
In this case the formatting happens before the temporary string is destroyed. (There is one more detail that makes it possible, but I’ll discuss it in a different post.) So there is no need to copy arguments; it is enough to store references or pointers to them and use them when necessary, avoiding dynamic memory allocations. This is the main thing that makes format so fast.
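The idea can be sketched in a few dozen lines of C++11. Everything here is hypothetical and simplified: `Collector` and `Arg` are names invented for this example, only `std::string` arguments and `{}` placeholders are handled, and the output is appended to a caller-supplied string. The sketch also makes one assumption of its own to stay safe regardless of the order in which temporaries are created: formatting is triggered by whichever temporary is destroyed first, either the collector or one of the small `Arg` wrappers. A wrapper is always created after the string it refers to, so it is destroyed before that string.

```cpp
#include <cstddef>
#include <string>

class Collector;  // hypothetical name, not part of the C++ Format API

// Refers to one argument without copying it.
class Arg {
 public:
  Arg(const std::string &s) : value(&s), owner(nullptr) {}  // implicit
  ~Arg();
  const std::string *value;
  mutable Collector *owner;  // set when the argument is passed in
};

class Collector {
 public:
  Collector(const char *format, std::string &out)
      : format_(format), out_(out), count_(0), done_(false) {}
  Collector(const Collector &) = delete;
  Collector &operator=(const Collector &) = delete;
  ~Collector() { Finish(); }

  // Stores a pointer to the argument; no copy, no allocation.
  Collector &operator<<(const Arg &arg) {
    if (count_ < kMaxArgs) {
      arg.owner = this;
      args_[count_++] = &arg;
    }
    return *this;
  }

  // Formats once, while every referenced argument is still alive.
  void Finish() {
    if (done_) return;
    done_ = true;
    std::size_t next = 0;
    for (const char *p = format_; *p; ++p) {
      if (p[0] == '{' && p[1] == '}' && next < count_) {
        out_ += *args_[next++]->value;
        ++p;  // skip the closing brace
      } else {
        out_ += *p;
      }
    }
    // Detach the wrappers so their destructors do nothing.
    for (std::size_t i = 0; i < count_; ++i) args_[i]->owner = nullptr;
  }

 private:
  enum { kMaxArgs = 10 };
  const char *format_;
  std::string &out_;
  const Arg *args_[kMaxArgs];
  std::size_t count_;
  bool done_;
};

inline Arg::~Arg() {
  if (owner) owner->Finish();
}
```

A call such as `Collector("{} = {}", out) << std::string("answer") << std::to_string(42);` appends `answer = 42` to `out` without copying either temporary string. The only storage used for the arguments is the fixed array of pointers inside the collector.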
Another thing that allows C++ Format to avoid dynamic allocations in many cases is
a special array data structure optimized for small sizes. If the number of
elements is smaller than some number defined at compile time, this data
structure stores them in a fixed-size array in the object itself.
It uses dynamic allocation for larger sizes and can grow as
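A minimal sketch of such a structure is shown below. `SmallArray` is a hypothetical name, and the sketch assumes a default-constructible, copyable element type and a nonzero inline capacity `N`; the library's actual class may differ in every detail. Elements live in the inline array until it is full, and only then does the container move to the heap, doubling its capacity on each growth.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical sketch; N must be greater than zero.
template <typename T, std::size_t N>
class SmallArray {
 public:
  SmallArray() : data_(inline_), size_(0), capacity_(N) {}
  SmallArray(const SmallArray &) = delete;
  SmallArray &operator=(const SmallArray &) = delete;
  ~SmallArray() {
    if (data_ != inline_) delete[] data_;
  }

  void push_back(const T &value) {
    if (size_ == capacity_) Grow();
    data_[size_++] = value;
  }

  std::size_t size() const { return size_; }
  const T &operator[](std::size_t i) const { return data_[i]; }

  // True once the elements no longer fit in the object itself.
  bool on_heap() const { return data_ != inline_; }

 private:
  void Grow() {
    std::size_t new_capacity = capacity_ * 2;
    T *new_data = new T[new_capacity];
    std::copy(data_, data_ + size_, new_data);
    if (data_ != inline_) delete[] data_;
    data_ = new_data;
    capacity_ = new_capacity;
  }

  T inline_[N];  // storage used while size() <= N
  T *data_;      // points at inline_ or at a heap block
  std::size_t size_;
  std::size_t capacity_;
};
```

For example, a `SmallArray<char, 500>` used as an output buffer performs no allocation at all as long as the formatted text is at most 500 characters long.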
The rest is just careful implementation and avoiding unnecessary work.
For example, my initial implementation used
snprintf for all built-in
types which was inefficient because it required constructing a new format
string for each argument that had to be parsed by
snprintf. This was
a lot of extra work, so the new implementation formats integers, strings
and characters itself and only uses
snprintf to format floating-point
numbers. It might be possible to get an additional improvement by using the
dtoa function written by my colleague
David Gay instead of snprintf.
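To show what formatting an integer without snprintf can look like, here is a minimal sketch. `AppendInt` is a hypothetical helper written for this example and is not the library's FormatInt; it handles only decimal output of `int`. Digits are written backwards into a small stack buffer, so no format string has to be built or parsed and no memory is allocated for the conversion itself.

```cpp
#include <cstddef>
#include <limits>
#include <string>

// Hypothetical helper: appends the decimal representation of value to out.
inline void AppendInt(std::string &out, int value) {
  // Room for every digit of an int plus a sign.
  char buffer[std::numeric_limits<int>::digits10 + 3];
  char *end = buffer + sizeof(buffer);
  char *p = end;
  // Work on the unsigned magnitude so the most negative value is handled.
  unsigned int abs_value = static_cast<unsigned int>(value);
  if (value < 0) abs_value = 0u - abs_value;
  do {
    *--p = static_cast<char>('0' + abs_value % 10);
    abs_value /= 10;
  } while (abs_value != 0);
  if (value < 0) *--p = '-';
  out.append(p, static_cast<std::size_t>(end - p));
}
```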
I profiled the speed test executable used in the benchmark with gperftools, and here are the results:
The second column gives the percentage of profiling samples in this function and
the fifth gives the percentage of profiling samples in this function and its
callees. As can be seen from the profile, about 70% of samples point to
FormatInt which does integer
and pointer formatting looks pretty efficient.
DoFormat does most of the
job parsing the format string, copying the literal parts and formatting
strings and characters.
When I started the project I was not sure whether it was possible to implement a
formatting library as fast as
printf without compromising on type safety,
support for user-defined types and positional arguments. It turned out that it
is possible with the right design and careful implementation.