Fast integer to string conversion in C++

An updated version of this post is available here: Converting a hundred million integers to strings per second.

In this post I compare the performance of several methods of integer to string conversion in C++:

To measure the performance I used a benchmark from Boost Karma. This benchmark generates 10,000,000 random integers and converts them to strings using different methods measuring conversion time. I’ve replaced nonportable itoa with sprintf and added several other methods.

Apart from adding new conversion methods, I’ve also noticed that the benchmark used unnecessary conversion to std::string in some tests to compensate for string operations in the other. To get more useful results, I’ve split every such test in two, one that does conversion to std::string and one that doesn’t. Tests that do unnecessary conversion to std::string have suffix +std::string. They are suboptimal, but I’ve included them for completeness.

Here are the results ordered by the time it took a method to convert 10,000,000 integers to strings (obviously smaller is better); time ratio is the ratio of conversion time to the best time:

I consider these results pretty exciting. First they show that fmt::format_int is the fastest of the tested methods, about 24% faster than cppx::decimal_from, the next contender, and whopping 30x (not 30%) faster than boost::format. Here’s the code to convert an integer to a string with fmt::format_int:

fmt::format_int f(42);
// The result can be converted to std::string using f.str() or
// accessed as a C string using f.c_str().
auto s = f.c_str(); // s == "42"

Note that fmt::format_int automatically manages the output buffer unlike sprintf and karma::generate which require an error-prone manual memory management. In the case of karma::generate you can probably use an output iterator such as back_insert_iterator for automatic memory management but the performance will likely suffer.

Another remarkable and surprising (to me) observation is that sprintf is not particularly good for integer formatting. It is more than 6 times slower than fmt::format_int. One possible reason for this is that sprintf parses the format string, but so do fmt::format and fmt::format_to which are 1.8 - 2.6 times faster than sprintf. The good thing is that you don’t have to use sprintf in an attempt to sacrifice safety for performance. There are much faster or at least equally slow but safer methods.

One recent addition to the benchmark is fmt::compile which does constexpr format string compilation. As can be seen from the results fmt::compile + fmt::format_to are almost as fast as an artisanal integer-to-string converter optimized by hand (cppx::decimal_from). Thanks Louis Dionne for the idea and Hana Dusíková for the proof-of-concept implementation of format string compilation.

The benchmark results were obtained on macOS Mojave with Apple LLVM version 10.0.1 (clang-1001.0.46.4) and the following compiler flags: -O3 -DNDEBUG.

Running the benchmark:

$ git clone --recursive https://github.com/fmtlib/format-benchmark.git
$ cd format-benchmark
$ cmake .
$ make
$ ./int-generator-test.py

You can find out more about fmt::format_int and fmt::format in the {fmt} library repository on GitHub and in the documentation.

Update: Since I don’t have ltoa on my platform, I’ve added a basic public-domain implementation of this function from here. Let me know in the comment section if there is a better version available somewhere.

Update 2: Added decimal_from function suggested by Alf P. Steinbach. As sprintf and ltoa it requires a user-provided buffer.

Update 3: Inspired by a lesson learned from Alexandrescu’s talk that “no work is less work than some work” I’ve come up with a faster method of integer to string conversion. Unlike other methods it does one pass over the digits. All other methods I know do two passes and can be divided into two categories:

  1. Count digits (pass 1), then convert digits to chars writing from the end of the buffer (pass 2).
  2. Convert digits to chars writing from the beginning of the buffer (pass 1). Reverse the string in the buffer (pass 2).

Instead of doing this, I just convert digits to chars writing from the end ofi the buffer and return the pointer to the start of the converted string. In most cases there is some space left in the beginning of the buffer, but that’s fine because the same is true for the second category of methods above, they just have this space at the end of the buffer. This avoids unnecessary copying within a buffer that is often discarded anyway.

I’ve implemented this method in the fmt::format_int class available in the {fmt} library.

Update 4:

Added side effects to make sure that the code being tested is not optimized away by a super clever compiler (I wish there existed one). This is implemented by computing a sum of lengths of all formatted strings using strlen. The strlen function is used even in cases where std::string::size could be used to make sure the same extra computation is done for all methods. Note that since this adds a more or less constant factor to all the methods, high performers are penalized more.

Update 5:

Fixed links to the {fmt} library.

Update 6 (25 Nov 2019):

Added fmt::compile which does constexpr format string compilation and updated the test results.