Format API improvements

It’s been a while since my last post. Two important things have happened to the fmt project in the meantime: revision 0 of the formatting paper has been submitted to the standards committee, and version 4 of the library has been released (thanks to Jonathan Müller for putting the new release together). In this post I will describe the recent work in the std branch of the library, which implements the standards proposal.

string_view support

Formatting functions now take the format string as a string_view instead of a null-terminated string (which used to be wrapped in the now-extinct cstring_view).
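To illustrate what this buys the caller, here is a minimal sketch rather than the library's actual signatures. The helper count_fields is hypothetical, and C++17's std::string_view stands in for the library's own basic_string_view. Because the function receives a pointer and a size, it accepts literals, std::string objects and substrings alike, none of which need a null terminator.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper, not part of fmt: counts "{}" replacement fields.
// The view carries its own size, so the input does not have to be
// null-terminated.
std::size_t count_fields(std::string_view format_str) {
  std::size_t count = 0;
  for (std::size_t i = 0; i + 1 < format_str.size(); ++i) {
    if (format_str[i] == '{' && format_str[i + 1] == '}') {
      ++count;
      ++i;  // skip the closing brace
    }
  }
  return count;
}
```

Calling count_fields("{} + {}") yields 2, and passing std::string_view("{}{}{}").substr(0, 4) also yields 2, although that substring is not followed by a null character.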

This was one of the most frequent pieces of feedback on the proposal; see, for example, this discussion on std-proposals. Surprisingly, no one ever mentioned it as a library feature. For compatibility reasons the library provides its own implementation of basic_string_view, which will eventually be replaced with the standard one.

Although the switch to string_view seemed simple, it showed that parsing is way easier with a sentinel (duh). I ended up introducing a new iterator type that simulates having a sentinel at the end of the input.
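The idea can be sketched as follows. This is an illustration under my own assumptions, not the library's iterator: the names sentinel_iterator and parse_uint are hypothetical. Dereferencing at the end of the view yields '\0' instead of reading past the input, so parsing loops can be written as if the string were null-terminated.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical iterator: at the end of the view it reports '\0'
// instead of reading memory outside the view.
class sentinel_iterator {
 public:
  explicit sentinel_iterator(std::string_view s)
      : ptr_(s.data()), end_(s.data() + s.size()) {}

  char operator*() const { return ptr_ != end_ ? *ptr_ : '\0'; }

  sentinel_iterator& operator++() {
    if (ptr_ != end_) ++ptr_;  // never advance past the end
    return *this;
  }

 private:
  const char* ptr_;
  const char* end_;
};

// Parses a nonnegative decimal number such as a width. The simulated
// sentinel stops the loop, so no explicit end-of-input check is needed.
// Overflow handling is omitted to keep the sketch short.
unsigned parse_uint(sentinel_iterator& it) {
  unsigned value = 0;
  while (*it >= '0' && *it <= '9') {
    value = value * 10 + static_cast<unsigned>(*it - '0');
    ++it;
  }
  return value;
}
```

For a view over the first three characters of "1234567", parse_uint returns 123 and the iterator then reads '\0', even though the next character in memory is '4'.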

This change also added a small overhead at the call site, because the size is now passed in addition to the string pointer, but the performance impact was negligible according to preliminary benchmarks.

Separation of parsing and formatting

A more important change, and one that required a lot of refactoring, was the separation of parsing and formatting in the extension API. It is based in part on one of the many suggestions made by Bengt Gustafsson on the std-proposals mailing list and is related to the idea of precompiling a format string, which is hardly new.

First, what is the extension API I’m talking about? It’s the interface that allows you to add formatting support for your types or, rather, objects of these types. For example, in iostreams you define an operator<< for this purpose.
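For reference, the iostreams version looks roughly like this. The point type and the point_to_string helper are hypothetical and exist only for illustration.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical user-defined type used only for illustration.
struct point {
  int x;
  int y;
};

// The iostreams extension point: an overload of operator<<.
std::ostream& operator<<(std::ostream& os, const point& p) {
  return os << '(' << p.x << ", " << p.y << ')';
}

// Small helper that formats a point through the overload above.
std::string point_to_string(const point& p) {
  std::ostringstream os;
  os << p;
  return os.str();
}
```

Note that this interface only covers output: there is no place to interpret per-argument format specifiers.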

In addition to formatting, the fmt library gives you control over the parsing of format specs. In the current proposal this is done via the format_value extension point, which receives an output buffer, buf, and a formatting context, ctx. The context provides access to the current position in the format string and to the other arguments; the latter is necessary for implementing features like dynamic width and precision.
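A self-contained sketch of this style of extension point might look as follows. Everything except the name format_value and the roles of buf and ctx is an assumption: buffer and context are simplified stand-ins for the library's types, the parameter order is a guess, and the 'p' (parenthesize) spec is invented for the example.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Simplified stand-ins for the library's types (assumptions).
using buffer = std::string;

struct context {
  std::string_view format_str;  // the whole format string
  std::size_t pos;              // current position in format_str
};

// Hypothetical user-defined type.
struct point {
  int x;
  int y;
};

// One function does both jobs: it parses the specs and writes the output.
void format_value(buffer& buf, const point& p, context& ctx) {
  // Parsing: consume specs up to the closing brace.
  bool parens = false;
  while (ctx.pos < ctx.format_str.size() && ctx.format_str[ctx.pos] != '}') {
    if (ctx.format_str[ctx.pos] == 'p') parens = true;  // invented spec
    ++ctx.pos;
  }
  // Formatting: write the value according to the parsed specs.
  if (parens) buf += '(';
  buf += std::to_string(p.x);
  buf += ", ";
  buf += std::to_string(p.y);
  if (parens) buf += ')';
}
```

With the context positioned at the start of "p}", formatting point{1, 2} appends "(1, 2)" to the buffer and leaves the position at the closing brace.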

While simple, this approach mixes parsing and formatting, which is not always desirable. The new approach separates the two concerns and replaces format_value with the formatter struct as the extension point.
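The shape of such an extension point can be sketched like this. It is a toy model rather than the std branch's real interface: the parse and format signatures, the point type and the 'p' spec are all assumptions made for the example. What matters is that parse stores its result in the formatter object and format only consumes that state.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical user-defined type.
struct point {
  int x;
  int y;
};

// Toy primary template; specialized per formatted type.
template <typename T>
struct formatter;

template <>
struct formatter<point> {
  bool parens = false;  // state produced by parse, consumed by format

  // Parsing only: reads specs up to '}' and returns the number of
  // characters consumed. The 'p' spec is invented for this example.
  std::size_t parse(std::string_view specs) {
    std::size_t pos = 0;
    while (pos < specs.size() && specs[pos] != '}') {
      if (specs[pos] == 'p') parens = true;
      ++pos;
    }
    return pos;
  }

  // Formatting only: never looks at the format string.
  void format(std::string& buf, const point& p) const {
    if (parens) buf += '(';
    buf += std::to_string(p.x);
    buf += ", ";
    buf += std::to_string(p.y);
    if (parens) buf += ')';
  }
};
```

After a single call to parse("p}"), the same formatter object can format any number of points without touching the format string again.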

This is still experimental and will likely require more work, but it already enables interesting use cases that weren’t possible before, such as precompilation of a format string when formatting a sequence of values.
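A sequence formatter along these lines could be sketched as follows. Again this is a toy model with assumed signatures, not the library's code: the element formatter understands only an optional minimum width, and the vector formatter derives from it.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Toy primary template; specialized per formatted type.
template <typename T>
struct formatter;

// Toy element formatter: understands an optional minimum width.
template <>
struct formatter<int> {
  std::size_t width = 0;

  // Reads leading digits as the width; returns characters consumed.
  std::size_t parse(std::string_view specs) {
    std::size_t pos = 0;
    width = 0;
    while (pos < specs.size() && specs[pos] >= '0' && specs[pos] <= '9') {
      width = width * 10 + static_cast<std::size_t>(specs[pos] - '0');
      ++pos;
    }
    return pos;
  }

  // Right-aligns the value in a field of at least `width` characters.
  void format(std::string& buf, int value) const {
    std::string s = std::to_string(value);
    if (s.size() < width) buf.append(width - s.size(), ' ');
    buf += s;
  }
};

// Sequence formatter: reuses the parsed element specs for every element.
template <typename T>
struct formatter<std::vector<T>> : formatter<T> {
  void format(std::string& buf, const std::vector<T>& values) const {
    buf += '[';
    for (std::size_t i = 0; i < values.size(); ++i) {
      if (i != 0) buf += ", ";
      formatter<T>::format(buf, values[i]);  // uses the stored specs
    }
    buf += ']';
  }
};
```

With the specs "3}" parsed once, formatting the vector {1, 22, 333} produces "[  1,  22, 333]".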

Note that there is no parse method in the above example because it’s inherited from the base class, formatter<T>. Moreover, the format specifiers are parsed once and reused when formatting each element.

Moving parsing to a separate method also opens the possibility of compile-time format string checks, but that’s a topic for another blog post.