au icon indicating copy to clipboard operation
au copied to clipboard

Add support for fmtlib

Open chiphogg opened this issue 2 years ago • 2 comments

We have a formatter internal to Aurora, which was pretty easy to write. However, I don't think it was that smart about taking the unit label length into account. Maybe we can write some test cases for an improved implementation.

chiphogg avatar Jul 12 '23 02:07 chiphogg

Any chance of this being ported to the public library?

catskul avatar Mar 05 '25 15:03 catskul

Yes, this issue tracker is only for things that we intend for the public library. 🙂 We have another internal tracker for Aurora-internal issues.

It's a very busy time at Aurora right now with our product launch imminent, so we can't work on Au much at all right now. That said, I think we could fit it in for the next release we do get a chance to do, so I'll bump it up to the 0.4.2 milestone. Thanks for asking!

chiphogg avatar Mar 05 '25 16:03 chiphogg

Update: a flurry of activity, with a frustrating result. I've already spent at least 3x as much time as I expected when I put this on the 0.4.2 (now 0.5.0) milestone. We even got to the point of landing what we thought was a working implementation! But now I don't even know if this is feasible. At the very least, I'll probably revert what we have for the 0.5.0 release, and hope that we can figure something out in time for 0.6.0.

The tricky part is that we need to maintain Au's zero-dependency status, and we need a solution that works for a wide variety of {fmt} versions, as well as std::format. Those are some wickedly hard constraints, as far as I can (now) tell.

It's not all bad news. I feel pretty satisfied with the format string syntax we came up with (see #491). And I have a much better sense of the shape of the problem and its constraints.

I think I'll write up a detailed retrospective on what happened. The main audience is "future me, trying to solve this for 0.6.0", but it may be of interest to others yearning for this feature. Maybe I'll follow up by taking stock of where we're at.

Strategies and attempts

Initial idea: forward declare and specialize

If this had worked, it would have been the best end user experience: anyone who uses fmt::format or std::format would get Au's format string syntax for free, without any extra work. The idea was to forward declare the fmt::formatter class, and provide a specialization for au::Quantity<U, R>.

Note how this avoids the physical dependency on {fmt}. Without {fmt}, it'd just be some random template specialization. But as soon as a user includes "fmt/format.h", the specialization "comes to life", and they automatically get Au's format string syntax.

The problem is that it's literally impossible to forward declare fmt::formatter, because it's not really fmt::formatter. It's fmt::v11:formatter in today's version of the library, where v11 is an inline namespace. There are plenty of other versions floating around in the wild, such as fmt::v10::formatter, fmt::v9::formatter, etc. And the only way to forward declare an entity is exactly as declared (including anonymous namespace), so this really is a hard blocker.

This has two consequences.

  1. Zero-setup zero-dependency solutions are impossible for {fmt}. (Note that they could be possible with other libraries in principle, but the design of {fmt} precludes it forever.)

  2. Since the formatter specialization must occur in a separate file, we risk ODR violations.

This latter point would be pretty bad: if some translation units use Au's formatter specialization, and others use the generic one, then the program is IFNDR (ill-formed, no diagnostic required). Thankfully, {fmt} version 9.0 turned the generic version into a hard compiler error, which greatly reduces the risk of ODR violations. So while this will be a pain, it's not a showstopper.

Next: provide au::QuantityFormatter, have 1-line setup

The next best thing we could do is to provide the full implementation of the formatter, and give users a 1-line setup that just uses that logic. In their project, they could create a file something like this:

#pragma once

#include "au/au.hh"
#include "fmt/format.h"

namespace fmt {
template <typename U, typename R>
struct formatter<::au::Quantity<U, R>> : ::au::QuantityFormatter<U, R, ::fmt::formatter> {};
}  // namespace fmt

Then they could include this file to get high quality formatting for Quantity. And if they try to format Quantity without including the file, then they'll get a hard compiler error as long as they're on {fmt} version 9.0 or later (which we strongly encourage).

In principle, the most appealing implementation strategy for our formatter is to hold two other formatters as member variables: one for the numeric value, and one for the unit label. Note how in the above code, we pass ::fmt::formatter as a template template parameter: this is what enables ::au::QuantityFormatter to declare the two formatter member variables.

fmt::formatter has two key member functions. parse() parses the format string to decide how to format, and format() formats the actual variable itself. The goal is to identify which substrings of the format string correspond to the numeric value formatter and unit label formatter, and call the parse() from the corresponding formatter on that substring. Indeed, this is what we did in #491, and it seemed to work really well. It's also the "officially" suggested strategy for this kind of composite formatter.

The problem is that parse() doesn't just take a substring; it takes a "format parse context" object. I tried constructing this in one way, and the tests passed, so I shipped it. But branching out to std::format showed that I got the constructor parameters wrong. I tried fixing it in #492, but a frustrating/fortunate coincidence showed me my folly. When I tried downgrading the {fmt} version in our tests from 11.2.0 to 11.0.2, the constructor signature became wildly different, with 4 parameters instead of 2. It seems there's just no reliable way to build a context from the "substring" so that I can delegate to the appopriate formatter member.

The upside is that this forced me to confront the fact that I was relying on internal implementation details of the library, a clear red flag. It's why I'm convinced that no matter what we end up doing, the current implementation will have to go. I'm greatly relieved this never made it into an official release.

I still hold out hope for this strategy. Holding formatter members and delegating the parse function would give us the friendliest user setup, and the most natural and efficient implementation. It's also suggested by the library author (in the comment linked above) as the right way to go, although admittedly this is only a comment on an issue, and the contents of the function are only hand-wavy comments. I have filed fmtlib/fmt#4508 to track this, and I hope for a good outcome. But I can't count on it in the near term, so I have to explore other ideas.

Latest: use format strings

Unless and until we get a way to construct parse calls using only publicly supported API elements, the most promising pathway to a robust (i.e., "public API elements only") solution would be to extract format strings, and make calls to format_to().

This does complicate the end user setup, because now we need to ask them to tell us how to call format_to(), since we can't take a physical dependency on the library, and we can't forward declare it. The new setup is a little ugly, but probably workable:

#pragma once

#include "au/au.hh"
#include "fmt/format.h"

namespace fmt {
struct FmtFormatterImpl {
    template <typename OutIter, typename... Args>
    static auto format_to(OutIter out, const char *fmt_str, Args &&...args) {
        return fmt::format_to(out, fmt_str, std::forward<Args>(args)...);
    }
};

template <typename U, typename R>
struct formatter<::au::Quantity<U, R>>
    : ::au::QuantityFormatter<U, R, ::fmt::formatter, FmtFormatterImpl> {};
}  // namespace fmt

Overall, this appears to work in many cases (see #494). However, it fails in C++20 mode. As far as I can tell, {fmt} takes advantage of consteval in C++20 mode and later, and the const char *fmt_str that we have constructed fails to meet the bar.

It's frustrating, because it feels like this fallback should be very easy in principle, but even this I can't get to work reliably.

Current status

Supporting {fmt} in Au just needs more bake time. We've learned a lot more about the constraints we're dealing with, and how much harder they are than we had thought. It's now virtually certain to be removed from 0.5.0, which we'll begin the process of finalizing over roughly the next week (?) or so.

But we still made tangible progress. The format string syntax is now fully mature, and I'm happy with it. So what needs to happen next is one of the following three things:

  1. We learn how to robustly pass a substring to the parse() method of another formatter (fmtlib/fmt#4508), and we provide a 1-line setup for end users.

  2. We figure out how to get the format string implementation working in C++20 mode, and we provide a slightly more complicated setup for end users.

  3. Somebody comes up with another idea.

These are roughly in order of preference. (Obviously, I don't know what option 3 actually entails, but I assume it's some kind of hideous macro hackery.)

The hard constraints in all of this are:

  1. Zero dependencies.

  2. No usage of internal implementation details of {fmt}.

  3. Support a wide variety of {fmt} versions, as well as std::format.

  4. Minimal and easy setup for end users (soft constraint).

Thanks for your patience everyone!

chiphogg avatar Aug 01 '25 15:08 chiphogg

This comment reflects my latest thinking on what implementation approach we will want to take, based on feedback from @vitaut. Dropping the link here so that I'll have it handy when I'm ready to pick this back up again.

chiphogg avatar Aug 10 '25 17:08 chiphogg