Taken from fmtlog (emphasis mine)

The results on a linux server with "Intel(R) Xeon(R) Gold 6144 CPU @ 3.50GHz" is:

Message	fmtlog	Nanolog	spdlog
staticString	6.4 ns, 7.08 M/s	6.5 ns, 33.10 M/s	156.4 ns, 6.37 M/s
stringConcat	6.4 ns, 6.05 M/s	7.5 ns, 14.20 M/s	209.4 ns, 4.77 M/s
singleInteger	6.3 ns, 6.22 M/s	6.5 ns, 50.29 M/s	202.3 ns, 4.94 M/s
twoIntegers	6.4 ns, 4.87 M/s	6.6 ns, 39.25 M/s	257.2 ns, 3.89 M/s
singleDouble	6.2 ns, 5.37 M/s	6.5 ns, 39.62 M/s	225.0 ns, 4.44 M/s
complexFormat	6.4 ns, 2.95 M/s	6.7 ns, 24.30 M/s	390.9 ns, 2.56 M/s

Note that the throughput of Nanolog is not comparable here because it outputs to binary log file instead of human-readable text format, e.g. it saves an int64 timestamp instead of a long formatted date time string.

How can fmtlog achieve such low and stable latency? Two key optimization techniques are employed inspired by Nanolog:

One is allocating a single producer single consumer queue for each logging thread, and have the background thread polling for all these queues. This avoids threads contention and performance will not deteriorate when thread number increases. The queue is automatically created on the first log msg of a thread, so queue is not created for threads that don't use fmtlog. The thread queue has a default size of 1 MB(can be changed by macro FMTLOG_QUEUE_SIZE), and it takes a little time to allocate the queue. It's recommended that user actively calls fmt::preallocate() once the thread is created, so even the first log can have low latency.

What happens when the queue is full? By default, fmtlog simply dump addtional log msgs and return. Alternatively, front-end logging can be blocked while the queue is full by defining macro FMTLOG_BLOCK=1, then no log will be missing. User can register a callback function when log queue is full by fmtlog::setLogQFullCB(cb, userData), by which user can be aware that the consumer(polling thread) is not keeping up. Normally, the queue being full is seldom a problem, but incautious user could leave log statements that are invoked in an unexpected high frequency, e.g. a tcp client spamming with "Connection refused" errors without a connection retry delay. To handle this problem in an elegant way, fmtlog provides a log macro which limits frequency of this log: FMTLOG_LIMIT and 4 shortcuts logil, logwl and logel respectively, user needs to pass the mininum interval in nanosecond as the first argument, e.g.

logil(1e9, "this log will be displayed at most once per second");

The other optimization is that static information of a log(such as format string, log level and location) is saved in a table at its first call, and fmtlog simply pushes the index of the static info table entry with dynamic arguments to the queue, minimizing the msg size. In addition, fmtlog defines a decoding function for each log statment, which is invoked in fmtlog::poll()when the log msg is popped from the queue.

However, these decoding functions bloat program size with each function consuming around 50 bytes. In addition, the static infomation entry also consumes 50-ish bytes runtime memory for each log statement. Such memory overhead may not be worthwhile for those infrequent and latency insensitive logs(e.g. program initialization info), thus fmtlog provides user with another log macro which disables this optimization: FMTLOG_ONCE and of couse shortcuts: logdo, logio, logwoand logeo. FMTLOG_ONCE will not create a static info table entry, nor add a decoding function: it pushes static info along with formatted msg body onto the queue. Note that passing argument by pointer is not supported by `FMTLOG_ONCE`.

For those who prefer to further optimize memory usage by filtering log at compile time, macro FMTLOG_ACTIVE_LEVEL is applied with a default value FMTLOG_LEVEL_INF, meaning debug logs will simply be discarded at compile time. Note that FMTLOG_ACTIVE_LEVEL only applies to log shortcut macros, e.g. logi, but not FMTLOG. Similarly, runtime log level filtering can be disabled by defining macro FMTLOG_NO_CHECK_LEVEL, which will increase performance and reduce generated code size a bit.

Is there a way in `spdlog` to achieve similar speed? Or at least access the same set of features.

Nov 05 '23 12:11 Theramar

spdlog async is much slower than synchronous spdlog. It is interesting to compare to the synchronous version of spdlog.

Nov 05 '23 12:11 gabime

Surely it would be nice however IIRC spdlog does not offer an binary encoding/decoding (see Nanolog) mechanism which can increase performance

Nov 05 '23 14:11 Theramar

@gabime the benchmark shows synchronous spdlog, see MengRao/fmtlog#84

Nov 06 '23 13:11 Theramar

any updates?

Nov 08 '23 19:11 Theramar

I am not familiar with how fmtlog works, but it does not appear to support enhancements by developers. spdlog is extensible in that developers can define subclasses of sink. Instead, there is the overhead of calling virtual functions at runtime, which can have a significant impact on print speed (and thus limits compile-time optimization).

Most of fmtlog's optimizations are not likely to be imported into spdlog, but a PR that improves print speed would be welcome.

Nov 08 '23 21:11 tt4g

seems fmtlog compare with sync spdlog while itself use async mode inside. So that is unfair

Nov 29 '23 02:11 hanzz2007

spdlog
spdlog copied to clipboard

faster print time (< 10ns)

Is there a way in `spdlog` to achieve similar speed? Or at least access the same set of features.

spdlog spdlog copied to clipboard

faster print time (< 10ns)

Is there a way in spdlog to achieve similar speed? Or at least access the same set of features.

spdlog
spdlog copied to clipboard

Is there a way in `spdlog` to achieve similar speed? Or at least access the same set of features.