logging-log4j2 Clean up thread context map implementations

We bring some order in the current variety of ThreadContextMap implementations by:

replacing DefaultThreadContextMap with the more performant StringArrayThreadContextMap introduced in #2330,
removing the CopyOnWriteSortedArrayThreadContextMap, since it has no advantages over the default one. It doesn't even guarantee garbage-free logging.
moving the GarbageFreeSortedArrayThreadContextMap to log4j-core, since no other logging implementation can profit from it.

The documentation has been updated to include the fact that the configuration options for ThreadContext actually depend on the underlying implementation:

Log4j Core users have access to all the ThreadContext configuration options,
Simple Logger users have a more restricted list of config options (they can not use the garbage free map any more). We could even restrict this further by allowing only the log4j2.simplelogShowContextMap and log4j2.isThreadContextMapInheritable options,
Logback users have no configuration options, since the thread context map is hardcoded to use MDC,
JUL users have no configuration options, since the thread context map is hardcoded to no-op.

This PR deprecates #2593.

Jun 24 '24 14:06 ppkarwasz

Before merging this please ensure you look into the issues I had with the performance tests and StringArrayThreadContextMap.

Jun 25 '24 13:06 rgoers

@rgoers,

I can not reproduce the issues in this branch. I suspect that they were due to the changes in the ThreadContextDataInjector (which I reverted).

The performance is similar to the GcFreeSortedArrayThreadContextMap, which seems reasonable:

Benchmark                            (count)  (threadContextMapAlias)  Mode  Cnt     Score    Error  Units
ThreadContextBenchmark.putAndRemove        5                  Default  avgt   25    40,337 ±  2,409  ns/op
ThreadContextBenchmark.putAndRemove        5             NoGcOpenHash  avgt   25    10,500 ±  0,446  ns/op
ThreadContextBenchmark.putAndRemove        5          NoGcSortedArray  avgt   25    46,233 ±  0,484  ns/op
ThreadContextBenchmark.putAndRemove       50                  Default  avgt   25   129,873 ±  2,131  ns/op
ThreadContextBenchmark.putAndRemove       50             NoGcOpenHash  avgt   25    12,334 ±  3,297  ns/op
ThreadContextBenchmark.putAndRemove       50          NoGcSortedArray  avgt   25    65,971 ±  0,742  ns/op
ThreadContextBenchmark.putAndRemove      500                  Default  avgt   25  1192,296 ± 21,985  ns/op
ThreadContextBenchmark.putAndRemove      500             NoGcOpenHash  avgt   25    15,351 ±  2,341  ns/op
ThreadContextBenchmark.putAndRemove      500          NoGcSortedArray  avgt   25   117,499 ±  1,333  ns/op

The poor performance when the context map has 50 or more entries is due to the remove operation, since the array is not sorted and the element was added at the end.

Large context maps and the remove operation itself are rather rare, but I can optimize it by reverting the order of the search.

Aug 15 '24 10:08 ppkarwasz

After reversing the lookup order in remove() the benchmark improves slightly:

Benchmark                            (count)  (threadContextMapAlias)  Mode  Cnt    Score    Error  Units
ThreadContextBenchmark.putAndRemove        5                  Default  avgt   25   37,141 ±  2,795  ns/op
ThreadContextBenchmark.putAndRemove        5             NoGcOpenHash  avgt   25   12,097 ±  2,007  ns/op
ThreadContextBenchmark.putAndRemove        5          NoGcSortedArray  avgt   25   46,352 ±  0,285  ns/op
ThreadContextBenchmark.putAndRemove       50                  Default  avgt   25   82,210 ±  1,496  ns/op
ThreadContextBenchmark.putAndRemove       50             NoGcOpenHash  avgt   25   15,974 ±  2,184  ns/op
ThreadContextBenchmark.putAndRemove       50          NoGcSortedArray  avgt   25   67,429 ±  1,369  ns/op
ThreadContextBenchmark.putAndRemove      500                  Default  avgt   25  731,671 ± 13,308  ns/op
ThreadContextBenchmark.putAndRemove      500             NoGcOpenHash  avgt   25   18,260 ±  5,095  ns/op
ThreadContextBenchmark.putAndRemove      500          NoGcSortedArray  avgt   25  117,670 ±  1,649  ns/op

Aug 15 '24 15:08 ppkarwasz

@rgoers,

Unless we find some problems, I will merge this PR next week. A reduced version of the ThreadContextvsScopedContextBenchmark is in the ThreadContextBenchmark2 class.

Aug 16 '24 09:08 ppkarwasz

I've only reviewed functional and documentation changes, they LGTM. (Did not have time to check the tests.)

Thanks, I am merging it.

Aug 28 '24 13:08 ppkarwasz