text-show
Use Template Haskell to generate the code for generic showb/showt/showtl
By a series of unusual circumstances, I generate all of the code used for generically implementing `showbPrec`, `showtPrec`, and `showtlPrec` using CPP. And it's seriously ugly.
We should replace this with Template Haskell quotes, which will (1) make this look more like actual Haskell code, and (2) make it much more maintainable in the future.
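Roughly, the idea would be something like the following toy sketch: one quoted template, spliced once per textual flavor, instead of CPP token pasting. (The names `showbDemo`/`showtDemo` and the trivial function bodies are invented for illustration; the real generator would produce the full `showbPrec`-style instances.)

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Language.Haskell.TH

-- One top-level splice generates a family of near-identical
-- definitions, one per name in the list. The quoted body is real,
-- typechecked Haskell rather than pasted CPP tokens.
$(let
    mk :: String -> Q Dec
    mk nm = do
      body <- [| \p -> "prec " ++ show (p :: Int) |]
      pure (FunD (mkName nm) [Clause [] (NormalB body) []])
  in mapM mk ["showbDemo", "showtDemo"])

main :: IO ()
main = putStrLn (showbDemo 11 ++ " / " ++ showtDemo 0)
```

The point is that the shared implementation lives in an ordinary quotation that GHC parses and typechecks, and only the per-type variations are supplied programmatically.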
Having taken a quick look at this, I'm not convinced that a Template Haskell-based solution would necessarily be much cleaner. It might even present its own unique set of maintenance challenges.
As an alternative, I'm wondering if it would be better to define a class like:

```haskell
class (IsString a, Monoid a) => TextLike a where
  showSpace :: a
  showParen :: Bool -> a -> a
  fromChar  :: Char -> a
```
And then make all of the operations in `GTextShow` et al. polymorphic over a `TextLike` constraint. This has the potential to be slower if the class constraints aren't optimized away, but I'm somewhat hopeful that GHC can specialize this well. The only way to be sure is to benchmark.
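To make the shape of the idea concrete, here is a minimal sketch that needs nothing beyond base: a `String` instance of the proposed `TextLike` class (the real instances would be for `Builder`, strict `Text`, and lazy `Text`), plus one pretty-printer written once at the polymorphic type. `gShowPair` is an invented name for illustration, not part of text-show's API.

```haskell
{-# LANGUAGE FlexibleInstances #-}
module Main where

import Prelude hiding (showParen)
import Data.String (IsString (..))

-- The proposed class from the comment above.
class (IsString a, Monoid a) => TextLike a where
  showSpace :: a
  showParen :: Bool -> a -> a
  fromChar  :: Char -> a

-- Demonstration instance only; the real targets are the text types.
instance TextLike String where
  showSpace     = " "
  showParen b s = if b then "(" <> s <> ")" else s
  fromChar c    = [c]

-- A showbPrec-style function written once, reusable at every
-- TextLike type instead of once per textual flavor.
gShowPair :: TextLike a => Int -> Int -> Int -> a
gShowPair p x y =
  showParen (p > 10) $
    fromString "Pair" <> showSpace <> fromString (show x)
                      <> showSpace <> fromString (show y)

main :: IO ()
main = putStrLn (gShowPair 11 1 2 :: String)
```

The performance question is whether GHC specializes `gShowPair`-style definitions at each instance or leaves dictionary passing in the hot path.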
I pushed a proof-of-concept implementation of the ideas in https://github.com/RyanGlScott/text-show/issues/33#issuecomment-1211919793 to the `T33-alt` branch. Unfortunately, the benchmarks confirm my fears that this approach is not as efficient as the current CPP-based approach. Here are the generics-related benchmarks on the `master` branch (as of commit d700eb15b64db8d909dd78ea3a997ca1c81005ef) using GHC 9.2.4:
```
benchmarking TextShow (generics)/Small sample
time                 2.751 μs   (2.746 μs .. 2.754 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.750 μs   (2.747 μs .. 2.752 μs)
std dev              6.901 ns   (5.313 ns .. 9.403 ns)

benchmarking TextShow (generics)/Medium sample
time                 3.628 ms   (3.625 ms .. 3.630 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 3.619 ms   (3.616 ms .. 3.621 ms)
std dev              9.111 μs   (7.189 μs .. 11.97 μs)

benchmarking TextShow (generics)/Large sample
time                 538.1 ms   (530.2 ms .. 549.1 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 528.4 ms   (522.5 ms .. 533.1 ms)
std dev              5.875 ms   (2.538 ms .. 7.021 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking Enumeration type/TextShow (generics)
time                 146.0 ns   (145.8 ns .. 146.2 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 145.7 ns   (145.5 ns .. 145.9 ns)
std dev              649.6 ps   (501.5 ps .. 894.3 ps)
```
And here are the same benchmarks on the `T33-alt` branch:
```
benchmarking TextShow (generics)/Small sample
time                 4.752 μs   (4.746 μs .. 4.757 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 4.725 μs   (4.714 μs .. 4.736 μs)
std dev              36.83 ns   (32.66 ns .. 41.43 ns)

benchmarking TextShow (generics)/Medium sample
time                 6.574 ms   (6.567 ms .. 6.582 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 6.565 ms   (6.559 ms .. 6.570 ms)
std dev              15.47 μs   (12.14 μs .. 21.05 μs)

benchmarking TextShow (generics)/Large sample
time                 902.8 ms   (885.9 ms .. 924.3 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 887.2 ms   (877.0 ms .. 894.8 ms)
std dev              10.09 ms   (5.515 ms .. 12.58 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking Enumeration type/TextShow (generics)
time                 202.5 ns   (202.2 ns .. 202.8 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 202.7 ns   (202.4 ns .. 202.9 ns)
std dev              788.5 ps   (677.9 ps .. 989.1 ps)
```
I'll keep the `T33-alt` branch around just in case I have some sort of bright idea on how to optimize it later, but for the time being, I don't think it's feasible to use the `TextLike` approach.