IntMap should offer lookup variant for likely successful lookup
Currently lookup is defined to fail early by checking if the prefix mask agrees with the key:
lookup :: Key -> IntMap a -> Maybe a
lookup !k = go
  where
    go (Bin p m l r)
      | nomatch k p m = Nothing
      | zero k m      = go l
      | otherwise     = go r
    go (Tip kx x)
      | k == kx       = Just x
      | otherwise     = Nothing
    go Nil            = Nothing
In the original paper (Okasaki and Gill, "Fast Mergeable Integer Maps") they note that if we expect our lookups to be successful, then we should instead prioritize testing whether the kth bit is zero and fail only when we reach a leaf. If I understand correctly, then this would just be the following code:
lookupSucc :: Key -> IntMap a -> Maybe a
lookupSucc !k = go
  where
    go (Bin _p m l r)
      | zero k m  = go l
      | otherwise = go r
    go (Tip kx x)
      | k == kx   = Just x
      | otherwise = Nothing
    go Nil        = Nothing
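For checking the equivalence claim, here is a self-contained toy model of the big-endian Patricia trie (a sketch assuming non-negative keys only; the helpers zero, mask, nomatch and branchMask follow the Okasaki–Gill scheme but are not the actual containers internals, which use unboxed Nat arithmetic):

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Bits

type Key    = Int
type Prefix = Int
type Mask   = Int

data IntMap a = Bin !Prefix !Mask (IntMap a) (IntMap a) | Tip !Key a | Nil

-- Is the branching bit m of key k zero (descend left) or one (descend right)?
zero :: Key -> Mask -> Bool
zero k m = k .&. m == 0

-- Keep only the bits of k above the branching bit m.
mask :: Key -> Mask -> Prefix
mask k m = k .&. (complement (m - 1) `xor` m)

-- The early-exit test that lookupSucc omits: k cannot live under prefix p.
nomatch :: Key -> Prefix -> Mask -> Bool
nomatch k p m = mask k m /= p

-- The highest bit in which two prefixes differ (assumes p1 /= p2, both >= 0).
branchMask :: Prefix -> Prefix -> Mask
branchMask p1 p2 = bit (finiteBitSize x - 1 - countLeadingZeros x)
  where x = p1 `xor` p2

-- Join two subtrees whose prefixes disagree.
link :: Key -> IntMap a -> Key -> IntMap a -> IntMap a
link k1 t1 k2 t2
  | zero k1 m = Bin p m t1 t2
  | otherwise = Bin p m t2 t1
  where
    m = branchMask k1 k2
    p = mask k1 m

insert :: Key -> a -> IntMap a -> IntMap a
insert !k x t = case t of
  Bin p m l r
    | nomatch k p m -> link k (Tip k x) p t
    | zero k m      -> Bin p m (insert k x l) r
    | otherwise     -> Bin p m l (insert k x r)
  Tip kx _
    | k == kx       -> Tip k x
    | otherwise     -> link k (Tip k x) kx t
  Nil               -> Tip k x

-- The fast-fail variant (as in containers today) and the proposed one.
lookupFF, lookupSucc :: Key -> IntMap a -> Maybe a
lookupFF !k = go
  where
    go (Bin p m l r)
      | nomatch k p m = Nothing
      | zero k m      = go l
      | otherwise     = go r
    go (Tip kx x)     = if k == kx then Just x else Nothing
    go Nil            = Nothing
lookupSucc !k = go
  where
    go (Bin _ m l r)  = if zero k m then go l else go r
    go (Tip kx x)     = if k == kx then Just x else Nothing
    go Nil            = Nothing

-- The two variants agree on every key, present or absent: a miss without
-- the nomatch test simply ends at a Tip with a different key.
agree :: Bool
agree = all (\k -> lookupFF k m == lookupSucc k m) [0 .. 400]
  where m = foldr (\k -> insert k (2 * k)) Nil [0, 3 .. 300]
```

In this model the only structural difference is that lookupSucc keeps descending past mismatching prefixes until it reaches a leaf, which is exactly the trade discussed below.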
Is that correct? It seems to pass the tests in any case. This can certainly lead to large speed-ups when we know we are doing many successful lookups. For example, in the lookup benchmark we get the following (this is a best-case scenario, as the benchmark does not test a map with misses, which seems like a bad design for a benchmark):
benchmarked lookup
time 172.2 μs (170.3 μs .. 174.1 μs)
0.998 R² (0.994 R² .. 1.000 R²)
mean 176.6 μs (175.3 μs .. 181.2 μs)
std dev 7.148 μs (2.786 μs .. 15.37 μs)
benchmarked lookupSucc
time 153.0 μs (151.6 μs .. 154.0 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 152.2 μs (151.7 μs .. 152.7 μs)
std dev 1.783 μs (1.306 μs .. 2.362 μs)
Even if this is not quite the correct implementation, containers should probably offer this sort of function to allow users to tune their code to their needs as both variants are worthwhile.
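To make the missing-miss criticism concrete, here is a crude, base-plus-containers-only sketch of a hit-vs-miss measurement (a proper benchmark would use criterion or gauge; the 1..2^12 key range and the shift producing guaranteed misses are assumptions mirroring this discussion, not the actual benchmark suite):

```haskell
import qualified Data.IntMap.Strict as M
import System.CPUTime (getCPUTime)

-- Sum up all successful lookups; misses contribute nothing.
sumLookups :: M.IntMap Int -> [Int] -> Int
sumLookups m = foldr (\k acc -> maybe acc (+ acc) (M.lookup k m)) 0

-- Force a result and report (very roughly) how long it took.
timed :: String -> Int -> IO Int
timed label s = do
  t0 <- getCPUTime
  t1 <- s `seq` getCPUTime
  putStrLn (label ++ ": " ++ show ((t1 - t0) `div` 1000000) ++ " μs")
  pure s

main :: IO ()
main = do
  let m      = M.fromList [(k, k) | k <- [1 .. 2 ^ 12]]
      hits   = [1 .. 2 ^ 12]
      misses = map (+ 2 ^ 12) hits  -- every key lies beyond the map
  sHits   <- timed "all hits  " (sumLookups m hits)
  sMisses <- timed "all misses" (sumLookups m misses)
  print (sHits, sMisses)  -- the misses sum to 0
```

With -O2 and enough repetitions this kind of harness would show the fast-fail advantage on the all-misses list; criterion gives far more reliable numbers than a single getCPUTime delta.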
This sounds like an excellent idea. Could you briefly raise the question of naming on the libraries list? It's even possible we'd want to call it lookup, and name the fast-fail version something else. Did you test how bad this version is in an often-failing situation?
Here is how it looks on the original benchmark, the same benchmark with 50% hits, with 90% hits and with all failures:
original fast-fail version:
benchmarked lookup
time 183.5 μs (182.2 μs .. 185.0 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 182.8 μs (182.3 μs .. 183.3 μs)
std dev 1.673 μs (1.292 μs .. 2.405 μs)
benchmarked lookup_half
time 101.4 μs (100.4 μs .. 102.9 μs)
0.999 R² (0.998 R² .. 1.000 R²)
mean 100.9 μs (100.6 μs .. 101.3 μs)
std dev 1.195 μs (828.7 ns .. 1.764 μs)
benchmarked lookup_most
time 169.2 μs (168.3 μs .. 170.2 μs)
1.000 R² (0.999 R² .. 1.000 R²)
mean 168.4 μs (168.0 μs .. 168.9 μs)
std dev 1.558 μs (1.255 μs .. 2.051 μs)
benchmarked lookup_fails
time 23.82 μs (23.70 μs .. 23.97 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 23.95 μs (23.86 μs .. 24.19 μs)
std dev 447.1 ns (176.3 ns .. 882.7 ns)
version without fast-fail:
benchmarked lookup
time 152.1 μs (149.6 μs .. 156.0 μs)
0.996 R² (0.992 R² .. 1.000 R²)
mean 154.3 μs (153.0 μs .. 156.3 μs)
std dev 5.418 μs (3.935 μs .. 7.282 μs)
variance introduced by outliers: 17% (moderately inflated)
benchmarked lookup_half
time 151.2 μs (150.2 μs .. 152.6 μs)
0.999 R² (0.999 R² .. 1.000 R²)
mean 151.7 μs (151.2 μs .. 152.4 μs)
std dev 1.902 μs (1.390 μs .. 2.776 μs)
benchmarked lookup_most
time 143.9 μs (143.4 μs .. 144.5 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 145.2 μs (144.8 μs .. 145.9 μs)
std dev 1.829 μs (1.350 μs .. 2.582 μs)
benchmarked lookup_fails
time 150.2 μs (149.0 μs .. 151.7 μs)
0.999 R² (0.999 R² .. 1.000 R²)
mean 150.5 μs (149.9 μs .. 151.5 μs)
std dev 2.544 μs (1.804 μs .. 3.565 μs)
I can definitely ask about naming on the libraries list tomorrow, yes!
I can't see any good reason that find should not be this variant, since it otherwise throws an error.
Agreed regarding find, but we need to settle the safe versions too.
A name that does not suggest success is query. This could be a fast-failing alternative to lookup (which still suggests success, albeit not as strongly as find).
P.S.: In Agda, we do not use find, but lookup, which gives us control over the error, e.g.: https://github.com/agda/agda/blob/c9c20bf3683f9eaf9987b80257369be2a45d452b/src/full/Agda/Utils/Graph/AdjacencyMap/Unidirectional.hs#L564
findMaybe?
Could y'all also please comment on whether you think lookup should continue to be fast-failing or not?
Maybe lookup should get a HasCallStack constraint to distinguish call sites, keep stats on whether failure is frequent, persist that data transparently in /var/run/… and pick the right implementation based on that.
Could y'all also please comment on whether you think lookup should continue to be fast-failing or not?
The usual dilemma: (1) some apps out there might have been tested to perform well with the present fast-failing lookup; (2) other apps might rather use lookup with an expectation of success and would benefit from a switch to non-fast-failing. Without a field study, we wouldn't know.
In the absence of real-world data, I can only recommend the conservative path: leave lookup as it is, but fork it into two new variants (e.g. findMaybe and query, or lookupShouldSucceed and lookupMayFail) that bring the issue to the attention of the user. Possibly deprecate lookup, but then it would only be deprecated for IntMap, whereas we strive for a uniform interface across all the collections, and elsewhere lookup does not split... (complications, sigh!).
We might be overthinking this. The performance difference isn't so great that this change would hurt anyone more than other likely optimizations that were simply introduced in prior versions, so making a change should be fine, I believe. That by itself doesn't decide whether to make the change, but my gut feeling says yes: it makes the run time of the function more uniform in general.
Here is how it looks on the original benchmark, the same benchmark with 50% hits, with 90% hits and with all failures:
How did you set up the “all failures” test?
I expect that an “all failures (but uniform distribution of keys)” test will behave very differently from an “all failures (but existing keys are far away from the queries)” test. In fact, I expect that for usages where the keys of the failed lookups are “within” the range of the existing values, the new code is actually faster for the failed lookups too.
That makes me wonder if it is justifiable to just use new code always?
@nomeata I did not do anything sensible. The current benchmark has keys 1..2^12, so I just shifted them all by 2^12. In general the IntMap benchmarks could do with some attention - I'll see if I can add something a bit more reasonable in this change.
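One possibly more principled setup (a sketch against the public Data.IntMap API; benchMap and the key lists are made-up names): keep only even keys in the map, so odd keys give misses inside the key range ("near" misses), while shifted keys fail the very first prefix check ("far" misses) under the current fast-fail lookup:

```haskell
import qualified Data.IntMap.Strict as M

bound :: Int
bound = 2 ^ 12

-- Only even keys are present.
benchMap :: M.IntMap Int
benchMap = M.fromList [(2 * k, k) | k <- [1 .. bound]]

hitKeys, nearMissKeys, farMissKeys :: [Int]
hitKeys      = [2, 4 .. 2 * bound]        -- all present
nearMissKeys = [1, 3 .. 2 * bound]        -- absent, but inside the key range
farMissKeys  = map (+ 4 * bound) hitKeys  -- absent and past the maximum key

-- How many of the given keys are actually in the map.
countHits :: [Int] -> Int
countHits ks = length [() | k <- ks, M.member k benchMap]
```

Benchmarking lookup over each of the three lists separately would then distinguish the hit, near-miss and far-miss costs instead of conflating them.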
@nomeata Your suspicion was correct: if we use lookup on lots of near misses, then the new version is faster.
@sjakobi When making these changes it would be nice to know how it impacts GHC as a real-world consumer of IntMap, do you have any advice on how to do that?
JFTR, conversation has continued at https://github.com/haskell/containers/pull/800
@sjakobi When making these changes it would be nice to know how it impacts GHC as a real-world consumer of IntMap, do you have any advice on how to do that?
Yes, that would be very good to check! Unfortunately I'm not aware of any good documentation on doing this – although it should definitely exist, either in the GHC wiki or even in this repo.
The basic process is that you update the libraries/containers submodule in GHC to use your branch and then build some sample modules, comparing allocations and timings against a build with an unmodified GHC. Cabal is a popular sample package to test, because it's directly available in libraries/Cabal and doesn't have additional dependencies.
I've previously done a tiny bit of testing for #340: See https://github.com/haskell/containers/pull/340#issuecomment-610400875
@doyougnu has done more testing and may have better advice: https://github.com/haskell/containers/pull/340#issuecomment-886565961
comparing allocations and timings
Allocations will be useless here, I expect, as (laziness aside) both functions should allocate the same (the returned Just/Nothing). Also make sure to change find as well - I assume GHC uses that a lot. Timing is always quite noisy, though; for this one, I'd consider a comparison of instruction counts a good first indication.
Ben Gamari gave me some ideas for what to do in order to measure a cabal build; I'm just waiting on GHC to rebuild.
You can read through the commits in this branch to get a step-by-step guide on altering the boot libs (although I'm adding two boot libs in that branch). Instead of adding a boot lib, you would just change its ref in $GHC_ROOT/.gitmodules and update the submodule.
You'll want to mimic this commit: https://gitlab.haskell.org/doyougnu/ghc/-/commit/297fe09edbfab7c5a358b0485090a4ed8121f427
And these docs will be useful: https://gitlab.haskell.org/ghc/ghc/-/wikis/working-conventions/git/submodules
If you run into trouble I can test it out sometime next week or continue to offer advice here.
I'm +1 for this change and would be interested in observing how ghc reacts to this increase in the cost of lookup misses.