containers
containers copied to clipboard
Remove short circuiting for various IntMap functions causes some allocations regressions.
Commit e6fbb98a71c5a72e0805285cbbb799c5f978aceb causes GHCs performance tests to fail. (https://gitlab.haskell.org/ghc/ghc/-/jobs/958207)
InstanceMatching(normal) ghc/alloc 5,089,763,632 5,091,492,392 +0.0%
InstanceMatching1(normal) ghc/alloc 27,713,862,216 27,731,961,608 +0.1%
LargeRecord(normal) ghc/alloc 6,041,807,976 6,167,383,288 +2.1% BAD
ManyAlternatives(normal) ghc/alloc 731,728,384 732,205,472 +0.1%
ManyConstructors(normal) ghc/alloc 3,906,708,768 3,907,866,576 +0.0%
MultiComponentModules(normal) ghc/alloc 1,792,077,072 1,791,894,352 -0.0%
MultiComponentModulesRecomp(normal) ghc/alloc 739,981,312 739,973,248 -0.0%
MultiLayerModules(normal) ghc/alloc 3,108,214,752 3,107,719,088 -0.0%
MultiLayerModulesRecomp(normal) ghc/alloc 854,776,280 854,763,816 -0.0%
MultiLayerModulesTH_Make(normal) ghc/alloc 628,756,288 631,253,840 +0.4%
MultiLayerModulesTH_OneShot(normal) ghc/alloc 2,708,437,104 2,708,683,336 +0.0%
PmSeriesG(normal) ghc/alloc 43,846,344 44,058,552 +0.5%
PmSeriesS(normal) ghc/alloc 53,097,680 53,147,336 +0.1%
PmSeriesT(normal) ghc/alloc 76,295,096 76,708,768 +0.5%
PmSeriesV(normal) ghc/alloc 52,503,896 52,625,760 +0.2%
T10421(normal) ghc/alloc 110,969,584 111,078,784 +0.1%
T10421a(normal) ghc/alloc 77,539,344 77,582,480 +0.1%
T10547(normal) ghc/alloc 27,836,328 27,878,480 +0.2%
T10858(normal) ghc/alloc 126,696,896 126,758,272 +0.0%
T11195(normal) ghc/alloc 234,317,584 234,768,200 +0.2%
T11276(normal) ghc/alloc 93,173,784 93,327,368 +0.2%
T11303b(normal) ghc/alloc 40,094,728 40,074,024 -0.1%
T11374(normal) ghc/alloc 191,132,936 191,134,640 +0.0%
T11545(normal) ghc/alloc 939,163,880 939,193,288 +0.0%
T11822(normal) ghc/alloc 107,042,504 107,212,856 +0.2%
T12150(optasm) ghc/alloc 77,408,880 77,454,568 +0.1%
T12227(normal) ghc/alloc 473,925,064 480,160,096 +1.3% BAD
T12234(optasm) ghc/alloc 55,823,640 55,835,792 +0.0%
T12425(optasm) ghc/alloc 87,115,072 87,270,448 +0.2%
T12545(normal) ghc/alloc 1,667,437,672 1,720,649,048 +3.2%
T12707(normal) ghc/alloc 943,607,728 945,187,400 +0.2%
T13035(normal) ghc/alloc 101,286,568 101,366,424 +0.1%
T13056(optasm) ghc/alloc 362,261,640 363,079,832 +0.2%
T13253(normal) ghc/alloc 338,486,424 339,347,752 +0.3%
T13253-spj(normal) ghc/alloc 123,347,384 123,449,552 +0.1%
T13379(normal) ghc/alloc 352,877,984 352,973,008 +0.0%
T13386(normal) ghc/alloc 867,199,000 880,838,576 +1.6% BAD
T13701(normal) ghc/alloc 2,392,939,488 2,392,479,344 -0.0%
T13719(normal) ghc/alloc 4,971,686,840 4,933,841,200 -0.8%
T14052(ghci) ghc/alloc 3,613,484,344 3,616,095,496 +0.1%
T14052Type(ghci) ghc/alloc 6,323,987,176 6,324,274,824 +0.0%
T14683(normal) ghc/alloc 2,986,938,912 3,008,927,472 +0.7%
T14697(normal) ghc/alloc 342,160,192 342,064,080 -0.0%
T15164(normal) ghc/alloc 1,285,202,776 1,288,030,896 +0.2%
T15304(normal) ghc/alloc 1,246,949,792 1,250,268,336 +0.3%
T15630(normal) ghc/alloc 157,893,456 157,940,168 +0.0%
T15703(normal) ghc/alloc 514,823,488 522,608,456 +1.5% BAD
T16577(normal) ghc/alloc 7,567,314,120 7,649,470,920 +1.1%
T16875(normal) ghc/alloc 34,586,464 34,578,160 -0.0%
T17096(normal) ghc/alloc 217,397,568 217,925,448 +0.2%
T17516(normal) ghc/alloc 1,717,467,456 1,721,208,832 +0.2%
T17836(normal) ghc/alloc 837,839,944 839,389,344 +0.2%
T17836b(normal) ghc/alloc 45,019,864 45,008,160 -0.0%
T17977(normal) ghc/alloc 39,827,664 39,800,408 -0.1%
T17977b(normal) ghc/alloc 37,605,824 37,571,680 -0.1%
T18140(normal) ghc/alloc 75,405,160 75,410,736 +0.0%
T18223(normal) ghc/alloc 1,117,577,272 1,148,818,384 +2.8% BAD
T18282(normal) ghc/alloc 148,271,824 148,390,728 +0.1%
T18304(normal) ghc/alloc 90,673,016 90,676,992 +0.0%
T18478(normal) ghc/alloc 507,089,832 508,140,656 +0.2%
T18698a(normal) ghc/alloc 199,865,624 200,035,808 +0.1%
T18698b(normal) ghc/alloc 221,130,064 221,332,784 +0.1%
T18923(normal) ghc/alloc 70,802,584 70,873,328 +0.1%
T1969(normal) ghc/alloc 749,857,072 750,288,264 +0.1%
T19695(normal) ghc/alloc 1,422,947,528 1,426,109,104 +0.2%
T20049(normal) ghc/alloc 89,814,808 89,759,520 -0.1%
T20261(normal) ghc/alloc 616,696,800 621,478,856 +0.8%
T3064(normal) ghc/alloc 202,526,136 205,106,112 +1.3%
T3294(normal) ghc/alloc 1,623,091,088 1,622,053,560 -0.1%
T4801(normal) ghc/alloc 302,406,944 303,090,288 +0.2%
T5030(normal) ghc/alloc 351,294,664 359,386,352 +2.3% BAD
T5321FD(normal) ghc/alloc 249,192,352 249,317,824 +0.1%
T5321Fun(normal) ghc/alloc 276,394,384 276,466,184 +0.0%
T5631(normal) ghc/alloc 565,954,304 577,675,064 +2.1% BAD
T5642(normal) ghc/alloc 462,959,792 468,391,080 +1.2%
T5837(normal) ghc/alloc 35,721,056 35,682,016 -0.1%
T6048(optasm) ghc/alloc 100,822,496 100,887,408 +0.1%
T783(normal) ghc/alloc 380,353,976 380,735,504 +0.1%
T8095(normal) ghc/alloc 3,124,233,344 3,243,822,096 +3.8% BAD
T9020(optasm) ghc/alloc 242,711,104 245,175,384 +1.0%
T9198(normal) ghc/alloc 46,243,688 46,223,936 -0.0%
T9233(normal) ghc/alloc 715,117,200 714,845,768 -0.0%
T9630(normal) ghc/alloc 1,510,533,960 1,520,912,432 +0.7%
T9675(optasm) ghc/alloc 437,013,336 436,703,880 -0.1%
T9872a(normal) ghc/alloc 1,718,750,472 1,685,475,544 -1.9% GOOD
T9872b(normal) ghc/alloc 1,963,440,192 1,945,595,120 -0.9%
T9872b_defer(normal) ghc/alloc 2,996,925,768 2,981,205,512 -0.5%
T9872c(normal) ghc/alloc 1,681,215,352 1,663,474,808 -1.1% GOOD
T9872d(normal) ghc/alloc 396,884,424 397,837,272 +0.2%
T9961(normal) ghc/alloc 351,621,368 352,289,600 +0.2%
TcPlugin_RewritePerf(normal) ghc/alloc 2,168,137,720 2,150,244,128 -0.8%
WWRec(normal) ghc/alloc 588,926,384 589,396,496 +0.1%
hard_hole_fits(normal) ghc/alloc 480,547,232 480,925,096 +0.1%
hie002(normal) ghc/alloc 9,747,820,752 9,380,086,064 -3.8%
parsing001(normal) ghc/alloc 452,303,240 452,305,024 +0.0%
The issue for us in Data.IntMap.Internal.lookup, the modification of this function makes it small enough that Worker/Wrapper doesn't trigger on it:
https://gist.github.com/mpickering/74590e7ababb7a82abc4b04749456ccd
The consequence is that different things get inlined into lookupVarEnv (a function in GHC)
https://gist.github.com/mpickering/cd5a3c7baf1ae678d1a830cf76b2c56e
Then lookupVarEnv doesn't get inlined (because it is now too big) and every time it's called a function is allocated.
OTOH, if the function is like how it was before then the wrapper gets inlined into lookupVarEnv, lookupVarEnv is the inlined and the worker subsequently inlined into the right place.
Great analysis - but my conclusion is not
Commit https://github.com/haskell/containers/commit/e6fbb98a71c5a72e0805285cbbb799c5f978aceb causes GHCs performance tests to fail.
but this instead:
GHC's (quite unpredictable) inlining mechanics causes GHCs performance tests to fail.
It's not clear to me that rolling back the commit here would bring a stable improvement for GHC. The discussion https://github.com/haskell/containers/issues/794 https://github.com/haskell/containers/pull/800 seemed to show that there were some good reasons, backed up with measurements.
@nomeata perhaps saw something coming: https://github.com/haskell/containers/pull/800#issuecomment-922498670 Now would "offer both" help?
Agreed; this commit can't be blamed for subtle inlining interactions, unless it does something wrong with inlining.