bloomfilter
"capacity too large to represent" error
I seem to get the "capacity too large to represent" error no matter how large or small the bytestring list is, using easyList from Data.BloomFilter.Easy; I even tried using Words.hs from your examples and still got it, with input files ranging from 10 to 10,000 lines. With an error rate of 0.01, are there bounds on the number of words that can be handled? I am using your version 2.0.0.0.
To add to the above: I am using HP 2013.2, GHC 7.6.3, on Win7, with no compiler options. It works fine on FPComplete.
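For reference, a minimal repro along the lines of the report might look like this (the file name and the probe word are assumptions; the 0.01 false-positive rate is the one from the report):

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.BloomFilter.Easy as Bloom

-- Build a filter from one word per line and query it, as in the report.
-- On an affected 32-bit build this dies with "capacity too large to represent".
main :: IO ()
main = do
  contents <- B.readFile "words.txt"        -- input file name is an assumption
  let ws    = B.lines contents
      bloom = Bloom.easyList 0.01 ws        -- 1% error rate, as in the report
  print (B.pack "hello" `Bloom.elem` bloom)
```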
I suspect that bloomfilter 2.0.0.0 is broken on all 32-bit platforms.
I can confirm clinty’s hypothesis, looking at the build failures on Debian: https://buildd.debian.org/status/package.php?p=haskell-bloomfilter&suite=sid
Interestingly, the bit shift code in 2.0.0.0 has not changed from previous versions that didn't fail. It seems that the cause is this additional test in 2.0.0.0: roundedBits > 0xffffffff
Which is indeed the problem, because on 32 bit:
Prelude> maxBound :: Int
2147483647
Prelude> 0xffffffff
4294967295
Prelude> 0xffffffff :: Int
-1
So, removing that check seems like a reasonable workaround. I guess the fix would involve either using 0x7fffffff on 32 bit, or perhaps using Int64 for these calculations to avoid the 32 bit overflow.
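For illustration only (checkBits is a made-up name, not the package's actual function), the Int64 variant of that suggestion could look like the sketch below; at 64 bits the 0xffffffff literal keeps its intended value instead of wrapping to a negative Int:

```haskell
import Data.Int (Int64)

-- Hypothetical sketch: doing the size check in Int64 means 0xffffffff is
-- really 4294967295 here, so the comparison behaves the same on 32-bit
-- and 64-bit platforms.
checkBits :: Int64 -> Either String Int64
checkBits roundedBits
  | roundedBits <= 0         = Left "invalid filter size"
  | roundedBits > 0xffffffff = Left "capacity too large to represent"
  | otherwise                = Right roundedBits
```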
Thanks for copying me. I had just dropped back to 1.2.6.10 which worked for my purposes.
Hi,
Any ideas why 0xffffffff is used in the first place? Is this a valid maximum-sizing value to check against on 64-bit archs?
Is it wrong to use maxBound :: Int instead?
Regards, Dejan
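As a sketch of that alternative (the names are hypothetical, not the library's API), the guard could compare against whatever the platform's Int can actually hold, which works unchanged on 32-bit and 64-bit:

```haskell
-- Hypothetical alternative to the fixed 0xffffffff literal: do the
-- comparison at Integer precision against the platform's Int range.
fitsInInt :: Integer -> Bool
fitsInInt roundedBits = roundedBits <= toInteger (maxBound :: Int)
```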
Can this issue be closed? It appears that 44b01ba fixes it, no?
44b01ba is the commit from 2012 that introduced this bug. So, no, it can't be closed yet, AFAICS.
FWIW, Debian has patched bloomfilter by simply reverting 44b01ba. This seems to work on all architectures supported by Debian, both 32-bit and 64-bit.
Thank you very much for the clarification. I've added an appropriate revert in NixOS, too: https://github.com/NixOS/nixpkgs/commit/2b71e4643e33c427c26efc16b69765170b292cca.
I recently used stack to install bloomfilter on a 32-bit Ubuntu VM and, sure enough, hit the same error at run time; if we want stack to offer an 'it just works' environment, this should be fixed in Stackage.
Fixed this in 2.0.1.2