
bigint-count

Open MarcusGDaniels opened this issue 5 years ago • 1 comments

Confused by the meaning of this:

https://docs.omnisci.com/installation-and-configuration/config-parameters

bigint-count [=arg] Use 64-bit count. Disabled by default because 64-bit integer atomics are slow on GPUs. Enable this setting if you see negative values for a count, indicating overflow. In addition, if your data set has more than 4 billion records, you likely need to enable this setting.

The documentation for the BIGINT type says it is 8 bytes. Does this mean the storage format is actually 4 bytes unless this flag is set? Do I need to reload my data with this set?

Marcus

MarcusGDaniels, Mar 23 '21 16:03

No need to reload. bigint-count only makes count(*) and similar aggregates use a 64-bit integer; it does not affect how BIGINT columns are stored. The default is to use a 32-bit integer in the output slot, as most use cases will not exceed 32 bits and the atomics are much faster. But if you see overflows, you need to turn bigint-count on.
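To illustrate what an overflow in a 32-bit count slot looks like, here is a minimal Python sketch. It is plain integer arithmetic, not HeavyDB code: `wrap_int32` is a hypothetical helper, and the sketch assumes the slot behaves like a signed two's-complement 32-bit integer, which is consistent with the negative counts mentioned in the docs.

```python
def wrap_int32(n: int) -> int:
    """Reinterpret n modulo 2**32 as a signed 32-bit integer (assumed slot behavior)."""
    n &= 0xFFFFFFFF
    return n - (1 << 32) if n >= (1 << 31) else n


# A count that fits in 32 bits is reported unchanged.
print(wrap_int32(2_000_000_000))  # 2000000000

# Past 2**31 - 1 the value wraps and shows up as a negative count.
print(wrap_int32(3_000_000_000))  # -1294967296

# Past 2**32 the value wraps back to a positive, but still wrong, number.
print(wrap_int32(5_000_000_000))  # 705032704
```

The last case suggests why the docs also mention data sets with more than 4 billion records: under this assumption, a wrapped count is not always negative, so the sign alone is not a reliable overflow signal.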

Doing this automatically likely wouldn't be too difficult, either by detecting the overflow and starting over (or storing the heuristic), or by using an upfront heuristic to estimate group size based on the number of groups.
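The two ideas above could be sketched roughly as follows. This is only an illustration in Python, not how HeavyDB works internally: `choose_count_width`, `count_with_retry`, and the `run_query` callable are all hypothetical names. The upfront check shown here is also a simpler, more conservative variant than the group-size estimate described above: it relies only on the fact that no group can contain more rows than the whole table.

```python
from typing import Callable, List

INT32_MAX = 2**31 - 1


def choose_count_width(total_rows: int) -> int:
    """Upfront check: a 32-bit count is safe if the table itself fits in 32 bits."""
    return 32 if total_rows <= INT32_MAX else 64


def count_with_retry(run_query: Callable[[int], List[int]]) -> List[int]:
    """Run with 32-bit counts first; rerun with 64-bit counts if any count is negative.

    run_query is a hypothetical callable that takes the count width (32 or 64)
    and returns the per-group counts.
    """
    counts = run_query(32)
    if any(c < 0 for c in counts):
        # A negative count signals overflow, so start over with 64-bit counts.
        counts = run_query(64)
    return counts
```

Note that the retry path only catches overflows that surface as negative values, so in practice it would probably be combined with an upfront check like the one above.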

alexbaden, Mar 23 '21 16:03