osm-read Why not store IDs as BigInt?

Once I obtain the records containing string ids (including the referenced nodes in the ways) I create new BigInt objects to replace their string representations.
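A minimal sketch of that post-processing step, for illustration only. The helper names are hypothetical, and the record shape (a string `id` on every record, plus a `nodeRefs` array of string ids on ways) is an assumption about what the parser hands back rather than something confirmed in this thread.

```javascript
// Hypothetical helpers: the field names `id` and `nodeRefs` are assumed.

// Return a copy of a node record with its string id converted to a BigInt.
function nodeIdsToBigInt(node) {
  return { ...node, id: BigInt(node.id) };
}

// Return a copy of a way record with its own id and every referenced
// node id converted to a BigInt.
function wayIdsToBigInt(way) {
  return {
    ...way,
    id: BigInt(way.id),
    nodeRefs: way.nodeRefs.map((ref) => BigInt(ref)),
  };
}

// Example records using made-up ids.
const node = nodeIdsToBigInt({ id: '1709246789', lat: 51.5, lon: -0.1 });
const way = wayIdsToBigInt({
  id: '158788812',
  nodeRefs: ['1709246789', '1709246746'],
  tags: {},
});

console.log(typeof node.id, way.id, way.nodeRefs);
```

The copies leave the original records untouched, so code that still expects string ids keeps working alongside code that wants BigInt values.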

Has any consideration been made of parsing them into BigInt values within osm-read?

Perhaps it would be a useful option to have if it were not to be done by default. Making it an option would avoid breaking changes for those who expect string values.

Nov 27 '22 23:11 metabench

I assume you create the BigInt by invoking it using the string id? For example: BigInt(id)

If this is the case, I'm not sure adding this behavior to osm-read as a feature flag is worth the effort. People who need the id in a number representation can easily do the conversion themselves.

Are there any more benefits of parsing the id within osm-read which I have missed @metabench ?

Nov 28 '22 10:11 marook

The earlier it's represented as BigInt, the less time strings longer than 8 bytes need to be stored. It's not a big efficiency difference.

Getting the data from osm-read in the most appropriate type is the largest advantage as far as I can tell. It would make programming against it easier and maybe a bit more performant.

Nov 28 '22 14:11 metabench

There would likely be less processing to do between the data that's stored in the protobuf and having usable output if it were parsed as BigInt. I don't know whether or not there is anything in the osm-read codebase that would make it difficult to do, such as relying on a schema or dependency which already parses them into strings.

Nov 29 '22 00:11 metabench

Looking at various TODOs such as https://github.com/marook/osm-read/blob/411aba24bc0e413d29d60e0249453c11ff1b8a52/lib/pbfParser.js#L335

There is no problem with integers of the size we get in OSM PBF files, such as for high node IDs. File positions beyond 2^32 are also fine.

"The Number.MAX_SAFE_INTEGER constant represents the maximum safe integer in JavaScript (2^53 – 1)." - MDN Web Docs.

It's worth noting that the bits beyond the low 32 are lost when doing bitwise operations such as '>>>', because those operators convert their operands to 32-bit integers first.
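A small illustration of both points, using a made-up value above 2^32. It is only a sketch of standard JavaScript number behavior and is not taken from the osm-read code.

```javascript
// A value above 2^32 but far below Number.MAX_SAFE_INTEGER (made-up example).
const id = 2 ** 40 + 5;

// Plain Numbers represent integers of this size exactly.
const isSafe = Number.isSafeInteger(id);
const maxSafeMatches = Number.MAX_SAFE_INTEGER === 2 ** 53 - 1;

// Bitwise operators first truncate the operand to 32 bits,
// so only the low 32 bits survive.
const truncated = id >>> 0;

// Arithmetic avoids the truncation...
const shiftedWithMath = Math.floor(id / 256);

// ...and so do BigInt shifts, which are not limited to 32 bits.
const shiftedWithBigInt = BigInt(id) >> 8n;

console.log(isSafe, maxSafeMatches, truncated, shiftedWithMath, shiftedWithBigInt);
```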

When representing these numbers in a TypedArray, a 64-bit integer type (BigInt64Array or BigUint64Array) should be used. Signed or unsigned will work, but I go for unsigned when I am only supporting unsigned numbers.
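As a rough sketch of the TypedArray approach, with made-up ids: the elements of `BigUint64Array` are BigInt values, so ids have to be converted before they are stored.

```javascript
// Fixed-size storage for three unsigned 64-bit ids (example values are made up).
const ids = new BigUint64Array(3);

ids[0] = 10000000000n;                    // BigInt literal
ids[1] = BigInt('1709246789');            // from a string id
ids[2] = BigInt(Number.MAX_SAFE_INTEGER); // from a Number

// Assigning a plain Number is rejected with a TypeError.
let numberRejected = false;
try {
  ids[0] = 5;
} catch (err) {
  numberRejected = err instanceof TypeError;
}

// Converting back to a Number is exact while the value stays
// within Number.MAX_SAFE_INTEGER.
const asNumber = Number(ids[1]);

console.log(ids, numberRejected, asNumber);
```

Each element takes 8 bytes, which is the storage saving over long id strings mentioned earlier in the thread.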

Dec 31 '22 14:12 metabench
