Byte position in error messages is wrong for files larger than the buffer size

Open zerothi opened this issue 10 months ago • 0 comments

https://github.com/henu/bigjson/blob/b562d7be1e8de689cfaf44fdca7a636a8d21ca20/bigjson/filereader.py#L124

The error messages are quite handy for locating problems in erroneous JSON files. However, I have trouble deciphering the actual byte position they report.

I might be wrong, but I put together this small test example:

import bigjson as json

# First run: default read-buffer chunk size.
with open("test.json", 'rb') as f:
    try:
        j = json.load(f)["Jobs"]
    except Exception as e:
        print(e)

# Second run: shrink the chunk size so the file is larger than one buffer.
json.FileReader._READBUF_CHUNK_SIZE = 10
with open("test.json", 'rb') as f:
    try:
        j = json.load(f)["Jobs"]
    except Exception as e:
        print(e)

Running this on a faulty JSON file resulted in two different byte positions for the same error.

Here is a snippet of the JSON file:

{
"Jobs":[
{
"WallTeff":49.88
},
{
"WallTlimit":inf
},
]
}

Running the test script on this file prints:

Unexpected bytes! Value '}' Position 48
Unexpected bytes! Value '}' Position 3
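
As a cross-check, the standard library's json module can report where it first fails on the snippet above. This is only a sketch: it assumes the file contains exactly the snippet as shown (newline-separated, no indentation, ASCII only, so character offsets equal byte offsets), and `first_error_offset` is a helper name made up for this example.

```python
import json

# The snippet from above, reproduced byte for byte (assumption: no
# indentation and "\n" line endings, as shown in this issue).
snippet = (
    '{\n'
    '"Jobs":[\n'
    '{\n'
    '"WallTeff":49.88\n'
    '},\n'
    '{\n'
    '"WallTlimit":inf\n'
    '},\n'
    ']\n'
    '}\n'
)


def first_error_offset(text):
    """Return the character offset of the first JSON error, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err.pos
    return None


offset = first_error_offset(snippet)
# For ASCII text the character offset is also the byte offset.
print(offset, repr(snippet[offset]))
```

For this exact string the offset lands on the `i` of `inf`, at 48, which agrees with the first reported position and suggests that the second one (3) is the misleading one.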

I am not sure how to fix bigjson, but I would have expected both errors to show the same byte location, regardless of the chunk size. I think it has to do with the error messages not using _tell_read_pos, but I am not fully sure.
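
To illustrate what I mean, here is a minimal sketch of a chunked reader that keeps the absolute file offset of its buffer start, so a reported position does not depend on the chunk size. This is not bigjson's actual code: `ChunkedReader`, `buf_start`, `buf_read` and `locate` are hypothetical names, and the real FileReader may manage its buffer differently.

```python
import io


class ChunkedReader:
    """Hypothetical chunked reader (not bigjson's FileReader)."""

    def __init__(self, stream, chunk_size):
        self.stream = stream
        self.chunk_size = chunk_size
        self.buf = b""
        self.buf_start = 0  # absolute file offset of buf[0]
        self.buf_read = 0   # index of the next unread byte within buf

    def tell(self):
        # Absolute position = where the buffer starts + how far we are in it.
        return self.buf_start + self.buf_read

    def read_byte(self):
        if self.buf_read >= len(self.buf):
            # Buffer exhausted: remember how many bytes are dropped,
            # then load the next chunk.
            self.buf_start += len(self.buf)
            self.buf = self.stream.read(self.chunk_size)
            self.buf_read = 0
            if not self.buf:
                return None  # end of file
        byte = self.buf[self.buf_read:self.buf_read + 1]
        self.buf_read += 1
        return byte


def locate(data, target, chunk_size):
    """Return (absolute, buffer-relative) position of the first target byte."""
    reader = ChunkedReader(io.BytesIO(data), chunk_size)
    while (byte := reader.read_byte()) is not None:
        if byte == target:
            return reader.tell() - 1, reader.buf_read - 1
    return None


data = b'{"a": 1, "b": inf}'
print(locate(data, b"i", 1024))  # one buffer holds the whole input
print(locate(data, b"i", 10))    # input spans two buffers
```

In this sketch the absolute position is the same for both chunk sizes, while the buffer-relative index changes as soon as the input no longer fits in one buffer. That is the kind of discrepancy I see in the two error messages above.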

zerothi, Apr 30 '24 08:04