bigjson
Byte position in error messages is wrong for files larger than the buffer size
https://github.com/henu/bigjson/blob/b562d7be1e8de689cfaf44fdca7a636a8d21ca20/bigjson/filereader.py#L124
The error messages that help locate errors in JSON files are quite handy. However, I ran into problems deciphering the reported byte position.
I might be wrong, but I put together this small test example:
import bigjson as json

# First run: default read buffer size
with open("test.json", 'rb') as f:
    try:
        j = json.load(f)["Jobs"]
    except Exception as e:
        print(e)

# Second run: shrink the read buffer so the file spans several chunks
json.FileReader._READBUF_CHUNK_SIZE = 10
with open("test.json", 'rb') as f:
    try:
        j = json.load(f)["Jobs"]
    except Exception as e:
        print(e)
Running this on a faulty JSON file resulted in two different byte positions.
Here is a snippet of the JSON file:
{
    "Jobs":[
        {
            "WallTeff":49.88
        },
        {
            "WallTlimit":inf
        },
    ]
}
This is what I get when running the test code on the faulty file:
Unexpected bytes! Value '}' Position 48
Unexpected bytes! Value '}' Position 3
I am not sure how to fix this in bigjson, but I would have expected both errors to show the same byte location. I think it is because the error messages do not use _tell_read_pos, but I am not fully sure.
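To illustrate what I suspect is happening, here is a minimal sketch of a chunked reader. It is not bigjson's code: ChunkedReader, find_byte, buf_start and buf_pos are hypothetical names I made up for the example. It only shows that an index into the current buffer matches the file offset while everything fits in one chunk, and starts over at each refill unless the offset of the buffer start is added.

```python
import io


class ChunkedReader:
    """Toy chunked reader (hypothetical, NOT bigjson's FileReader)."""

    def __init__(self, f, chunk_size):
        self.f = f
        self.chunk_size = chunk_size
        self.buf = b""
        self.buf_pos = 0    # index of the next byte inside self.buf
        self.buf_start = 0  # file offset of self.buf[0]

    def read_byte(self):
        if self.buf_pos >= len(self.buf):
            # Refill: remember how many bytes the old buffer covered.
            self.buf_start += len(self.buf)
            self.buf = self.f.read(self.chunk_size)
            self.buf_pos = 0
            if not self.buf:
                return None  # end of file
        b = self.buf[self.buf_pos:self.buf_pos + 1]
        self.buf_pos += 1
        return b

    def tell(self):
        # Absolute file offset of the next byte to be read.
        return self.buf_start + self.buf_pos


def find_byte(data, target, chunk_size):
    """Return (buffer-relative, absolute) position of the first `target` byte."""
    reader = ChunkedReader(io.BytesIO(data), chunk_size)
    while True:
        b = reader.read_byte()
        if b is None:
            return None
        if b == target:
            # Both values refer to the byte that was just read.
            return reader.buf_pos - 1, reader.tell() - 1


data = b'{"a": [1, 2, }'
print(find_byte(data, b"}", 1024))  # one chunk: both positions agree
print(find_byte(data, b"}", 10))    # two chunks: relative position restarts
```

If the error messages were built from something like tell() here instead of the in-buffer index, I would expect both runs to report the same position. Whether _tell_read_pos plays that role in bigjson is my assumption.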