Log files are not valid JSON format
It is a small issue, but annoying nonetheless. When I try to commit a JSON log file, it gets flagged by the json5 pre-commit hook:
```
check json5..............................................................Failed
- hook id: check-json5
- exit code: 1

log.json: Failed to json decode (<string>:2 Unexpected "{" at column 1)
```
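The hook fails because the logger writes one JSON object per line (the JSON Lines convention), so after two steps the file looks something like this (values made up for illustration):

```json
{"target": 0.5, "params": {"x": 1.0}, "datetime": {"datetime": "2023-01-01 12:00:00", "elapsed": 0.0, "delta": 0.0}}
{"target": 0.7, "params": {"x": 2.0}, "datetime": {"datetime": "2023-01-01 12:00:05", "elapsed": 5.0, "delta": 5.0}}
```

Each line parses on its own, but the file as a whole is not a single valid JSON document, which is exactly what the hook reports at line 2.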
Possible solutions:
1. Read the original file, update it, and write the whole updated JSON back.
2. Do not use the .json extension, as stated in the documentation, but for example .txt instead.
3. A hacky way would be to write a `[` on first opening the file, start each subsequent write with a `,`, and write a `]` after the last entry (see the sketch after this list).
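A rough sketch of what option 3 could look like, as a hypothetical helper (the class name and API are made up, not part of the library):

```python
import json

class JSONArrayLogger:
    """Sketch of the bracket approach: keep the file a JSON array by
    writing '[' before the first entry, ',' before each later entry,
    and ']' once at the end."""

    def __init__(self, path):
        self._path = path
        self._first = True

    def log(self, data):
        with open(self._path, "a") as f:
            f.write("[" if self._first else ",")
            f.write(json.dumps(data))
        self._first = False

    def close(self):
        # Must run even on premature termination, e.g. in a try/finally.
        with open(self._path, "a") as f:
            f.write("]")
```

The file is only valid JSON after close() has run, which is part of what makes this approach fragile.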
I came across this issue as well when I needed to read the history file and do some plots or analysis. I think a simple solution for the meantime is to parse the file line by line, e.g. as described here: https://pynative.com/python-parse-multiple-json-objects-from-file/
```python
import json
import os

def load_history_line_by_line(directory, file):
    """Load a log file that contains one JSON object per line."""
    data = []
    filename = os.path.join(directory, file)
    with open(filename, "r") as j:
        for line in j:
            data.append(json.loads(line))
    return data
```
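For example (hypothetical directory and file names), extracting the optimization targets for plotting:

```python
# "target" is the objective value bayes_opt stores for each step.
history = load_history_line_by_line("./logs", "log.json")
targets = [entry["target"] for entry in history]
```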
I think the main issue here is that `JSONLogger` simply `json.dumps` each individual result into the history file, one per line:
```python
class JSONLogger(_Tracker):
    def __init__(self, path):
        self._path = path if path[-5:] == ".json" else path + ".json"
        try:
            os.remove(self._path)
        except OSError:
            pass
        super(JSONLogger, self).__init__()

    def update(self, event, instance):
        if event == Events.OPTIMIZATION_STEP:
            data = dict(instance.res[-1])

            now, time_elapsed, time_delta = self._time_metrics()
            data["datetime"] = {
                "datetime": now,
                "elapsed": time_elapsed,
                "delta": time_delta,
            }

            with open(self._path, "a") as f:
                f.write(json.dumps(data) + "\n")

        self._update_tracker(event, instance)
```
To fix this, `update` would have to be rewritten to first read the existing history.json file, add the new entry, and write the whole file back. Apparently you need to load a JSON file before you can append to it (https://stackoverflow.com/questions/45869039/appending-data-to-json-file-with-python).
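A minimal sketch of that read-then-rewrite approach, as a hypothetical standalone helper (not the library's API):

```python
import json
import os

def append_entry(path, entry):
    # Load the existing array (if any), append the new entry, and
    # rewrite the whole file so it is always a valid JSON array.
    history = []
    if os.path.exists(path):
        with open(path, "r") as f:
            history = json.load(f)
    history.append(entry)
    with open(path, "w") as f:
        json.dump(history, f)
```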
Hmm, I see. So basically each new entry is a valid JSON document, but the aggregate of these is not. That is annoying (but it does explain why a while ago I couldn't get JSON highlighting to work properly!). Of the possible solutions suggested:
1. I think this can work, but it would have to be in a `finally` statement, because it is not uncommon for the optimizer to terminate prematurely.
2. I'd be fine with this personally, and it's certainly the least amount of work.
3. I don't think I like this one; there just seem to be a lot of ways for it to go wrong.
A fourth option:

4. Store the entire history in memory (it might actually already be stored?), and at each update JSON-dump the entire history instead of just the current entry. I think this would ensure valid JSON output at all times while still enabling 'real time' logging (a rough sketch follows below). I haven't looked at how much work it would be, though.
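For illustration, a minimal sketch of option 4, assuming the same module context as the `JSONLogger` excerpt above (i.e. `_Tracker` and `Events` are available; the class name is made up):

```python
import json

class InMemoryJSONLogger(_Tracker):
    """Keep the full history in memory and rewrite the whole file on
    every step, so the file on disk is always a valid JSON array."""

    def __init__(self, path):
        self._path = path if path.endswith(".json") else path + ".json"
        self._history = []
        super().__init__()

    def update(self, event, instance):
        if event == Events.OPTIMIZATION_STEP:
            data = dict(instance.res[-1])

            now, time_elapsed, time_delta = self._time_metrics()
            data["datetime"] = {
                "datetime": now,
                "elapsed": time_elapsed,
                "delta": time_delta,
            }

            self._history.append(data)
            with open(self._path, "w") as f:
                json.dump(self._history, f)

        self._update_tracker(event, instance)
```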
All things considered, my vote would be for 2, followed by 4.
Option 2 sounds good. Maybe .log would be a good extension?