librascal
Rascal json encoder
Because ASE changed its JSON structure format in ~3.16, and because the format we use is so simple that we never wrote a dedicated function to produce it, one cannot simply use ASE to convert files to JSON. (The ASE reader, on the other hand, is backward compatible.)
I used the ASE JSON encoder as the basis for a Rascal encoder: https://gitlab.com/ase/ase/blob/master/ase/io/jsonio.py#L25-27
I could clean this up a bit and add it to the Python utils. Please give a thumbs up if you think this is good, or comment if you think we should solve the problem in a different way.
import datetime
import json

import ase.io
import numpy as np

# Read the first two frames and place each one in a large cubic cell.
frames = ase.io.read('/home/alexgo/datasets/methane.extxyz', ':2')
for frame in frames:
    frame.cell = [50, 50, 50]
    frame.center()


class RascalEncoder(json.JSONEncoder):
    """JSON encoder based on the one in ase/io/jsonio.py."""

    def default(self, obj):
        if hasattr(obj, 'todict'):
            d = obj.todict()
            if not isinstance(d, dict):
                raise RuntimeError('todict() of {} returned object of type {} '
                                   'but should have returned dict'
                                   .format(obj, type(d)))
            if hasattr(obj, 'ase_objtype'):
                d['__ase_objtype__'] = obj.ase_objtype
            return d
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.bool_):
            return bool(obj)
        if isinstance(obj, datetime.datetime):
            return {'__datetime__': obj.isoformat()}
        if isinstance(obj, complex):
            return {'__complex__': (obj.real, obj.imag)}
        return json.JSONEncoder.default(self, obj)


# Round-trip each frame through JSON so it becomes a plain dict of native types.
json_frames = {}
for i, frame in enumerate(frames):
    json_frames[str(i)] = json.loads(json.dumps(frame, cls=RascalEncoder))
json_frames['ids'] = list(range(len(frames)))
json_frames['nextid'] = len(frames)

with open('/home/alexgo/datasets/methane_test.json', 'w') as f:
    json.dump(json_frames, f, indent=2)
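To make the resulting file layout easier to see, here is a small sketch that builds the same top-level structure (one entry per frame keyed by its index as a string, plus 'ids' and 'nextid') and reads the frames back. It uses only the standard library. The helper names wrap_frames and unwrap_frames and the toy frame contents are made up for illustration; they are not part of librascal or ASE.

```python
import json


def wrap_frames(frame_dicts):
    """Hypothetical helper: build the {"0": ..., "ids": ..., "nextid": ...} layout."""
    wrapped = {str(i): frame for i, frame in enumerate(frame_dicts)}
    wrapped["ids"] = list(range(len(frame_dicts)))
    wrapped["nextid"] = len(frame_dicts)
    return wrapped


def unwrap_frames(wrapped):
    """Hypothetical helper: return the frame dicts in the order given by "ids"."""
    return [wrapped[str(i)] for i in wrapped["ids"]]


# Made-up toy frames standing in for the dicts produced by the encoder.
toy_frames = [
    {"numbers": [6, 1], "positions": [[0.0, 0.0, 0.0], [0.0, 0.0, 1.1]]},
    {"numbers": [1], "positions": [[25.0, 25.0, 25.0]]},
]

text = json.dumps(wrap_frames(toy_frames), indent=2)
restored = unwrap_frames(json.loads(text))
```

Reading back through 'ids' rather than iterating over all keys keeps the bookkeeping entries out of the frame list.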
While we are doing this, I think there's a more transparent way to encode NumPy arrays, if I'm not mistaken. I find this tolist() a bit burdensome.
I am not sure what issues arise with tolist()?
Well, it's an additional conversion that people need to do. I was wondering whether we could use a custom encoder, similar to the second answer here: https://stackoverflow.com/questions/26646362/numpy-array-is-not-json-serializable
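A minimal sketch of that idea, assuming a duck-typed check on tolist() is acceptable: the conversion still happens, but inside the encoder, so callers can pass arrays directly to json.dumps. ToListEncoder is a hypothetical name, and array.array from the standard library stands in for a NumPy array so the snippet runs without NumPy installed; np.ndarray exposes a tolist() method as well.

```python
import json
from array import array


class ToListEncoder(json.JSONEncoder):
    """Hypothetical encoder: converts anything that has a tolist() method."""

    def default(self, obj):
        # json only calls default() for objects it cannot serialize natively.
        tolist = getattr(obj, "tolist", None)
        if callable(tolist):
            return tolist()
        return super().default(obj)  # raises TypeError for unsupported types


# array.array is a stand-in here; an np.ndarray would take the same branch.
frame = {"numbers": array("i", [6, 1, 1, 1, 1]), "cell": [50, 50, 50]}
encoded = json.dumps(frame, cls=ToListEncoder)
```

With this in place the call site no longer mentions tolist() at all, which seems to be the transparency being asked for.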
The first four answers all use tolist():
        if isinstance(obj, np.ndarray):
            return obj.tolist()
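One likely reason the answers converge on tolist() is that it converts the elements to built-in Python types, whereas a plain list() keeps NumPy scalars that the json module rejects. The sketch below illustrates this with a made-up array of atomic numbers; it assumes NumPy is installed.

```python
import json

import numpy as np

numbers = np.array([6, 1, 1, 1, 1])

# list() keeps NumPy integer scalars as elements, which json cannot serialize.
try:
    json.dumps(list(numbers))
    list_ok = True
except TypeError:
    list_ok = False

# tolist() converts recursively to built-in Python types.
as_list = numbers.tolist()
encoded = json.dumps(as_list)
```

So whichever encoder is chosen, some call equivalent to tolist() has to happen before the data reaches json.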