cmlkit
tools for machine learning in condensed matter physics and quantum chemistry
`test_hash_stable` in `tests/test_dataset.py` started failing once `numpy` is upgraded to some version above `1.16.5`. Most likely this is due to subtly different treatment of `object` arrays. Luckily, they should become obsolete once...
This is a minor concern, especially if Dataset is being phased out. When creating a Dataset with only single-atom geometries, `dists` is a list of empty arrays and `min_distance` and...
When a Dataset is created with no name but saved with a filename, references to that filename are ignored by certain other Components like tune.Run using a tune.TuneEvaluatorHoldout that was...
Need to rethink this. Currently, anything that is a numpy array gets dumped as ugly binary junk.
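One possible direction, as a minimal sketch rather than what cmlkit actually does: convert array-like values to plain lists before handing the data to the serializer, so the output stays human-readable. The helper name `to_plain` and the example `record` are hypothetical. The sketch relies only on the `.tolist()` method, which numpy arrays and numpy scalars provide; the standard library's `array.array` is used as a stand-in so the example runs without numpy installed.

```python
import json
from array import array  # stdlib stand-in for any array type with .tolist()


def to_plain(obj):
    """Recursively replace array-like values with plain Python lists.

    Hypothetical helper for illustration; not part of cmlkit.
    """
    if isinstance(obj, dict):
        return {key: to_plain(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_plain(item) for item in obj]
    if hasattr(obj, "tolist"):
        # numpy arrays and scalars expose .tolist(), as does array.array
        return to_plain(obj.tolist())
    return obj


# Made-up example data; a numpy array could take the place of array(...)
record = {
    "name": "example",
    "z": array("i", [1, 6, 8]),
    "nested": {"r": array("d", [0.5, 1.5])},
}

text = json.dumps(to_plain(record), sort_keys=True)
print(text)
```

The trade-off is that dtype and shape information is not preserved: loading the result gives nested lists, so arrays would have to be rebuilt explicitly where they matter.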
JSON is comically faster. Here is a benchmark for dumping/loading 2000 small dicts (3 repeats):
```
JSON: {'times': array([0.38685818, 0.34857985, 0.34886317]), 'mean': 0.36143373133333334, 'min': 0.3485798470000001, 'max': 0.3868581750000001}
Yaml: {'times': array([...
```
Title says it all. Upsides:
- Once you have run `prepare()`, restore and run will always do (approximately) the same
- If you encounter a `tape.son` in the wild, the information...