
Key/Value Serialization

Open dieterichlawson opened this issue 1 year ago • 3 comments

I recently began using Equinox's serialization utilities to store model checkpoints. My current understanding of that code is that it writes all the non-static pytree fields to a file in a fixed traversal order.

While this works, I have found that this approach can be troublesome when the code is changing. For example, I often have multiple versions of the code that are basically identical, except that a different set of fields is marked as static to prevent them from being trained. In this case it's difficult to load checkpoints from other versions of the code because the traversal of the tree is different.
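To make the mismatch concrete, here is a minimal sketch. It is not Equinox's code: plain dicts stand in for two hypothetical versions of a model, and dropping `scale` from the second dict imitates marking that field as static so it no longer appears as a leaf.

```python
import jax
import jax.numpy as jnp

# Hypothetical version 1: all three fields are array leaves.
params_v1 = {
    "bias": jnp.zeros(3),
    "scale": jnp.ones(3),
    "weight": jnp.ones((3, 3)),
}

# Hypothetical version 2: "scale" is treated as static, so it is not a leaf.
params_v2 = {
    "bias": jnp.zeros(3),
    "weight": jnp.ones((3, 3)),
}

# A positional format only sees these flat leaf lists, not the field names.
leaves_v1 = jax.tree_util.tree_leaves(params_v1)
leaves_v2 = jax.tree_util.tree_leaves(params_v2)

shapes_v1 = [leaf.shape for leaf in leaves_v1]
shapes_v2 = [leaf.shape for leaf in leaves_v2]

print(shapes_v1)
print(shapes_v2)
```

The second leaf of version 1 is `scale`, while the second leaf of version 2 is `weight`, so reading a version 1 file positionally into version 2 would line up the wrong arrays.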

A closely related issue is that this essentially requires you to construct a pytree with the same structure as the checkpoint before loading it. Often I only want to load portions of a checkpoint, for example when the model has multiple interchangeable components.

More generally, I feel like it would be nice to have the serialized fields stored in a more 'inspectable' way, such as a key/value store. That way, if you need to load old checkpoints, you at least have a clue what the fields are and how to load them, without having to dig through your git history and try to resurrect the specific pytree that serialized that checkpoint.
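A rough sketch of what I have in mind, using only `jax.tree_util` and NumPy's `.npz` format rather than anything in Equinox. The helper names `save_kv` and `load_kv` are hypothetical, and so is the choice of `jax.tree_util.keystr` to build the keys. Shapes and dtypes are not checked here.

```python
import os
import tempfile

import jax
import jax.numpy as jnp
import numpy as np


def save_kv(path, tree):
    """Store each leaf in an .npz archive under a key built from its pytree path."""
    leaves_with_paths, _ = jax.tree_util.tree_flatten_with_path(tree)
    arrays = {
        jax.tree_util.keystr(key_path): np.asarray(leaf)
        for key_path, leaf in leaves_with_paths
    }
    np.savez(path, **arrays)


def load_kv(path, like):
    """Fill `like` with stored values where keys match; keep its own leaves otherwise."""
    leaves_with_paths, treedef = jax.tree_util.tree_flatten_with_path(like)
    new_leaves = []
    with np.load(path) as stored:
        available = set(stored.files)
        for key_path, leaf in leaves_with_paths:
            key = jax.tree_util.keystr(key_path)
            if key in available:
                new_leaves.append(jnp.asarray(stored[key]))
            else:
                new_leaves.append(leaf)
    return jax.tree_util.tree_unflatten(treedef, new_leaves)


# Save a checkpoint with two interchangeable components.
checkpoint = {
    "encoder": {"w": jnp.ones((2, 2))},
    "decoder": {"w": jnp.full((2, 2), 3.0)},
}
path = os.path.join(tempfile.mkdtemp(), "ckpt.npz")
save_kv(path, checkpoint)

# The file can be inspected without rebuilding the original pytree.
with np.load(path) as stored:
    stored_keys = sorted(stored.files)
print(stored_keys)

# A different model that shares only the encoder with the checkpoint.
other = {
    "encoder": {"w": jnp.zeros((2, 2))},
    "head": {"b": jnp.zeros(2)},
}
restored = load_kv(path, other)
```

Because leaves are matched by name rather than by position, the encoder weights load into `other` even though the rest of the tree differs, and the unmatched `head` field keeps its initial value.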

Any thoughts?

dieterichlawson avatar Jun 05 '23 20:06 dieterichlawson