Code duplication around tensorstore_spec logic in orbax-checkpoint
Hi Orbax team,
I was looking at Orbax code at the latest version==0.7.0 and found that pieces of code with quite heavy logic around tensorstore_spec creation seem to contain duplicates.
I'd like to know if this code duplication intended by design or I am welcome to submit a PR.
Here get_tensorstore_spec is a part of public API, and I can't find any usage of get_tensorstore_spec by orbax-checkpoint itself
https://github.com/google/orbax/blob/8b4e90d573082a5c7caa5f99c51db376f62a6995/checkpoint/orbax/checkpoint/serialization.py#L97C5-L124
And here is a very similar piece of code in build_kvstore_tspec in _internal package, and build_kvstore_tspec is used heavily by type_handlers.py
https://github.com/google/orbax/blob/8b4e90d573082a5c7caa5f99c51db376f62a6995/checkpoint/orbax/checkpoint/_src/serialization/tensorstore_utils.py#L62-L135
Would you consider get_tensorstore_spec to reuse build_kvstore_tspec under the hood?
Also, there seems to be a bit of obscurity with default ts_context value.
- In
orbax/checkpoint/serialization.py, there is TS_CONTEXT in publicserialization.pythat is used as a default value ofcontextin async_serialize (actually,orbaxdoes not useasync_serializeanywhere and recommends usingasync_serialize_shards) , async_serialize_shards, async_deserialize and by StringHandler. - At the same time, in
type_handlers.py, there is get_ts_context() (it references_DEFAULT_OCDBT_TS_CONTEXT), andget_ts_contextis used by all other handler implementations.
So, TS_CONTEXT from serialization.py seems to be never used by common checkpoint IO code.
Should we somehow leave only 1 source of truth for default ts_context values?
This code is currently in the process of a rework by @dicentra13. get_tensorstore_spec in serialization.py should be either removed or should reuse build_kvstore_tspec. I think it is only used in a couple places in internal code - just due slower progress in refactoring rather than any inherent need.
Again, async_serialize is used in one place internally that I'm working on eliminating, although probably will leave that function in place as a wrapper.
Agreed that ts_context should use one source of truth, that could use fixing.
Got it @cpgaffney1 , thank for the reply! Waiting for the refactoring by @dicentra13 :)
Let's close this issue once refactoring is completed and merged