
Alternative to hashing uuid generation strategy

Open s4ke opened this issue 1 year ago • 0 comments

Currently we only support the rather wasteful pattern of generating canonical ids for DataPoints using this logic:

https://github.com/neuroforgede/nfcompose/blob/main/skipper/skipper/dataseries/storage/uuid.py

def _gen_uuid(data_series_id: Union[uuid.UUID, str], external_id: str) -> str:
    computed_id = hashlib.sha256(external_id.encode('UTF-8')).hexdigest()
    return f'{str(data_series_id)}-{str(computed_id)}'

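This is rather wasteful: with a UUID as the data series id, every id is 101 characters long (a 36-character UUID, a hyphen, and a 64-character SHA-256 hex digest), no matter how short the external id is. Changing this without coordination will cause issues, though: uniqueness inside a DataSeries depends on a consistent implementation of this logic. We could, however, add a configuration setting to the DataSeries that allows different uuid strategies to be used, e.g. hash functions with shorter digests or just the identity function.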
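
A minimal sketch of what such a per-DataSeries setting could dispatch to. Nothing here exists in nfcompose today: the `UUID_STRATEGIES` registry, the strategy names, the `strategy` parameter, and the `gen_uuid` wrapper are all hypothetical. Only the `sha256` entry mirrors the current `_gen_uuid` behaviour.

```python
import hashlib
import uuid
from typing import Callable, Dict, Union


def _sha256_hex(external_id: str) -> str:
    # Same digest as the current implementation: 64 hex characters.
    return hashlib.sha256(external_id.encode('UTF-8')).hexdigest()


def _blake2b_128_hex(external_id: str) -> str:
    # Example of a shorter digest: 16 bytes, i.e. 32 hex characters.
    return hashlib.blake2b(external_id.encode('UTF-8'), digest_size=16).hexdigest()


def _identity(external_id: str) -> str:
    # Uses the external id unchanged; the id length then depends on the caller.
    return external_id


# Hypothetical registry, keyed by the value a DataSeries setting might store.
UUID_STRATEGIES: Dict[str, Callable[[str], str]] = {
    'sha256': _sha256_hex,
    'blake2b-128': _blake2b_128_hex,
    'identity': _identity,
}


def gen_uuid(
    data_series_id: Union[uuid.UUID, str],
    external_id: str,
    strategy: str = 'sha256',  # default keeps existing ids stable
) -> str:
    computed_id = UUID_STRATEGIES[strategy](external_id)
    return f'{str(data_series_id)}-{computed_id}'
```

Defaulting to `sha256` would keep ids of existing DataSeries unchanged. The strategy would presumably have to be fixed when a DataSeries is created, since switching it later would change the canonical id of every existing DataPoint.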

While this is not an urgent issue, this is something that could be useful.

s4ke, May 14 '23 18:05