pynwb icon indicating copy to clipboard operation
pynwb copied to clipboard

Make NWBFile identifier optional

Open rly opened this issue 5 years ago • 7 comments

2) Feature Request

As per our discussion at the hackathon re: session_id and identifier of NWBFile, identifier should no longer be required. session_id should be highly encouraged, but in order to maintain backwards compatibility, I think we should not make it required.

rly avatar May 22 '19 19:05 rly

Is the idea to auto-generate identifier when it is not provided?

bendichter avatar May 28 '19 21:05 bendichter

An NWBFile (and all other neurodata_types) now have practically unique object IDs, which can serve the purpose of identifying an NWBFile and its contents*. As such, I think we can make identifier optional now and not auto-generate it.

rly avatar Aug 08 '19 19:08 rly

Making identifier optional is fine. I think we may want to update the doc for the identifier field as well. Currently the doc discusses an example on how to construct the identifier. Instead, I think we should just say that is should be unique, and use DOI as an example.

oruebel avatar Aug 08 '19 19:08 oruebel

Note that making identifier optional in the API will break backwards compatibility of the API. i.e. calls such as:

nwbfile = NWBFile('description, 'id', start_time)

will no longer work. Because identifier will have a default value and session_start_time does not, the positional arguments of __init__ will be parsed as: session_description, session_start_time and then identifier (if I just make identifier the first keyword argument).

To help users keep their existing code working at least for this basic example, I propose another change: rename NWBFile to Session (a change we were thinking of making anyway) and keep NWBFile as an alias to Session with the old argument ordering. What do you think?

rly avatar Aug 09 '19 16:08 rly

@rly I'm more concerned about backwards compatibility of data than conversion scripts. Would there be any change to the schema? If we rename NWBFile to Session I'm pretty sure we'll cause issues for data that is already written. If not, then there will be a pretty big difference between the API and the schema.

bendichter avatar Aug 09 '19 21:08 bendichter

Hm. I see your point. Yes, it would be pretty confusing to have the API and schema refer to different names for the NWBFile/Session. Ok, never mind.

rly avatar Aug 09 '19 21:08 rly

Proposed solution from @oruebel @ajtritt @bendichter : change the schema to make identifier optional, and do not change the API until the next major version of the NWB schema & PyNWB. For now, in PyNWB, if identifier is empty string, do not populate the identifier dataset.

rly avatar Nov 16 '19 00:11 rly