Proposal: `measurementLists` Group as alternative to many `measurementList1`...`measurementList2` indexed groups
It has been highlighted by #103 that the Indexed Group, described as
Each element of the sub-group is uniquely identified by appending a string-formatted index (starting from 1, with no preceding zeros) in the name, for example, /.../name1 denotes the first sub-group of data element name, and /.../name2 denotes the 2nd element, and so on.
is a wildly inefficient way to structure an HDF5 file.
This draft adds an alternative encoding of the measurementList.
There are a few known issues at this point, such as defining a character to be NaN at the index of channels which lack a particular value.
I am not in love with "measurementLists" btw...
@sstucker thank you for tackling this, the proposed changes capture the proposed changes. Is some more general opening explanatory text required to explain the two options?
Given that it's > 1 year for this, should we assume it's not relevant anymore?
Issue #103 is still a problem and this solution is a sane way to handle it. I will return to this before the end of the year.
This is an old issue but it remains pertinent and it would be nice to get this merged.
Other than resolving the review comments, what needs to be done (beyond merging) to make sure that this is properly supported in, e.g, the validator?
@sstucker thank you for any help, much appreciated!
Could you clarify where the NaN value situation may arise? What required fields do you envisaged being omitted in practice?
@samuelpowell I do hope we can make good progress when we meet in 10 days. It would be great to resolve this soon. I am also now hitting the same issue you described way back in https://github.com/fNIRS/snirf/issues/103#issue-1166649354.
@sstucker thank you for any help, much appreciated!
Could you clarify where the NaN value situation may arise? What required fields do you envisaged being omitted in practice?
There is no rule enforcing that any of the optional fields must be present for all channels in the list. I don't know why this would occur, but it can, strictly speaking.
Following discussion in May 24 meeting, it was decided that this feature should be implemented as described.
The issue regarding missing fields, which can arise when e.g. both processed and raw data are stored together was also discussed. The conclusion here is that a permissive approach is reasonable - when a data field is not required by a data type (e.g. a wavelength index for a chromphore) the value, if present, is ignored. This will be discussed further in #119.
@sstucker will this work as implemented, e.g., will the validator be able to parse **Presence**: required if measurementLists is not present, or is additional work neccesary?
@samuelpowell It will require some work from me on the validator ahead of the next release, but I can do it. Let's get the ball rolling towards the release of the feature by merging this.
Okay, I've addressed one of David's comments, merged master and updated the spell check.
Before proceeding, what do you think of @dboas question "for all the places in this chunk of code where we say "required", do we need to clarify that this is "required if measurementLists is utilized?"
Further to meeting of 10/07 it was determined that it would be good to include the text "required if measurementLists is present" in order to ensure the spec is literal, however the implementation overhead should be considered.
@sreekanthkura7 to check the validator to determine
- if there is a required field nested underneath an optional group, is its presence still enforced in the absence of the group, and,
- what is the complexity of supporting the condition "required if measurementLists is present" (note that in this case the constraint can be ignored if the validator does not enforce required fields under absent optional groups)
I modified the spec to indicate that required fields of measurementLists are required if measurementLists is present
Sreekanth and I discussed the validator changes needed. He will work with Stephen on the validator once the pull request is merged.
I'd like to get #151 merged first, then we can remove the same fields in this PR too. If anyone able to review the same we can make progress.
@sreekanthkura7 , #151 has been merged. Can you remove the same fields from this PR as Sam requested in the comment above?
@dboas , It looks like I don't have permission to edit this. Can someone please do this or give me the access to do?
@sreekanthkura7 you should now be able to push commits to this PR
@samuelpowell The fields that were removed in #151 have now been removed in this PR as well
Thanks @sreekanthkura7 I will re-review by Monday
I think there was some confusion here @sreekanthkura7, the request from myself / @dboas was not to remove the exact same fields as in #151, as these would be removed by merging master into this branch. It was instead to remove the associated fields under measurementLists which no longer existed in measurementList(k). Have added a review which I hope is clear.
@sreekanthkura7 I just noticed that the table of contents needs to be updated to link to measurementLists and all the sub-fields
@samuelpowell @dboas The changes discussed above have been implemented. Additionally, I updated the measurementLists in the summary table. However, I am uncertain if the type column for the measurementLists was updated correctly. Please review and let me know if any further adjustments are needed.
@sstucker, now that this PR is merged, do you still have time to assist with the validator? I'm available to work with you on this as well.
@sreekanthkura7 Yes, I can start a draft release of the validator. I will want to align the validator to the entire draft SNIRF release, so it would be good to get a changelog and draft release notes for the next SNIRF online soon.