[ENH] provide RECOMMENDATION of the "Last, First" form for the `Authors` names
ATM there is no consistency in name format across datasets, e.g. on OpenNeuro:
ds006267/dataset_description.json:
Authors=['Katherine M. Cole', 'Shau-Ming Wei', 'Pedro E. Martinez', 'Tuong-Vi Nguyen', 'Michael D. Gregory', 'J. Shane Kippenhan', 'Philip D. Kohn', 'Steven J. Soldin', 'Lynnette K. Nieman', 'Jack A. Yanovski', 'Peter J. Schmidt', 'Karen F. Berman']
ds006269/dataset_description.json:
Authors=['Lucy Pritchard', 'Ingrid Buller-Peralta', 'Sally M Till', 'Peter C Kind', 'Alfredo Gonzalez-Sulser']
ds006303/dataset_description.json:
Authors=['Linke, Julia', 'Naim, Reut', 'Haller, Simone', 'Khosravi, Parmis', 'Scheinberg, Beck', 'Byrne, Meghan', 'Harrewijn, Anita', 'Leibenluft, Ellen', 'Brotman, Melissa', 'Winkler, Anderson', 'Pine, Daniel']
and that is why some are left ambiguous, like
ds003834/dataset_description.json:
Authors=['Matteo Visconti di Oleggio Castello', 'James V. Haxby', 'M. Ida Gobbini']
where for Matteo I believe there is a composite last name of "Visconti di Oleggio Castello" per e.g.
❯ curl --silent https://raw.githubusercontent.com/bids-standard/pybids/refs/heads/main/.zenodo.json | grep Matteo
"name": "Visconti di Oleggio Castello, Matteo",
but for the other 2 authors, only the last word is the family name.
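To make the ambiguity concrete, here is a minimal sketch (not part of the specification or the validator) of what a consumer of `Authors` has to do. The helper name `split_name` and the "last word is the family name" fallback are assumptions made purely for illustration:

```python
def split_name(name: str) -> tuple[str, str]:
    """Return (family, given) for one Authors entry.

    Hypothetical helper: "Last, First" splits unambiguously on the first
    comma, while "First Last" forces a guess (here: the last word is the
    family name), which is wrong for composite family names.
    """
    name = name.strip()
    if "," in name:
        family, given = name.split(",", 1)
        return family.strip(), given.strip()
    # Naive fallback: everything before the last space is the given name.
    given, _, family = name.rpartition(" ")
    return family, given


# Unambiguous with the comma form:
print(split_name("Visconti di Oleggio Castello, Matteo"))
# The guess works for a simple name but not for a composite family name:
print(split_name("James V. Haxby"))
print(split_name("Matteo Visconti di Oleggio Castello"))
```

With the comma form the composite family name survives intact; without it, the fallback returns "Castello" as the family name for Matteo.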
TODOs
- [ ] validation: add a check for validator to WARN about using `First Last`, in particular if any of the names has more than 2 components? @effigies do you see an easy way to do that?
- [ ] anywhere else in the text to add information about this?
- add a check for validator to WARN about using `First Last`, in particular if any of the names has more than 2 components? @effigies do you see an easy way to do that?
Regex?
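A rough sketch of what such a regex heuristic could look like, written in Python rather than in the schema's expression language. The pattern and the names `AMBIGUOUS_NAME` and `ambiguous_authors` are assumptions for illustration, not anything the validator currently provides:

```python
import re

# Assumed heuristic: an entry is "ambiguous" if it contains no comma and
# has three or more whitespace-separated components.
AMBIGUOUS_NAME = re.compile(r"^[^,\s]+(?:\s+[^,\s]+){2,}$")


def ambiguous_authors(authors: list[str]) -> list[str]:
    """Return the entries that would deserve a warning under this heuristic."""
    return [a for a in authors if AMBIGUOUS_NAME.match(a.strip())]


authors = [
    "Matteo Visconti di Oleggio Castello",
    "James V. Haxby",
    "Lucy Pritchard",
    "Linke, Julia",
]
print(ambiguous_authors(authors))
```

Note that this also flags names with a middle initial such as "James V. Haxby", since a pattern alone cannot tell an initial from part of a family name; that is acceptable for a WARN but would be too strict for an ERROR.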
Codecov Report
:white_check_mark: All modified and coverable lines are covered by tests.
:white_check_mark: Project coverage is 82.83%. Comparing base (373da35) to head (53e86b7).
Additional details and impacted files
@@ Coverage Diff @@
## master #2255 +/- ##
=======================================
Coverage 82.83% 82.83%
=======================================
Files 20 20
Lines 1672 1672
=======================================
Hits 1385 1385
Misses 287 287
I could support a warning for inconsistent comma usage: either every entry in Authors uses a comma or none does. I can't see a way to do this in the current schema; I think we would need to add the equivalent of any(...) in Python or some(...) in JS. And we could add all(...) while we are at it. We don't currently have the idea of lambdas in the expression language. Without adding them, I imagine that instead of passing a function as an argument, it would be a standalone expression language statement. This would then be applied to the context, with something added to the scope to represent the current element of the list.
Would anyone heed this warning given the current noise in the output?
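For reference, the consistency rule described above is short when written directly in Python with `any(...)` and `all(...)`. This is only a sketch of the intended semantics, not of how it would be expressed in the schema; the function name `comma_usage_consistent` is hypothetical:

```python
def comma_usage_consistent(authors: list[str]) -> bool:
    """True if every entry contains a comma or no entry does."""
    uses_comma = ["," in author for author in authors]
    return all(uses_comma) or not any(uses_comma)


# All "First Last": consistent, so no warning under this rule.
print(comma_usage_consistent(["James V. Haxby", "M. Ida Gobbini"]))
# All "Last, First": consistent.
print(comma_usage_consistent(["Linke, Julia", "Naim, Reut"]))
# Mixed: inconsistent, would trigger the warning.
print(comma_usage_consistent(["Linke, Julia", "James V. Haxby"]))
```

Under this rule a dataset that uses "First Last" throughout passes silently, so it catches mixed usage only and does not by itself enforce the recommended form.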
- add a check for validator to WARN about using `First Last`, in particular if any of the names has more than 2 components? @effigies do you see an easy way to do that?
Regex?
I must have been too tired! ;) The question now is "how". I thought the most logical approach would be to add "format", which I pushed, but that might be too restrictive, leading to ERRORs right away?
Otherwise, we need some custom rule which would use matches, and I guess that is what @rwblair refers to: us not having any way to map it across the values of a metadata field?
Ha -- so we are not testing against the "known to be ok" https://github.com/bids-standard/bids-examples/ which I assume I have broken here? @effigies WDYT -- wouldn't it be worth testing against some "release" (known to be good) of the bids-examples, thus preventing "regressions" (previously valid becomes invalid) in the specification?