bids-validator icon indicating copy to clipboard operation
bids-validator copied to clipboard

Report unescaped backslashes in JSON sidecar files

Open effigies opened this issue 2 years ago • 6 comments

We've found a dataset on OpenNeuro with unescaped backslashes in their bold.json files. The dataset validates, but attempting to read the files with jq or Python's json.load results in errors.

Here's an example file that should fail:

{"Field": "String with \ character"}

xref https://github.com/poldracklab/tacc-openneuro/issues/51

effigies avatar Dec 23 '22 19:12 effigies

Something has changed since that dataset validated on OpenNeuro. Now we crash the HED validator:

$ bids-validator --ignoreSubjectConsistency /data/bids/ds003459/
[email protected]
(node:49452) Warning: Closing directory handle on garbage collection
(Use `node --trace-warnings ...` to show where the warning was created)
	1: [ERR] Internal error. SOME VALIDATION STEPS MAY NOT HAVE OCCURRED (code: 0 - INTERNAL ERROR)
		Evidence: TypeError: Cannot convert undefined or null to object
    at Function.entries (<anonymous>)
    at BidsSidecar.filterHedStrings (/usr/lib/node_modules/bids-validator/dist/commonjs/cli.js:135498:43)
    at new BidsSidecar (/usr/lib/node_modules/bids-validator/dist/commonjs/cli.js:135494:18)
    at /usr/lib/node_modules/bids-validator/dist/commonjs/cli.js:153338:12
    at Array.map (<anonymous>)
    at constructSidecarData (/usr/lib/node_modules/bids-validator/dist/commonjs/cli.js:153337:30)
    at checkHedStrings (/usr/lib/node_modules/bids-validator/dist/commonjs/cli.js:153303:23)
    at Object.events_default (/usr/lib/node_modules/bids-validator/dist/commonjs/cli.js:153380:10)
    at /usr/lib/node_modules/bids-validator/dist/commonjs/cli.js:154256:28

	Please visit https://neurostars.org/search?q=INTERNAL ERROR for existing conversations about this issue.

	1: [WARN] Not all subjects/sessions/runs have the same scanning parameters. (code: 39 - INCONSISTENT_PARAMETERS)
		./sub-203/func/sub-203_task-visortho_acq-a_bold.nii.gz
		./sub-263/func/sub-263_task-audphono_acq-b_bold.nii.gz
		./sub-269/func/sub-269_task-visortho_acq-a_bold.nii.gz

	Please visit https://neurostars.org/search?q=INCONSISTENT_PARAMETERS for existing conversations about this issue.

        Summary:                 Available Tasks:                      Available Modalities: 
        527 Files, 5.91GB        Auditory Phonology Judgement          MRI                   
        34 - Subjects            Auditory Orthography Judgement                              
        1 - Session              Auditory Semantic Judgement                                 
                                 Auditory Syntax Judgement                                   


	If you have any questions, please post on https://neurostars.org/tags/bids.

Reproduced on web validator as well. Bisecting back, it switched from passing to failing in 1.8.5, but the failure is in HED, not the validator, and it is not informative.

effigies avatar Dec 26 '22 20:12 effigies

We'll take a look... @happy5214 do you have any thoughts?

VisLab avatar Dec 26 '22 22:12 VisLab

It appears to be attempting to create a sidecar object with non-existent data. I'm wondering why https://github.com/bids-standard/bids-validator/blob/40206be37d9db2114ed16e8891625f65a1aab449/bids-validator/validators/events/hed.js#L75 is using a union rather than an intersection - maybe that'll fix it?

happy5214 avatar Dec 26 '22 22:12 happy5214

@happy5214 That definitely resolves the HED error, so I've submitted #1574. I'm not sure how to write a good test here...

That said, this also brings us back to the point where the unescaped backslashes don't cause errors, so this issue isn't fully closed by that fix.

effigies avatar Dec 27 '22 17:12 effigies

@happy5214 That definitely resolves the HED error, so I've submitted #1574. I'm not sure how to write a good test here...

For the intersection? The problem is that only the main function gets exported, and the only thing it returns (in the form of a Promise) is an issue list. Ideally, you'd check the BidsDataset object and its fields, but that never sees the light of day outside of that main function. My suggestion would be to add a fake "sidecar" that isn't a potential location of any event files in the dataset, include invalid HED data in it, and assert that it still passes validation.

happy5214 avatar Dec 27 '22 17:12 happy5214