bids-validator
bids-validator copied to clipboard
Ignore files in _scans.tsv that correspond to entries in .bidsignore (#1366)
PR for handling #1366
Makes use of session storage to save the contents of .bidsignore when it is read originally. This is retrieved later while evaluating _scans.tsv files for ignoring the matching files. All tests with npm run test are passing (except two which were failing even on the unchanged master branch), and two new tests have been added.
For testing purposes, a mock session storage was required to be implemented as Jest is currently configured with "testEnvironment": "node". Could possibly be avoided if set to "testEnvironment": "jsdom". Another option is to employ jest-localstorage-mock package. But the current mock implementation seemed the simplest, and so went with it.
Two tests have been added:
-
should ignore files in scans.tsv that correspond to entries in .bidsignoreThis checks that files to be ignored, based on .bidsignore, are being ignored when evaluating the file list in _scans.tsv -
should not allow missing files listed in scans.tsv and not accounted for by .bidsignoreThis catches files that are listed in _scans.tsv but are actually missing, and not accounted for by .bidsignore
First PR here; apologies if I have missed anything. @effigies: Could you take a look?
In bidscoin, I have added extra lines of code to remove those bidsignore files from the scans.tsv file because the bids-validator was complaining about them. Since the specs don't mention anything about this (at least not that I'm aware of), I simply took the validator output as the definition of the standard. So does this mean that I should not have added those extra lines of code to bidscoin and should just ignore the validator here?
@marcelzwiers I would call this undefined behavior. The spec didn't say what to do, and bidsignore is not a spec-level concept, so the interaction turned out to be a problem.
As a practical matter, I think following the validator when it exceeds the specification is a good idea unless your goal is to force the issue and aim your users to complain to the validator issue tracker. I wouldn't consider that a bug, but you could stop removing the lines, once this or a similar solution is released.
Codecov Report
Attention: Patch coverage is 91.30435% with 2 lines in your changes are missing coverage. Please review.
Project coverage is 85.52%. Comparing base (
d626096) to head (6dacdea). Report is 12 commits behind head on master.
| Files | Patch % | Lines |
|---|---|---|
| bids-validator/utils/getSessionStorage.js | 87.50% | 2 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #1914 +/- ##
==========================================
+ Coverage 83.57% 85.52% +1.94%
==========================================
Files 92 132 +40
Lines 3890 6285 +2395
Branches 1271 1549 +278
==========================================
+ Hits 3251 5375 +2124
- Misses 541 806 +265
- Partials 98 104 +6
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
1: [ERR] Internal error. SOME VALIDATION STEPS MAY NOT HAVE OCCURRED (code: 0 - INTERNAL ERROR)
Evidence: ReferenceError: sessionStorage is not defined
at TSV (/home/runner/work/bids-validator/bids-validator/bids-validator/validators/tsv/tsv.js:617:23)
at /home/runner/work/bids-validator/bids-validator/bids-validator/validators/tsv/validate.js:33:9
I'm not sure if global storage is the right tool for this job. @rwblair @nellh Your insight would be appreciated here. I don't think I'm qualified to review this PR.
I did have my doubts about using sessionStorage for solving this problem, and it was infact a fallback approach to get the web app working. I believe the JS web app and the Python package might be using partly the same codebase (?), and if so this will not suffice. My first approach was to re-open the .bidsignore file when analyzing the _scans.tsv file, but this was problematic as .bidsignore is no longer accessible at that stage of the code (i.e. inside tsv.js), and couldn't force read owing to browser security related issues.
Looking forward to some feedback (@rwblair @nellh ) and suggestions on what would be a useful way to tackle this. Happy to try out a different approach.
Please ignore the Python package. It is only useful for packaging regular expressions for validating filenames. It will be replaced entirely in the future.
Just trying to debug the remaining errors with the current approach while waiting for further feedback.
Locally, I am testing the following (based on instructions here) before pushing the changes:
npm run testnpm run lint
But the errors that have been reported above (through the workflows) are never caught/tested via the above. Are there additional instructions for running the entire testing pipeline locally?
I have manually ran the following to ensure previous workflow errors do not arise again:
bids-validator/bin/bids-validator bids-validator/tests/data/valid_headers/ --ignoreNiftiHeaders../../../bin/bids-validator 7t_trt -c /home/shailesh/gits/bids-validator/bids-validator/tests/data/bids-examples/bidsconfig.json --ignoreNiftiHeaders
5 workflows await approval from maintainers to check if all is clear this time around.
In addition to the discussions above about the nature of the changes, I think the PR is passing most of the checks. 3 of them that it is failing, is found to have failed with similar status in a recent merged PR as well, so not sure if it's something specific to this PR that I should look into. I have made a small change to address the 4th failed test which dealt with code coverage; the current change should marginally improve that.
Could we run the remaining 5 workflows to see if they pass?
5 workflows awaiting approval
Also, happy to receive any feedback regarding the changes here.
As mentioned above, I believe it is now passing all the checks that a previous merged PR conformed to. So not sure if the 3 that failed, and 1 skipped, are something that I should look into.
I was wondering if we had any feedback on this PR.
I still think we want the following cases:
| File exists | File is valid | File is ignored | Result |
|---|---|---|---|
| T | T | * | Pass |
| T | F | T | Pass |
| T | F | F | Fail |
| F | * | * | Fail |
It seems like a regression to me to stop erroring when a user mentions a file that does not exist. The goal is to
I spent a little time trying to figure out how to pass ignored files around. I got this far:
❯ git --no-pager diff
diff --git a/bids-validator/validators/bids/fullTest.js b/bids-validator/validators/bids/fullTest.js
index edf0b405..efcd6204 100644
--- a/bids-validator/validators/bids/fullTest.js
+++ b/bids-validator/validators/bids/fullTest.js
@@ -76,8 +76,10 @@ const fullTest = (fileList, options, annexed, dir, schema, callback) => {
}
// remove ignored files from list:
+ const ignoredFiles = {}
Object.keys(fileList).forEach(function (key) {
if (fileList[key].ignore) {
+ ignoredFiles[key] = fileList[key]
delete fileList[key]
}
})
@@ -123,6 +125,7 @@ const fullTest = (fileList, options, annexed, dir, schema, callback) => {
participants,
phenotypeParticipants,
stimuli,
+ ignoredFiles,
)
})
.then(({ tsvIssues, participantsTsvContent }) => {
diff --git a/bids-validator/validators/tsv/tsv.js b/bids-validator/validators/tsv/tsv.js
index 0b76e6cd..153b62b7 100644
--- a/bids-validator/validators/tsv/tsv.js
+++ b/bids-validator/validators/tsv/tsv.js
@@ -36,7 +36,7 @@ const filenameEvidence = (filename) => `Filename: ${filename}`
* specification.
*/
-const TSV = (file, contents, fileList, callback) => {
+const TSV = (file, contents, fileList, ignoredFiles, callback) => {
const issues = []
const stimPaths = []
if (contents.includes('\r') && !contents.includes('\n')) {
@@ -628,7 +628,7 @@ const TSV = (file, contents, fileList, callback) => {
const scanFullPath = scanDirPath + '/' + scanRelativePath
// check if file should be ignored based on .bidsignore content
- if (ig && ig.ignores(path.relative('/', scanRelativePath))) {
+ if (ignoredFiles.includes(scanFullPath)) {
continue
}
diff --git a/bids-validator/validators/tsv/validate.js b/bids-validator/validators/tsv/validate.js
index dba18e7a..dfcd725c 100644
--- a/bids-validator/validators/tsv/validate.js
+++ b/bids-validator/validators/tsv/validate.js
@@ -9,6 +9,7 @@ const validate = (
participants,
phenotypeParticipants,
stimuli,
+ ignoredFiles,
annexed,
dir,
) => {
@@ -34,6 +35,7 @@ const validate = (
file,
contents,
fileList,
+ ignoredFiles,
function (tsvIssues, participantList, stimFiles) {
if (participantList) {
if (file.name.endsWith('participants.tsv')) {
Something's not working right, but I don't understand the validator enough to know what. Perhaps this will help you, though?