pynwb
pynwb copied to clipboard
Generalize Validation
fix #1431 replaces #1494 by branching from dev instead of the modularity improvement
Motivation
WIP synchronizing the behavior of the CLI pynwb.validate
function with the library-imported one, namely for the purposes of validating against cached namespaces (but also just to have identical behavior between library and CLI use cases).
How to test the behavior?
Need to...
- [x] Thoroughly test that the CLI usage behaves as it used to
- [x] The cached namespace feature is indeed available to the new library method as well
- [x] Ensure there is no break in back-combability when using the previous methods
Checklist
- [x] Did you update CHANGELOG.md with your changes?
- [X] Have you checked our Contributing document?
- [X] Have you ensured the PR clearly describes the problem and the solution?
- [X] Is your contribution compliant with our coding style? This can be checked running
flake8
from the source directory. - [X] Have you checked to ensure that there aren't other open Pull Requests for the same change? - a splinter/extension of
enh/getcachednamespaces
- [X] Have you included the relevant issue number using "Fix #XXX" notation where XXX is the issue number? By including "Fix #XXX" you allow GitHub to close issue #XXX when the PR is merged.
Ah, looks like you are still working on this. Happy to help or review it when ready.
@rly @oruebel This is ready to be looked at...
Main issue I'm having right now is getting the non-required argument parsing through docval to work.
Traceback (most recent call last):
File "C:\Users\Raven\.conda\envs\pynwb\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Raven\.conda\envs\pynwb\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "d:\github\pynwb\src\pynwb\validate.py", line 223, in <module>
validate_cli()
File "d:\github\pynwb\src\pynwb\validate.py", line 213, in validate_cli
validation_errors, validation_status = validate(
File "C:\Users\Raven\.conda\envs\pynwb\lib\site-packages\hdmf\utils.py", line 648, in func_call
pargs = _check_args(args, kwargs)
File "C:\Users\Raven\.conda\envs\pynwb\lib\site-packages\hdmf\utils.py", line 637, in _check_args
raise ExceptionType(msg)
TypeError: validate: incorrect type for 'namespace' (got 'NoneType', expected 'str or NoneType')
but likewise if I remove the type: (..., type(None))
to mimic the pattern of https://github.com/NeurodataWithoutBorders/pynwb/blob/dev/src/pynwb/image.py#L38-L43 I still get
Traceback (most recent call last):
File "C:\Users\Raven\.conda\envs\pynwb\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Raven\.conda\envs\pynwb\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "d:\github\pynwb\src\pynwb\validate.py", line 223, in <module>
validate_cli()
File "d:\github\pynwb\src\pynwb\validate.py", line 213, in validate_cli
validation_errors, validation_status = validate(
File "C:\Users\Raven\.conda\envs\pynwb\lib\site-packages\hdmf\utils.py", line 648, in func_call
pargs = _check_args(args, kwargs)
File "C:\Users\Raven\.conda\envs\pynwb\lib\site-packages\hdmf\utils.py", line 637, in _check_args
raise ExceptionType(msg)
TypeError: validate: incorrect type for 'namespace' (got 'NoneType', expected 'str')
Main issue I'm having right now is getting the non-required argument parsing through docval to work.
Never mind, I just had to set the CLI parser default value to resolve this.
@rly @oruebel Now I'm trying to debug these failing validation CLI tests, but when I locally run the tests it gives
.... Massive stack trace....
FileNotFoundError: [Errno 2] No such file or directory: 'coverage'
How do I obtain this 'coverage' folder with what I presume contains NWB files I need to test against?
I fixed the tests. This looks good to me! Thanks, @CodyCBakerPhD ! Let me know if you think it is good to merge.
@rly Thanks a bunch! I would maybe like to triple check that the API call is 100% compatible with the NWB Inspector (since that also optionally performs validation before the best practices check) which could also clean up a bit of code in the dandi-cli. I'll look into this early next week and let you know
Had to make one small adjustment after testing a thorough integration with the NWBInspector (https://github.com/NeurodataWithoutBorders/nwbinspector/pull/265): https://github.com/NeurodataWithoutBorders/pynwb/pull/1511/commits/be4c65e5ff79dbefe937970a778bff21bb80dc66
When testing validation on a file that had an extension namespace cached within it, that extension itself extended core and so validation against cached namespaces would end up in the condition `namespace 'core' is contained in '
@CodyCBakerPhD I was writing a test for the cached extension condition but I was confused about the last commit.
Currently with this branch, if I have a file with a cached extension that extends core, and I run:
python -m pynwb.validate tests/back_compat/2.1.0_nwbfile_with_extension.nwb --ns core
Then it will validate with the highest cached namespaces and not just "core". I think this is misleading. Previously, this would raise an error saying that "core" is included in namespace "x" and that we should validate against "x" instead.
However, I note that
python -m pynwb.validate tests/back_compat/2.1.0_nwbfile_with_extension.nwb --ns core
python -m pynwb.validate tests/back_compat/2.1.0_nwbfile_with_extension.nwb
are equivalent. I think the no argument condition should validate against all cached namespaces. So if there is an extension that extends "core", it would validate against that and not "core" which is included in the extension. This would be a breaking change.
What do you think?
I think the no argument condition should validate against all cached namespaces. So if there is an extension that extends "core", it would validate against that and not "core" which is included in the extension. This would be a breaking change.
What do you think?
I agree, that does make the most sense for a default argument. Would you be OK with a change like that? I was trying to avoid changing anything major like that.
It's a pretty major breaking change in the behavior of the validate function and CLI. For now, I suggest adding a new flag "--all" to validate against all cached namespaces, reverting your previous change, and using "--all" in the inspector. Then, in a separate PR we can prepare making "--all" the default and changing the name of the file to validation.py.
@rly OK it was a false alert - the source of the issue was actually that as of a few commits ago I had unintentionally changed the default structure of the namespaces
argument which was including the 'core' namespace even though the logic on dev
handles this by only including 'core' if there is nothing else cached.
Reverting to mirror more of the behavior currently on dev
resolved the issue without needing that specific skip logic for core.
As far as including a flag for all
, I think I'll leave that for the future if it's decided that it's still desired/needed, currently I'd say the logic of the lines https://github.com/NeurodataWithoutBorders/pynwb/blob/generalize_validate_from_dev/src/pynwb/validate.py#L66-L68 is sufficient to account for overlapping namespaces that aren't 'core'.
Also thanks for adding that test example, it perfectly replicates what I was seeing on my own test file with that structure.
I determined the 'correct' behavior for the cases - that is, the CLI works as it does on dev (returns the 'core' is contained in 'ndx-...' use that instead), using the API validate with paths
+verbose=True
does the same thing the CLI does (the new intended behavior for this PR) but the API validate with the previous io
+namspace
usage behaves as it used to for maintaining back-compatibility (it explicitly performs the validation against the specified namespace without complaining).
Codecov Report
Merging #1511 (06ca152) into dev (6e1f131) will increase coverage by
0.68%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## dev #1511 +/- ##
==========================================
+ Coverage 90.65% 91.34% +0.68%
==========================================
Files 25 25
Lines 2494 2507 +13
Branches 466 471 +5
==========================================
+ Hits 2261 2290 +29
+ Misses 148 137 -11
+ Partials 85 80 -5
Flag | Coverage Δ | |
---|---|---|
integration | 70.56% <100.00%> (+0.95%) |
:arrow_up: |
unit | 84.44% <24.71%> (+0.28%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
Impacted Files | Coverage Δ | |
---|---|---|
src/pynwb/__init__.py | 76.15% <100.00%> (-1.22%) |
:arrow_down: |
src/pynwb/validate.py | 100.00% <100.00%> (+21.05%) |
:arrow_up: |
:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more
Coverage on the new validate.py
is now 100%. Woot!
Thanks @CodyCBakerPhD and @rly for all the work on this PR to clean up validation!