bids-validator
Add Python bindings for bids-validator API
While thinking about ways to make the validator more accessible to Python projects, I took a quick look at how feasible it would be to add Python bindings to the existing project. It looks fairly simple to build a small bridge that runs V8 as a Python module, and there are several V8-in-Python modules to pick from. py-mini-racer looks like it can run the built bids-validator without changes, except for some problematic reliance on Node.js APIs that are not available in plain V8. We already strip these out for browsers, but some scaffolding would be needed to pass file trees built with the Python standard library into the validator in a sensible way.
Compared to the inverse of this (a Python interpreter running on wasm in V8), this approach may be preferable: a Python environment is more likely to tolerate a heavier dependency like V8 than a browser is to tolerate pyodide plus all of the Python dependencies, and it allows reuse of the existing validator code and tests.
Having the bids validator available in Python would be of great interest in many communities.
I have a question about scaffolding and external .js libraries. We have moved most of the HED validation into an external hed-javascript library on npm. I assume that by "problematic reliance on Node.js APIs" you don't mean standard npm libraries?
Could you also provide a link to additional docs on the mechanics of how this would work? The docs for py-mini-racer were somewhat minimal.
I have a question about scaffolding and external .js libraries. We have moved most of the HED validation into an external hed-javascript library on npm. I assume that by "problematic reliance on Node.js APIs" you don't mean standard npm libraries?
The main problem is the Node-specific fs module. It can be translated to other environments, but less easily than many of the other Node builtins. There is tooling that translates it for browsers and can clean up some of these calls automatically, but the simpler option is to isolate the fs calls in code that does not end up in the browser or Python builds.
It should not affect usage of most npm libraries, but if hed-javascript is reading files directly with fs, it may need to support an alternative that accepts something like FileTree instead, so those references can come from native Python.
For the other dependency limitations, I don't see a lot of issues besides fs. We are using esbuild to bundle bids-validator as one blob including all dependencies already and this runs fine in the browser where there are no Node.js APIs.
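As a rough illustration of the Python-side scaffolding, here is a minimal sketch that walks a dataset directory with the standard library and produces plain, JSON-serializable entries that a shim could hand to the validator in place of fs access. The function name `build_file_list` and the entry fields (`name`, `relativePath`, `size`) are assumptions made for this example, not an existing bids-validator or hed-javascript interface.

```python
import os
from pathlib import Path


def build_file_list(root):
    """Walk `root` and return FileList-like entries as plain dicts.

    The field names below are hypothetical; a real shim would use
    whatever shape the bundled validator expects.
    """
    root = Path(root)
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for filename in sorted(filenames):
            path = Path(dirpath) / filename
            entries.append(
                {
                    "name": filename,
                    # Path relative to the dataset root, POSIX style
                    "relativePath": "/" + path.relative_to(root).as_posix(),
                    "size": path.stat().st_size,
                }
            )
    return entries
```

Because the result is only lists, dicts, strings, and integers, it can be serialized with `json.dumps` and passed across the Python/V8 boundary without either side touching the other's filesystem APIs.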
Could you also provide a link to additional docs on the mechanics of how this would work? The docs for py-mini-racer were somewhat minimal.
https://blog.sqreen.com/embedding-javascript-into-python/ might be helpful.
The Python module would have a similar API to the JavaScript entrypoints. You could call the main BIDS validator with a file tree object representing your BIDS-ish dataset that the shim would translate to FileList-like objects and pass into V8, run the validator, and return a list of issues and a summary object. Some of the utility functions (formatting, single file rules) could also be exposed. If the validator itself crashed, you would get a V8 traceback exception.
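To make the shape of that API a little more concrete, here is a hedged sketch of what the Python entrypoint could look like. The JavaScript engine is abstracted behind an injected `call_js` callable so the example stays runnable without V8; with py-mini-racer this would presumably be the `call` method of a `MiniRacer` context after evaluating the bundle, but that wiring is an assumption and is not shown or tested here. The global function name `validateBIDS` and the JSON-string marshalling are likewise hypothetical.

```python
import json


def validate(file_list, call_js, options=None):
    """Run the bundled validator and return (issues, summary).

    `file_list` is a list of FileList-like dicts describing the dataset.
    `call_js(name, *args)` invokes a JavaScript global and returns its
    result; "validateBIDS" is a hypothetical name for the function the
    bundle would expose. Arguments and the result cross the boundary as
    JSON strings (an assumption for this sketch).
    """
    raw = call_js(
        "validateBIDS",
        json.dumps(file_list),
        json.dumps(options or {}),
    )
    # Any exception raised by the engine (for example a V8 traceback)
    # is deliberately left to propagate to the caller.
    result = json.loads(raw)
    return result["issues"], result["summary"]
```

Keeping the engine as a parameter also means the shim can be unit tested with a stub, independently of whichever V8-in-Python module is eventually chosen.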