packages / third-party modules
What's up?
Copy-paste from what @snth wrote on Discord
I think a lot of the kind of analytical query work that PRQL is targeted at happens in environments like Notebooks, DBeaver, DuckDB, dbt, Apache Superset, ... and that's why we've been targeting integrations with tools like that. My impression of the current module system proposal is that it very much assumes a file system structure like when you're compiling a Rust program. I think that might often not be the case when working with PRQL and I feel it would be quite important to still be able to "import" modules in those places.
Take working in DBeaver for example: I might have written a module that has a whole lot of convenience functions that do investment performance calculations or whatever, I want to be able to use those wherever I use PRQL, say even in the prql-playground for example.
So what I had envisioned was something more along the lines of require in JS (I don't really know the JavaScript ecosystem at all so I don't really know how that works) or install_github(...) in R. The compiler would then fetch the packages over the internet and make them available. There are obviously a couple of practical problems with that which would need to be overcome, but that would be the basic idea. Potential problems:
- Slowness - would it be possible to cache things? Could maybe write to ~/.config/prql but probably not possible in all environments.
- Security/authenticity - maybe there could be an optional second argument which would be a fingerprint or message digest which gets checked? e.g. something like
module('github.com/snth/utils', '<HMAC_DIGEST_OR_FINGERPRINT>')
from tbl
calc_returns close_price
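To make the digest idea concrete, here is a rough Python sketch of the check the compiler might run on a fetched module before using it. Everything in it is an assumption for illustration: the function name verify_module is hypothetical, the module body is a placeholder, and it uses a plain SHA-256 hex digest rather than an HMAC or signature, since the proposal leaves that choice open.

```python
import hashlib
import hmac


def verify_module(source: bytes, expected_sha256: str) -> bytes:
    """Return source unchanged if its SHA-256 hex digest matches, else raise."""
    actual = hashlib.sha256(source).hexdigest()
    # compare_digest does a constant-time comparison of the two hex strings.
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(
            f"digest mismatch: expected {expected_sha256}, got {actual}"
        )
    return source


# Placeholder bytes standing in for whatever github.com/snth/utils would serve.
module_source = b"# placeholder module body\n"

# The digest a user would paste as the second argument to module(...).
pinned = hashlib.sha256(module_source).hexdigest()

verify_module(module_source, pinned)  # passes silently

try:
    verify_module(module_source + b"tampered", pinned)
except ValueError as err:
    print("rejected:", err)
```

A content digest like this pins the exact bytes, so any upstream change to the module (malicious or not) fails the check until the user updates the digest.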
I think this is an excellent idea!
I think the caching approach would be fine. I'm also not sure how JS does it. We could have a refresh param to the compile function to force a refresh.
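As a sketch of how the cache plus a refresh param could fit together, assuming a hypothetical load_module helper, an injected fetch callable in place of real network access, and a cache directory passed in by the caller (none of this is existing compiler API):

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def load_module(url: str, fetch: Callable[[str], bytes],
                cache_dir: Path, refresh: bool = False) -> bytes:
    """Return module source for url, reading the on-disk cache unless refresh is set."""
    # Hash the URL so it is safe to use as a file name.
    cache_file = cache_dir / (hashlib.sha256(url.encode()).hexdigest() + ".prql")
    if cache_file.exists() and not refresh:
        return cache_file.read_bytes()
    source = fetch(url)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_bytes(source)
    return source


calls = []


def fake_fetch(url: str) -> bytes:
    """Stand-in for a network fetch; records each call."""
    calls.append(url)
    return b"# placeholder module body\n"


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "prql"
    load_module("github.com/snth/utils", fake_fetch, cache)  # fetches
    load_module("github.com/snth/utils", fake_fetch, cache)  # served from cache
    load_module("github.com/snth/utils", fake_fetch, cache, refresh=True)  # fetches again
    print(len(calls))  # 2
```

Taking the cache directory as a parameter leaves room for environments where ~/.config/prql is not writable; such a caller could pass a temporary directory or skip caching altogether.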
I'm a bit less worried about security with hashes etc. If this is important to folks, we could have an option for a git repo with a commit-ish, which could be a hash. Otherwise they can clone the files. (No strong view though)
One question: if we don't have use, can this import more than one file, e.g. a tree of files?