`include` other prql files & module system
Currently, it seems we only have one `stdlib.prql`, which is included whenever any other PRQL file is parsed. Maybe we could turn this "include" into a feature,
so that I can define custom functions in a separate PRQL file:
func day_of_week col -> s' ......'
func date_trunc col unit -> s'.....'
Then, in the main PRQL file, I can include that file first and use those functions:
prql include:datetime_util
from table1
derive[ a = day_of_week(datetime_col)]
We could also support a library search path per dialect, like:
- current folder
-- datetime_util.prql (default one)
-- mysql folder
--- datetime_util.prql (override for MySQL)
Good point.
As we will also add database schema definitions in PRQL (#381), one would probably want to keep them in separate files and include them in each of the query files.
But this would require some kind of "module" system with hierarchical namespaces. Again (as with many things), I like Rust's approach to modules:
- each file is a module,
- each directory with `mod.rs` is a module,
- any file can also contain inner modules by using the `mod { ... }` keyword.
Each module can then have annotations (for example `#[cfg(test)]`) that include the module only in the test configuration. We could replace such annotations with:
# default impl
func day_of_week col -> s' ......'
mod dialect:mysql {
# overloaded impl
func day_of_week col -> s' ......'
}
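To make the intended override behaviour concrete, here is a minimal Python sketch of how definitions from a `mod dialect:<name>` block might shadow the defaults. Everything in it is an assumption for illustration: `functions_for_dialect` is a hypothetical helper, the dictionaries stand in for whatever the parser would produce, and the function bodies are placeholders. It is not how the PRQL compiler works today.

```python
from typing import Dict, Optional

# Hypothetical parsed view of one file: function name -> function body.
Functions = Dict[str, str]


def functions_for_dialect(
    default_funcs: Functions,
    dialect_mods: Dict[str, Functions],
    dialect: Optional[str],
) -> Functions:
    """Start from the default definitions, then let the chosen dialect override them."""
    resolved = dict(default_funcs)  # copy so the defaults stay untouched
    if dialect is not None:
        # Functions declared in `mod dialect:<dialect> { ... }` shadow the defaults.
        resolved.update(dialect_mods.get(dialect, {}))
    return resolved


# Placeholder bodies; the real s-strings are elided in the proposal above.
defaults = {"day_of_week": "<default impl>", "date_trunc": "<default impl>"}
mods = {"mysql": {"day_of_week": "<mysql impl>"}}

mysql_funcs = functions_for_dialect(defaults, mods, "mysql")
generic_funcs = functions_for_dialect(defaults, mods, None)
```

With this shape, a dialect that has no matching `mod` block simply falls back to the default implementations.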
What about filename-based dialecting, so that the primary implementation doesn't need to be altered when new dialects are added? Essentially something like how React Native handles it: it allows platform-dependent code through filename patterns.
A directory structure like this might be nice?
[root]
|-[my-lib]
| |-my-lib.prql # Default implementation
| |-my-lib.postgres.prql # PostgreSQL version
| |-my-lib.mysql.prql # MySQL version
|-[other-lib]
| |
And then allow an import that can use folders to resolve the modules:
prql dialect:postgres
import my-lib
....
Module resolution would then search that folder and, given that the dialect is already chosen, give precedence to the "my-lib.postgres.prql" file based on the suffix in the filename.
Since the dialect options are enumerations anyway, they seem like the proper suffix.
And as a note, filename suffixes may not even be needed if a directory-based structure is used: one could just have my-lib/postgres.prql and my-lib/generic.prql if wanted.
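A rough Python sketch of that resolution order, assuming the `[root]/[my-lib]/my-lib.<dialect>.prql` layout shown above. The function name `resolve_module` and its error handling are hypothetical; nothing like this exists in the compiler yet.

```python
from pathlib import Path
from typing import Optional


def resolve_module(root: Path, name: str, dialect: Optional[str] = None) -> Path:
    """Pick the file to load for `import <name>`, preferring the dialect-specific one."""
    lib_dir = root / name
    candidates = []
    if dialect is not None:
        # e.g. my-lib/my-lib.postgres.prql takes precedence when a dialect is set
        candidates.append(lib_dir / f"{name}.{dialect}.prql")
    # e.g. my-lib/my-lib.prql is the default implementation
    candidates.append(lib_dir / f"{name}.prql")

    for candidate in candidates:
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no PRQL module named {name!r} under {root}")
```

For the directory-based variant, only the candidate list would change, for example to `lib_dir / f"{dialect}.prql"` followed by `lib_dir / "generic.prql"`.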
@chris-pikul I think that's a great idea!
Is there a specific implementation plan and timetable for this?
We don't have a specific plan, but very open to contributions towards this.
(I would like to get more people involved in development and have been thinking of ways of making the current codebase more approachable, so possibly this is a good case)
I quite like @chris-pikul's proposal. I think a smaller version could be built non-invasively over the current compiler with something that collects the files and basically concatenates them together. That could run from the CLI without even needing an import statement; it could be managed from the original command.
Then it would be a modest change to add the `import my-lib` functionality (though it would require some intermediate Rust work, since that can't live in the wasm target, which doesn't support files).
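To illustrate the "collect and concatenate" idea, here is a minimal Python sketch of a pre-processing step that could run before the compiler. The helper name `concat_sources` and the comment banners are assumptions, not existing CLI behaviour. The sketch also ignores one likely wrinkle: a query header such as `prql dialect:postgres` in the main file would probably need to be hoisted to the top of the combined source.

```python
from pathlib import Path
from typing import Iterable


def concat_sources(library_files: Iterable[Path], main_file: Path) -> str:
    """Join the library files, then the main query, into one PRQL source string."""
    parts = []
    for path in [*library_files, main_file]:
        # A `#` comment banner marks where each file starts in the combined source.
        parts.append(f"# --- {path.name} ---\n{path.read_text().rstrip()}\n")
    return "\n".join(parts)
```

The combined string could then be handed to the existing compiler unchanged, which is what keeps this approach non-invasive.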
Somewhat relatedly, I've also done lots of work on the dbt integration (https://github.com/prql/dbt-prql/), which is a very viable way of building bigger projects of queries (more so than functions though)[^1].
If anyone is interested in exploring this, I & others are very happy to help; hit us up here or on Discord!
[^1]: One note: it currently only works for databases that use backticks for identifiers (e.g. BigQuery), something that I've been trying to engineer around and have been discussing with the folks from dbt.
Ref #2129 #2567 #2570