prql icon indicating copy to clipboard operation
prql copied to clipboard

Design of `translator.rs`

Open max-sixty opened this issue 1 year ago • 2 comments

We're starting to write more dialect-specific code in translator.rs; e.g. https://github.com/prql/prql/pull/858

What's the best design for this?

  • Currently we have a mix of methods on Dialect structs https://github.com/prql/prql/blob/a1919f5504974878a1f8c763cb43cb5cac4f7d7f/prql-compiler/src/ast/dialect.rs#L90-L97 and match expressions in translate_* functions https://github.com/prql/prql/blob/a1919f5504974878a1f8c763cb43cb5cac4f7d7f/prql-compiler/src/sql/translator.rs#L801-L802. If we continue with this, what should go in each?
  • We're sometimes using bare functions like https://github.com/prql/prql/blob/a1919f5504974878a1f8c763cb43cb5cac4f7d7f/prql-compiler/src/sql/translator.rs#L699 and sometimes using From impls https://github.com/prql/prql/blob/a1919f5504974878a1f8c763cb43cb5cac4f7d7f/prql-compiler/src/sql/translator.rs#L901
    • rust-analyzer has upcoming improved support for From, so it'll be less of an issue to track down the specific impl, which I've found inconvenient historically
  • Should we do something like ast_fold.rs, where each Dialect "inherits" from a generic dialect, using this "Fold" pattern to generate a sql AST from constituent parts? This kinda emulates inheritance in Rust.
    • Note that these functions have different input and output types, so it might be a slightly different design
  • I added a question https://github.com/prql/prql/blob/a1919f5504974878a1f8c763cb43cb5cac4f7d7f/prql-compiler/src/ast/dialect.rs#L29-L31 on whether we had the right design for Dialects. We're swapping between them at the moment; e.g. https://github.com/prql/prql/blob/a1919f5504974878a1f8c763cb43cb5cac4f7d7f/prql-compiler/src/sql/translator.rs#L798

Anything else?

max-sixty avatar Jul 27 '22 19:07 max-sixty

I very much agree that dialect specifics should not be part of translator.rs, but reside in dialects.rs or even their own module.

Should we do something like ast_fold.rs, where each Dialect "inherits" from a generic dialect

That's what I intended DialectHandler for. But I didn't think we would need to move functions such as translate_func_call into there. Do you think that's necessary?

We're sometimes using bare functions like ... and sometimes using From impls ... rust-analyzer has upcoming improved support for From ...

Yeah, I dislike From impls for that exact reason - it's hard to find the correct impl. I converted a few From impls into bare functions because they became dialect dependent. Is there a way to remove this Dialect argument? A global static variable?

I added a question on whether we had the right design for Dialects. We're swapping between them at the moment; e.g.

Enum Dialect is required, because it's part of AST and it only makes sense it's an enum. But for DialectHandler we want dialects to be of different types so we can have different impls and have ast_fold-like inheritance.

I think that there is no need to have DialectHandler -> Dialect conversion if we would want to avoid it. Instead of having fn dialect() on DialectHandler, one could add fn allow_dots_in_table_names() or something more descriptive. What I'm saying is that Dialect references one specific dialect and DialectHandler has information about its properties and quirks.

Overall I see that translator.rs and dialects don't have an ideal design, but I can't think of a better one.

aljazerzen avatar Aug 04 '22 07:08 aljazerzen

That's what I intended DialectHandler for. But I didn't think we would need to move functions such as translate_func_call into there. Do you think that's necessary?

We can keep doing these matches; e.g. https://github.com/prql/prql/blob/0873c878f2b9122687b125fa1988b511c80df395/prql-compiler/src/sql/translator.rs#L771-L783 https://github.com/prql/prql/blob/0873c878f2b9122687b125fa1988b511c80df395/prql-compiler/src/sql/translator.rs#L803-L813

But what goes in there vs the DialectHandler? Do lots of those functions end up as big match statements?

Is there a way to remove this Dialect argument? A global static variable?

I don't think so, unfortunately. It might be possible to implement From on a tuple of (foo, Dialect)! :)

I haven't really worked out the best way in rust to decide between bare functions and From; From seems like a nice language feature, though even with the forthcoming rust-analyzer changes, seeing the target type requires hovering. Maybe it's best with "obvious" types like strings and higher cardinality transformations should be bare functions?

I think that there is no need to have DialectHandler -> Dialect conversion if we would want to avoid it. Instead of having fn dialect() on DialectHandler, one could add fn allow_dots_in_table_names() or something more descriptive. What I'm saying is that Dialect references one specific dialect and DialectHandler has information about its properties and quirks.

That could be worth trying. My low-confidence sense is that the information might not "compress" that well — e.g. with the recent BQ quoting issues, it would require quite a lot of data to say whether X.Y.Z vs X.Y.Z is correct (and different between SELECT and FROM), such that it ends up flattening to fn is_bigquery and we're back to writing out an implementation specific to BigQuery.

(This question corresponds to the one at the top of this comment — if we have the DialectHandler doing the whole translation, then this is moot..


Enum Dialect is required, because it's part of AST and it only makes sense it's an enum.

(Something I still don't have a good model for in rust ADTs — having an Enum vs Structs implementing a Trait — the latter seems to be more difficult to integrate into the AST though, given object safety constraints)

max-sixty avatar Aug 04 '22 20:08 max-sixty

I don't think this is relevant anymore.

I'm not saying there is no more room for improvement in sql backend, but these specific notes have been either implemented (DialectHandler is never converted back to Dialect anymore), or have been addressed.

aljazerzen avatar Feb 25 '23 19:02 aljazerzen

Agree, thanks for closing.

I still think as we expand the dialect-specific support, we might want to convert the whole Translator into a Trait which various dialects implement. But it would be a big change, rust doesn't support nuanced inheritance, and so at the moment would be a lot of code for no real benefit...

max-sixty avatar Feb 25 '23 20:02 max-sixty