sqlparser-rs
sqlparser-rs copied to clipboard
Implement SQL writer
It is often useful to be able to generate SQL from the AST. This opens up possibilities for SQL manipulation or translation. The write should recursively walk the AST and write SQL to a writer / output stream.
A secondary benefit of having a writer is that it often simplifies writing unit tests for the parser, since a SQL statement can be parsed and re-written and then the output can be compared with the input (ignoring whitespace differences)
This is partially implemented already by impl ToString for ASTNode
but this needs to move into the dialect specific code as each dialect will write SQL differently.
I second that a lot! I would love to have a way to translate SQL queries.
I'm thinking of initial design on how to implement a sql writer for each specific dialect.
I believe the easy way to do it, is to make a trait for each of this specific dialect which also leverage
a call to the ToString
to the AST when it is not a dialect specific syntax. Also, we can add a little functionality to extract out all the values into a Vec<Value>
to be later used in parameterized query in order to avoid SQL injection attacks.
Here is a POC code I'm testing.
pub trait ToPostgreSQL: ToString {
fn to_postgresql(&self, args: &mut usize) -> (String, Vec<Value>) {
(self.to_string(), vec![])
}
}
impl ToPostgreSQL for Value {
fn to_postgresql(&self, args: &mut usize) -> (String, Vec<Value>) {
let ret = (format!("${}", args), vec![self.clone()]);
*args += 1;
ret
}
}
impl ToPostgreSQL for DataType {
fn to_postgresql(&self, args: &mut usize) -> (String, Vec<Value>) {
match self {
DataType::Timestamp => ("timestamp with time zone".into(), vec![]),
_ => (self.to_string(), vec![]),
}
}
}
I think this would be great if i understand the goal. I inherited a mess of sql statements for a warehouse and my goal is to create a program to format all queries to one standard, using your lib as the parser. Right now I find myself writing a lot of the same sql as you did in the fmt::Display
implementations so I can format a few expression/identifiers based on configurations. Most of my other formatting is at the clause level
I would like to work on this, as I am working on a tool that needs to export queries to different.
It would be an awesome feature actually to write queries in one SQL dialect and to also convert it (as far as possible of course) to another dialect.
I think the suggestion above makes sense to have a to_postgresql
etc. For each dialect. I am not sure about the need to extract Vec<Value>
?
I am also not sure whether it also needs to do (any) formatting. I think that can be out of scope and maybe more for a external library like https://github.com/maxcountryman/forma/blob/master/formation/benches/formation_bench.rs
Frankly, I don't understand the to_postgresql
suggestion.
FWIW, my instinct is to think about translations between SQL dialects as of AST transformations, rather than different serializations, but It's hard to think about this without looking at a corpus of translations that need to be supported.
Pretty printing an AST like ours is definitely out of scope, but for tasks requiring minor fixups to the SQL, one could want to retain original formatting and comments as much as possible (cf. #175).
@alamb @nickolay @andygrove do you still think this is a reasonable issue?
Although seems quite useful, possibly would demand a lot of extra maintaining structure, which I don't think the project has right now.
There is basically a SQL writer now that converts from an AST to SQL. I didn't read this entire ticket so I am not sure what else is being proposed. Perhaps we can close it as "done" 🤔
@alamb this considers dialect specific implementation as well.
Not sure if that's really feasible. This approach (dialect specific serialisation) was already discussed in another moments and, as always, we agree that that's too much additional code, to such small gains.
We use the SQL writer for PRQL and it does work well, but not for dialects as you pointed out. So we have code on our side that checks how idents should be quoted and whether TOP or LIMIT should be used.
So I consider sqlparser-rs to have an SQL writer, just not SQL dialect translator. So this issue could be closed.
All that said, I came here looking for a fix on VALUES
in MySQL which requires ROW
. But this is an issue targeting both parsing and writing.