Parameterized queries to prevent SQL injection attacks
Is your feature request related to a problem?
In pyspark and a lot of other SQL engines, it allows you to parameterize your queries like this:
age = 17
spark.sql("select * from customers where age > :age", {"age": age})
The purpose of this is to prevent SQL injection as the function preprocesses the variable before executing the query. This is in contrast to using python f-strings that rely purely on string interpolation. I read the SQL documentation for daft and it doesn't seem to have any parameterized query feature which makes it unsafe for use in user-facing code. It would be great if this feature gets added.
Describe the solution you'd like
Addition of parameterized query feature to daft, ideally in all of the 3 forms:
- Auto-incremented parameters:
daft.sql("select * from customers where age > ? and gender = ?", [20, 'male'])
- Positional parameters:
daft.sql("select * from customers where gender = $2 and age > $1", [20, 'male'])
- Named parameters:
daft.sql("select * from customers where age > :age and gender = :gender", {"gender": "male", "age": 20})
Describe alternatives you've considered
No response
Additional Context
No response
Would you like to implement a fix?
No
Thanks for the suggestion @gahtan-syarif! Agree with you that supporting parameterized SQL queries is important for writing safer code. We've added this request to our backlog. At the moment it's not a P0 or P1 priority for the team, but definitely something we'd like to support in the future as our SQL engine matures.
If you end up being interested in contributing toward this feature, we'd be happy to provide guidance and support!