Parameterized queries to prevent SQL injection attacks

Open gahtan-syarif opened this issue 9 months ago • 1 comments

Is your feature request related to a problem?

PySpark and many other SQL engines let you parameterize your queries like this:

age = 17
spark.sql("select * from customers where age > :age", {"age": age})

The purpose of this is to prevent SQL injection, as the function preprocesses the variable before executing the query instead of splicing it into the SQL text. This is in contrast to Python f-strings, which rely purely on string interpolation. I read the SQL documentation for Daft and it doesn't seem to have any parameterized query feature, which makes it unsafe for use in user-facing code. It would be great if this feature were added.
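To make the risk concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Daft (which, per the above, has no such feature yet). The table contents and the malicious `user_input` string are made up for illustration.

```python
import sqlite3

# In-memory database with an illustrative table (sample data is hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("create table customers (name text, age integer)")
conn.executemany(
    "insert into customers values (?, ?)",
    [("alice", 30), ("bob", 16)],
)

# Untrusted input crafted to change the meaning of the query.
user_input = "x' or '1'='1"

# f-string interpolation: the input becomes part of the SQL text itself,
# so the WHERE clause turns into: name = 'x' or '1'='1'
unsafe_rows = conn.execute(
    f"select name from customers where name = '{user_input}'"
).fetchall()

# Bound parameter: the input is passed as a single string value and is
# only ever compared against the name column.
safe_rows = conn.execute(
    "select name from customers where name = :name",
    {"name": user_input},
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The interpolated query matches every row, while the parameterized one matches none, because no customer is literally named `x' or '1'='1`.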

Describe the solution you'd like

Addition of a parameterized query feature to Daft, ideally in all three of these forms:

  1. Auto-incremented parameters:
daft.sql("select * from customers where age > ? and gender = ?", [20, 'male'])
  2. Positional parameters:
daft.sql("select * from customers where gender = $2 and age > $1", [20, 'male'])
  3. Named parameters:
daft.sql("select * from customers where age > :age and gender = :gender", {"gender": "male", "age": 20})
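For reference, this is roughly how the three styles behave in an engine that already supports them. The sketch uses Python's built-in sqlite3 module, not Daft; SQLite spells numbered placeholders as `?1` and `?2` rather than `$1` and `$2`, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory database with illustrative rows (sample data is hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("create table customers (name text, age integer, gender text)")
conn.executemany(
    "insert into customers values (?, ?, ?)",
    [("alice", 30, "female"), ("bob", 25, "male"), ("carl", 18, "male")],
)

# 1. Auto-incremented: each ? takes the next value from the sequence.
auto = conn.execute(
    "select name from customers where age > ? and gender = ?",
    [20, "male"],
).fetchall()

# 2. Positional: ?N refers to the N-th value, so placeholders can appear
#    in any order in the SQL text.
positional = conn.execute(
    "select name from customers where gender = ?2 and age > ?1",
    [20, "male"],
).fetchall()

# 3. Named: :name is looked up in a mapping, independent of order.
named = conn.execute(
    "select name from customers where age > :age and gender = :gender",
    {"gender": "male", "age": 20},
).fetchall()

print(auto, positional, named)
```

All three queries select the same rows; the styles differ only in how placeholders are matched to the supplied values.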

Describe alternatives you've considered

No response

Additional Context

No response

Would you like to implement a fix?

No

gahtan-syarif, Apr 05 '25 19:04

Thanks for the suggestion @gahtan-syarif! Agree with you that supporting parameterized SQL queries is important for writing safer code. We've added this request to our backlog. At the moment it's not a P0 or P1 priority for the team, but definitely something we'd like to support in the future as our SQL engine matures.

If you end up being interested in contributing toward this feature, we'd be happy to provide guidance and support!

jessie-young, Apr 07 '25 23:04