LibPQ.jl
LibPQ.jl copied to clipboard
Tool for capturing query parameters as if they were string interpolations
I've recently written a macro tool for conveniently capturing query parameters as part of a private project. I've found it handy, so I wondered whether people would be interested in having this as part of LibPQ. Or if not, whether you have suggestions for a better home for it.
The general idea is that we should be able to build up queries by pasting together fragments of SQL and interpolated parameters like one might do with strings or Cmd
backtick interpolation. But with all interpolations turned into SQL parameters for safety and consistency in converting those to SQL types.
To just paste the readme in here:
The readme
The main thing provided here is the @query
macro to allow queries to be
constructed by normal-looking string interpolation but without danger of SQL
injection attacks.
Note that @query
does not parse or understand the SQL source text as this
would be a lot of work. Instead, it keeps any literal SQL text you write as-is
and only treats the Julia-level string interpolations specially.
Use runquery
to execute queries generated by @query
.
Simple usage
Creating a table and inserting some values
conn = LibPQ.connection(your_connection_string)
runquery(conn, @query "create table foo (email text, userid integer)")
for (email,id) in [ ("[email protected]", 1)
("[email protected]", 2)]
runquery(conn, @query "insert into foo values ($email, $id)")
end
Thence:
julia> runquery(conn, @query "select * from foo") |> DataFrame
2×2 DataFrame
Row │ email userid
│ String? Int32?
─────┼───────────────────────────
1 │ [email protected] 1
2 │ [email protected] 2
Howto: Inserting values from a Julia array into a row
In some circumstances it can be useful to use splatting syntax to interpolate a Julia collection into a comma-separated list of values. Generally simple scalar parameters should be preferred for simplicity, but splatting can be useful on occasion:
email_and_id = ("[email protected]", 3)
runquery(conn, @query "insert into foo values ($(email_and_id...))")
Howto: Using the in
operator with a Julia collection
There's two ways to do this. First, using in
and splatting syntax
julia> ids = (1,2)
runquery(conn, @query "select * from foo where userid in ($(ids...))") |> DataFrame
2×2 DataFrame
Row │ email userid
│ String? Int32?
─────┼───────────────────────────
1 │ [email protected] 1
2 │ [email protected] 2
Second, using the SQL any
operator and simply passing a single SQL array parameter:
julia> ids = [1,2]
runquery(conn, @query "select * from foo where userid = any($ids)") |> DataFrame
2×2 DataFrame
Row │ email userid
│ String? Int32?
─────┼───────────────────────────
1 │ [email protected] 1
2 │ [email protected] 2
Howto: Building up a query from fragments
conn = LibPQ.connection(your_connection_string)
some_condition = true
x = 100
x = 20
# Example of an optional clauses - use empty @query() to disable it.
and_clause = some_condition ? @query("and y=$y") : @query()
# Interpolation of values produces SQL parameters; interpolating @query
# fragments adds them to the query.
q = @query "select * from table where x=$x $and_clause"
runquery(conn, q)
Okay this is pretty great! And best of all it's optional so if someone's using a database which uses the PostgreSQL wire protocol but not the parameter syntax (i.e., $1, $2
) they can just not use the macro.
I'm presuming this functions something like this:
julia> @query "insert into foo values ($(email_and_id...))"
Query("insert into foo values (\$1, \$2)", ("[email protected]", 3))
Would this make sense as a string macro?
I'm presuming this functions something like this
Very similar (though the interpolation syntax is constructed lazily — this allows multiple such strings to be pasted together dynamically, as in the "Buliding up a query from fragments" section. (It would also allow non-PostgreSQL parameter syntax to be supported for other connection types.)
julia> email_and_id = ("[email protected]", 3)
("[email protected]", 3)
julia> @query "insert into foo values ($(email_and_id...))"
insert into foo values ($1,$2)
$1 = "[email protected]"
$2 = 3
julia> dump(@query "insert into foo values ($(email_and_id...))")
JHubDbUtils.SafeQuery
args: Array{Any}((5,))
1: QueryFragment
fragment: String "insert into foo values ("
2: String "[email protected]"
3: QueryFragment
fragment: String ","
4: Int64 3
5: QueryFragment
fragment: String ")"
Would this make sense as a string macro?
Yes I tried this, but I found it has a big downside: interpolations aren't syntax-highlighted. In practice I felt this quite outweighed the benefits of a string macro for visually parsing complex queries. (The benefits of a string macro being a separate namespace and no need for parentheses in larger expressions with trailing parts.)
By the way, I found it helpful to add a runquery
function which was separate from LibPQ.execute()
and only accepted SafeQuery
in an effort to discourage use of string interpolation in constructing queries. I'm not sure how this would make sense if we had @query
in LibPQ itself; maybe it's something application authors could decide to do separately from LibPQ.
By the way, regarding naming I'm not married to using @query
; we could call this @sql
or @psql
, or something else if it made sense. Some people might find it annoying if this clashed with, eg, Query.@query
though that's not an issue for me personally.
In the end, @query
is really just a way of capturing interpolations for processing later. Actually it's somewhat similar to an old PR in Julia Base https://github.com/JuliaLang/julia/pull/33711/files#diff-281780f4de49ab611364d45135bc3d40067c69924b7548e5e73cfec655fb356a. So another option might be to start a new package — LazyStrings
, or some such.
Yes I tried this, but I found it has a big downside: interpolations aren't syntax-highlighted. In practice I felt this quite outweighed the benefits of a string macro for visually parsing complex queries. (The benefits of a string macro being a separate namespace and no need for parentheses in larger expressions with trailing parts.)
If it was a string macro you could choose to use the @macro_str
form if you wanted, whereas it doesn't work the other way around.
The laziness actually presents a nice argument for this being a generic separate package, since the same syntax could be used for many different database systems.
The laziness actually presents a nice argument for this being a generic separate package, since the same syntax could be used for many different database systems.
Yes that's a good point. I've taken a stab at refactoring this a bit and put the code into a new package SqlStrings.jl. I'll probably register a first version of this shortly:
https://github.com/JuliaComputing/SQLStrings.jl
If it was a string macro you could choose to use the
@macro_str
form if you wanted, whereas it doesn't work the other way around.
In SqlStrings, I've made it @sql_cmd
because:
- It's important to have syntax highlighting for interpolations, but editors typically disable this within normal string macros.
-
@sql_cmd
is very conceptually similar to the builtin backticks andBase.Cmd
: it's a lightweight layer which deals only with preserving the structure of tokens in the source text.
At the moment it requires a small amount of glue code to integrate with LibPQ
, but it would be great to make that unnecessary.
I'm not 100% sure of the best way forward for integration. We could make SqlStrings a dependency of LibPQ and make LibPQ.execute
optionally take an SqlString?
A longer term goal might be to make the publicly exported execute()
not take a normal string which would heavily discourage people from using plain string interpolation to construct queries. But obviously that would be quite breaking. Thoughts?