postgresql-simple icon indicating copy to clipboard operation
postgresql-simple copied to clipboard

Should formatting functions be pure?

Open adinapoli opened this issue 11 years ago • 6 comments

Hi there,

my plan was to write some specs to test my queries, generate via the "sql" QuasiQuoter are sound. For doing that I wanted to keep everything pure and reusable. I have seen there are two function for debugging purpose called formatMany and formatQuery, but none of them is pure. Is there a particular reason? What is inconvenient also is that I have to pass in a Connection, which seems a bit "warty" considering the formatting purpose of the functions. I realise the reason is that because those functions reused internally buildQuery, but shouldn't be possible to refactor the connection independent part of that function out? If not, what's preventing that?

This is definitely a "nice-to-have" thing more than a real urge, and I reckon changing the signature of those functions would break a couple of depending projects as well.

Thanks! Alfredo

adinapoli avatar Nov 03 '14 16:11 adinapoli

The issue is that the string escape syntax depends on the standard_conforming_strings connection variable, which as of 8.2 or 8.3 can be set on a per-connection basis. IIUC, that variable is initialized via the Grand Unified Configuration (GUC) system, based on global configurations, specific database settings, specific database user settings, etc etc.

Postgresql-simple automatically sets standard_conforming_strings to on during connection initialization if the reported database version supports it. There's nothing stopping user code from changing this setting later, though.

So that's the reason why it is the way it is, otherwise postgresql-simple would be vulnerable to sql injections in certain deployment contexts. IIRC, libpq 9.0 implemented a new string escape method that uses the extended escape string syntax that works independently of this setting, and although the new method also takes a connection pointer as a parameter, it ignores it.

There's a couple of reasons I really don't like the new method, however, compared to the old. There are some good reasons to re-implement libpq's escape functions in Haskell, and then we should be able to use the extended escape syntax to make a pure escape function. But this takes some care to implement correctly.

lpsmith avatar Nov 05 '14 02:11 lpsmith

Gotcha :) perhaps would be worthwile to add part of your excellent answer as part of those function haddocks, so that in a few months time someone else will come up with my very same question? Feel free to close this ticket though, as you answered the original question with a well sound rationale :) Thanks!

adinapoli avatar Nov 05 '14 07:11 adinapoli

You are very welcome!

Of course, postgresql-simple really should support the option of protocol-level parameters, which avoids some of these issues and avoids the need for escaping altogether, and allows e.g. integers and timestamps to be transmitted in binary formats which would be a significant performance win. But there are a couple of issues on that count as well; notably, protocol-level parameters are incompatible with issuing multiple SQL statements in a single request, for no good reason at all IMO.

(To go a little deeper, at the protocol level you can actually issue multiple statements with protocol-level parameters without serializing the request/respose cycle, but libpq's state machine doesn't support this. So we'd have to go native, and even then this could require a single parameter to be transmitted multiple times when it could instead be shared across statements; e.g. select $1 + $1 is possible with libpq, but select $1 + $2; select $1 + $3 is would require two transmissions of the parameter, even at the protocol level.)

The upshot is, due to this and other issues arising in other use cases, I'm pretty sure that basic access library that doesn't impose it's own view of how postgresql should be interacted with needs to support both protocol-level and interpolated parameters.

So the simplest way to preserve the existing API would be to add more query/execute functions, e.g. queryParams or whatnot.

lpsmith avatar Nov 05 '14 07:11 lpsmith

Actually, I think I'm going to leave this issue open for now, as this actually does bring up some issues that should be worked on... like re-implementing libpq's escape routines, improving documentation, and is indirectly relevant to some of the ideas floated in #120 and #36.

lpsmith avatar Nov 06 '14 00:11 lpsmith

Perfect :) Thanks again!

Alfredo

adinapoli avatar Nov 06 '14 08:11 adinapoli

I've been using these https://github.com/sopvop/postgresql-simple-dsl/blob/master/src/Database/PostgreSQL/Simple/Dsl/Escaping.hs for more than a year now, and haven't encountered any problems with escaping. Should I make a pull request?

sopvop avatar Nov 24 '16 09:11 sopvop