API proposal: custom interpolated string handler
Right now, the preferred way of passing args to Dapper requires a second parameter, for example:
string name = ...
int id = ...
conn.Execute("""
update customer
set name = @name
where id = @id
""", new { name, id });
This works; Dapper (vanilla) has code to emit custom per-type code to extract the parameters, and DapperAOT has additional code to pre-gen that as AOT and better validation (mismatched args etc). Additional per-value parameter settings are awkward, but overall: it works.
However!
There is also a possibility to use a custom "interpolated string handler". I have a fully working prototype that allows the following:
string name = ...
int id = ...
conn.Execute($"""
update customer
set name = @{name}
where id = @{id}
""");
This is not a string, and is zero alloc, fully parameterized (SQLi safe), etc. Note that the leading @ (or : etc) is primarily because ADO.NET does not directly expose the parameter token of a given connection, but IMO it helps make it very clear what is going on.
Under the hood, this emits fundamentally the same SQL, even using the argument-expression feature to generate sensible parameter names where possible (name and id in this case). Additionally, from .NET 9 the "alt-lookup" feature of dictionaries is used to avoid allocating a new string per usage. There is zero runtime ref-emit etc needed for packing parameters - we basically trick the C# compiler into doing that work for us!
We could also potentially use the optional format parameters to convey other information, for example:l {name:1000} could set the .Size to 1000, and {qty:P=5,S=3} could set the .Precision and .Scale.
Genuine question: is this an improvement? Is this worth adding new overloads of some core methods? Is this technically nice but not worth the mental addition? Or is this hell-yeah-lets-do-this?
Separately, an analyzer in AOT is proposed to spot interpolated string uses that are susceptible to SQLi (i.e. the type is string); I would also propose that we start shipping Dapper.Advisor inside Dapper, to light up all those checks by default.
Genuine question: is this an improvement? Is this worth adding new overloads of some core methods? Is this technically nice but not worth the mental addition? Or is this hell-yeah-lets-do-this?
In my opinion this is hell-yeah-lets-do-this 😄
I think this would improve Dapper adoption, as the getting started experience would be even easier, and more intuitive.
...I would also propose that we start shipping Dapper.Advisor inside Dapper, to light up all those checks by default.
I think that is a good idea,
I use
var p = new DynamicParameters();
to have my args at the top of the Sql string, and pass "p" in the dapper command.
But yes, i will change for your new proposition.
I use
var p = new DynamicParameters();to have my args at the top of the Sql string, and pass "p" in the dapper command.
DynamicParameters is literally the worst option available; more expensive, and impossible to verify in an analyzer. I'd love to understand why you're using that option. If it is "because nobody said not to", then: that's really useful feedback that can help us prioritize docs and guidance. The main intended use-case for DynamicParameters is when the SQL is constructed on-the-fly and thus no fixed parameters make sense.
The main intended use-case for DynamicParameters is when the SQL is constructed on-the-fly and thus no fixed parameters make sense.
Don't forget output params too :) I have lots of those.
Don't forget output params too :) I have lots of those.
Yep, absolutely.
DynamicParameters is literally the worst option available; more expensive, and impossible to verify in an analyzer. I'd love to understand why you're using that option. If it is "because nobody said not to", then: that's really useful feedback that can help us prioritize docs and guidance. The main intended use-case for DynamicParameters is when the SQL is constructed on-the-fly and thus no fixed parameters make sense.
" If it is "because nobody said not to" => Yeah, kind of, even worse:
That's good I said that here, because I use Dapper for a long time... And since I begun to use it (before async was part of the lib), I read some random tutorials. I use this Dynamic parametter thing to pass params even in a small generator I use to implement basic CRUD for small projects.
I never changed since then... never to late, to know that I was doing it wrong.
I will use the normal way and the new way if you implement it. :)
EDIT: maybe it was at the begining, I don't remember if we can get the inserted "id" back without an output param... and I used that way as a standard for all the stuff...
EDIT2: or maybe because I used a lot of sp in the past...
The main problem is that someone may already have used $"..." syntax to format multiple pieces of a query into one.
So when introducing this awesome API I would evaluate using a different signature or at least be very clear that this is a breaking change.
One thing to consider is that Npgsql considers named parameters to be legacy as they require runtime processing (conversion into positional). Using positional parameters ($1, $2, etc) is currently impossible (AFAIK) in Dapper, so if this proposal could also address this that would be great 👍
I have an incomplete spike in AOT that rewrites parameterized into positional using the batch API.
---------- Forwarded message --------- From: JulianRooze @.> Date: Sun, 15 Dec 2024, 11:19 Subject: Re: [DapperLib/Dapper] API proposal: custom interpolated string handler (Issue #2136) To: DapperLib/Dapper @.> Cc: Author @.>, Comment @.>, Assign @.>, Subscribed < @.>
One thing to consider is that Npgsql considers named parameters to be legacy as they require runtime processing (conversion into positional). Using positional parameters ($1, $2, etc) is currently impossible (AFAIK) in Dapper, so if this proposal could also address this that would be great 👍
— Reply to this email directly, view it on GitHub https://github.com/DapperLib/Dapper/issues/2136#issuecomment-2543833281 or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAEHMGCAQH4O73BMCPYAWL2FVQU5BFKMF2HI4TJMJ2XIZLTS2BKK5TBNR2WLJDUOJ2WLJDOMFWWLO3UNBZGKYLEL5YGC4TUNFRWS4DBNZ2F6YLDORUXM2LUPGBKK5TBNR2WLJDUOJ2WLJDOMFWWLLTXMF2GG2C7MFRXI2LWNF2HTAVFOZQWY5LFUVUXG43VMWSG4YLNMWVXI2DSMVQWIX3UPFYGLAVFOZQWY5LFVE2TSOJXHE2TCNBSURXGC3LFVFUGC427NRQWEZLMQKSXMYLMOVS2SOBVHA3TSOBYHEZ2I3TBNVS2S2DBONPWYYLCMVWIFJLWMFWHKZNKGIYDGNBWGEYDKMRVURXGC3LFVFUGC427NRQWEZLMVRZXKYTKMVRXIX3UPFYGLLCJONZXKZKDN5WW2ZLOOSTHI33QNFRXHFMCUR2HS4DFVJZGK4DPONUXI33SPGSXMYLMOVS2OMJWGEZTGNBVQKSHI6LQMWSWS43TOVS2K5TBNR2WLKRSG42DAMBQGQ4DGNMCUR2HS4DFUVWGCYTFNSSXMYLMOVS2SNJZHE3TSNJRGQZIFJDUPFYGLJLMMFRGK3FFOZQWY5LFVE4DKOBXHE4DQOJTQKSHI6LQMWSWYYLCMVWKK5TBNR2WLKRSGAZTINRRGA2TENNHORZGSZ3HMVZKMY3SMVQXIZI . You are receiving this email because you authored the thread.
Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub .
Hey @mgravell 👋
Yeah, interpolated strings can indeed provide a really nice experience for parameterizing SQL. In EF we do this too for APIs that accept SQL, though we (for now) only use FormattableString, rather than custom interpolated string handlers.
The one thing I'd recommend, is to think long and hard on the consequences of having two method overloads with the same name (Execute), one accepting a regular string and another an interpolated one... EF has had a rocky history here, where we initially did exactly that, and then ended up doing a breaking change to split them to two names, to make it 100% clear which kind of interpolation is taking place (see https://github.com/dotnet/efcore/issues/10996). For example, an unsuspecting user can start off with the safe, interpolated version:
conn.Execute($"""
update customer
set name = @{name}
where id = @{id}
""");
... and then extract the string out for some reason:
var sql = $"""
update customer
set name = @{name}
where id = @{id}
""";
conn.Execute(sql);
This looks OK, but while the 1st fragment resolves to a FormattableString-based overload (that's safe against SQL injection), the 2nd one isn't. As this is all closely related to security, I'd definitely err on the conservative side here.
FWIW we ended up with the plainly-named, "default" FromSql being the FormattableString overload, and FromSqlRaw being the overload that accepts a string (if I had my way it would be called FromSqlUnsafe or FromSqlDangerous). This way, the unsuspecting user likely ends up just using FromSql, and gets SQL injection safety by default; the downside is that they have to use a $-string even when they don't need to interpolate anything, otherwise things don't compile, which is a bit odd.
See https://github.com/dotnet/efcore/issues/10996 for lots of EF-side discussion on this.
That's exactly what I was trying to mention above.
Instead of changing the Dapper method names, I would suggest something like:
connection
.FromSqlFormat($"...")
.QueryFirstOrDefault<Animal>();
@albyrock87 that split between command and execution is something that DapperAOT already uses behind the scenes (not as the primary API), with a Command<T> that represents this intermediate layer, with that having all the Execute, Query<T> etc methods. Something similar here would make a decent amount of sense.
Perhaps, then, with the AOT library in scope (whether or not AOT is being used):
public static class DapperAotExtensions
{
public static Command InterpolatedCommand(this DbConnection connection, ref SqlBuilder sql,
int timeout = 0);
public static Command InterpolatedCommand(this DbTransaction transaction, ref SqlBuilder sql,
int timeout = 0);
}
public struct InterpolatedArgs { } // some wrapper around all the thing we captured
public struct Command // new API similar to Command<T> but for the "no args, or args already supplied via interpolation" scenario
{
public int Execute(...); // etc - like Command<TArgs>, but minus the TArgs parameter
}
which would allow:
var animal = connection
.InterpolatedCommand($"""
select *
from Animals
where Genus = @{genus}
and Region = @{region}
""")
.QueryFirstOrDefault<Animal>();
As for the name; naming is hard. Just thinking aloud.
And to be explicit, there would not be a string overload, so this would fail to compile:
string sql = $"""
select *
from Animals
where Genus = @{genus}
and Region = @{region}
""";
var animal = connection
.InterpolatedCommand(sql)
.QueryFirstOrDefault<Animal>();
I think this is moving in the right direction, adds safety, avoids confusion and avoids adding 20+ methods to the public API.
@roji great input as always, thanks; as an aside: if EF wants, note that custom interpolated string handlers take overload priority over both string and FormattableString, so if EF wanted to, you could add your own lower-alloc more specialized implementation:
using System.Runtime.CompilerServices;
var obj = new Foo();
int id = 42;
obj.Bar($"""
select 1
from Orders
where Id={id}
""");
// output: "The new thing"
public class Foo
{
public void Bar(FormattableString value) => Console.WriteLine("The old thing");
public void Bar(ref CustomInterpolatedStringHandler value) => Console.WriteLine("The new thing");
}
[InterpolatedStringHandler]
public ref struct CustomInterpolatedStringHandler(int x, int y)
{
public void AppendLiteral(string value) { }
public void AppendFormatted<T>(T value) { }
}
@mgravell to be honest I was thinking about CreateCommandInterpolated too so it's intuitive.. but then I looked for something shorter.
Also, unless you plan to move timeout and other command-related switches, it's strange to name it CreateCommand when it only contains the SQL and parameters.
Naming things is so hard 🤣
Also, unless you plan to move
timeoutand other command-related switches, it's strange to name itCreateCommandwhen it only contains the SQL and parameters.
You're right, that would be an optional on the end of the CreateNamingIsHard(...) method; Command<> already captures timeout. Clarified, including renaming to InterpretedCommand for parity with the existing connection.Command(...) API.
I do wonder whether .FromSql or .FromFormatted is better than .InterpretedCommand though...
Side note: we would need to update to code-gen to consider use of the AOT API in addition to the vanilla Dapper API; that's fine, we should do that anyway! All that means is that under the bonnet when you do:
var animal = connection
.InterpolatedCommand(sql)
.QueryFirstOrDefault<Animal>();
behind the scenes if possible we emit a custom parser for reading Animal and we use "interceptors" to rewrite your .QueryFirstOrDefault<Animal>(); to a .QueryFirstOrDefault<Animal>(rowFactory: GeneratedAnimalRowFactory.Instance);
Another advantage of using a completely separate API is that it avoids a current compiler tendency to prefer string versions when all the formatted tokens are string and const; see this example for more context, and the corresponding Roslyn ticket
as an aside: if EF wants, note that custom interpolated string handlers take overload priority over both string and FormattableString, so if EF wanted to, you could add your own lower-alloc more specialized implementation:
Yeah, there's an issue somewhere tracking looking at [InterpolatedStringHandler]... It has never bubbled up in terms of importance... In terms of overload resolution, yeah, custom interpolated string handler over Formattable string is fine (both would be conceptually the same API, with one just being a more efficient variant) - the string overload I'd keep very separate no matter what, for the reasons above.
I spotted 2 libraries that already does that with Dapper: https://github.com/Drizin/InterpolatedSql/tree/main/src/InterpolatedSql.Dapper and https://github.com/mishael-o/Dapper.SimpleSqlBuilder It might worth checking them out before you decide to fix your API.
IMO it would be a great addition to Dapper. Just make sure it stays very simple to use because this is the reason why we use Dapper in the first place.