PetaPoco Backslashes in SQL identifiers breaks escaping & codegen outputs

\ in SQL identifier breaks outputs from T4 templates, e.g. "tab\le" becomes unrecognized escape sequence. Should be escaped or written out as verbatim, e.g. [TableName(@"dbo.tab\le")].

DatabaseProvider.EscapeTableName assumes tables to already be escaped if tableName.IndexOf('.') >= 0 -> table names with backslashes don't get escaped (e.g. "dbo.tab\le" should be "dbo.[tab\le]" for the SqlServerDatabaseProvider).

May 23 '17 07:05 jokokko

Yes, if the table name has a dot, it would be better to split the string at the dot, escape each part, and then put it back together.

Along with this, though, it would be helpful if EscapeSqlIdentifier were idempotent. Currently, it just wraps the identifier with the given start and end chars. It would be nice if we tested to see if those chars were already there before doing so, so that if a string which is already escaped gets passed in again, we don't wind up with two sets of escape chars.

@pleb What do you think? I'm a little concerned that this might be breaking code for some cases. I would suggest changing DatabaseProvider to something like this:

public virtual string EscapeSqlIdentifier(string sqlIdentifier) => EscapeSqlIdentifier(sqlIdentifier, '[', ']');

protected string EscapeSqlIdentifier(string sqlIdentifier, char startChar, char endChar)
{
    if (String.IsNullOrWhiteSpace(sqlIdentifier) || sqlIdentifier[0] != startChar)
        return $"{startChar}{sqlIdentifier}{endChar}";
    else 
        return sqlIdentifier;
}

Now the individual providers can do:

// MySQL
public override string EscapeSqlIdentifier(string sqlIdentifier) => EscapeSqlIdentifier(sqlIdentifier, '`', '`');

And now that EscapeSqlIdentifier() is idempotent, we can do:

public virtual string EscapeTableName(string tableName)
{
    var bits = tableName.Split('.');
    return String.Join(".", bits.Select(b => EscapeSqlIdentifier(b)));
}

// Inputs: "foo", "[foo]", "dbo.foo", "[dbo].foo", "[dbo].[foo]", "a.b.c"
/* Outputs:
[foo]
[foo]
[dbo].[foo]
[dbo].[foo]
[dbo].[foo]
[a].[b].[c]
*/

Oct 31 '18 15:10 asherber

We can try it in the dev channel. If we get feedback it's breaking code, we may have to push it to a configuration option. That said, I think checking the start char should mitigate the potential to break code - hopefully entirely.

Nov 01 '18 21:11 pleb

I'm having some second thoughts about this.

The original issue in the ticket actually refers to the generated SQL in T4 templates, which isn't relevant in v6.

I did make EscapeSqlIdentifier() idempotent, which I think is a good change.

The problem with EscapeTableName() is that table names themselves can contain periods -- the period is not necessarily a separator between server.database.schema.table. So we can't just split at periods and escape each bit. For example, we have no way of knowing whether foo.bar represents a table named foo.bar (which would be [foo.bar]) or a table named bar in a schema named foo (which would be [foo].[bar]). I think the point of the current logic is that if there's a period in the input table name, since we don't know what to do, we just assume the user has already done it correctly.

So I think I'm inclined to leave EscapeTableName() alone. Thoughts?

Nov 02 '18 03:11 asherber

FWIW, SqlKata doesn't have a foolproof way of handling this either. They break on periods and escape each part, which doesn't work if the string passed in is already escaped, or if the period is part of the table name.

Nov 02 '18 13:11 asherber

PetaPoco PetaPoco copied to clipboard

Backslashes in SQL identifiers breaks escaping & codegen outputs

PetaPoco
PetaPoco copied to clipboard