efcore icon indicating copy to clipboard operation
efcore copied to clipboard

SQLite: Lower-case Guid strings don't work in queries

Open Leon99 opened this issue 5 years ago • 25 comments

I use Microsoft.EntityFrameworkCore.Sqlite 3.1.1 in a testing project (with SqlServer used elsewhere). I expect it to be a (mostly) drop-in alternative. However, this doesn't work as expected with SQLite (but works fine with SqlServer and InMemory):

public IQueryable<Item> Query(Guid? tenantId, string? tag = null)
		{
			var entities = this.dbContext.Items
					.Include(a => a.Tags)
					.Where(a => tag == null || a.Tags.Any(at => at.TenantId == tenantId && at.Tag == tag))
               ...

entities is empty, while it's expected to contain some items. Upon checking a whole bunch of Guid-related issues here, the only workaround I found requires changing OnModelCreating (adding .HasConversion<string>() for the property helps), which is not acceptable since the same code works with other providers. Therefore, I conclude that it's a bug in the provider.

Tag.TenantId is Guid?. Database was created using context.Database.EnsureCreated();. GUIDs are formatted as 00000000-0000-0000-0000-000000000000 (as per https://docs.microsoft.com/en-us/dotnet/standard/data/sqlite/types).

Leon99 avatar Jan 21 '20 01:01 Leon99

@Leon99 I have not been able to reproduce this--see my code below. Please attach a small, runnable project or post a complete code listing that reproduces the behavior you are seeing.

public class Item
{
    public int Id { get; set; }
    public ICollection<ItemTag> Tags { get; set; }
}

public class ItemTag
{
    public int Id { get; set; }
    public Guid TenantId { get; set; }
    public string Tag { get; set; }
}

public class BloggingContext : DbContext
{
    private readonly ILoggerFactory Logger 
        = LoggerFactory.Create(c => c.AddConsole());//.SetMinimumLevel(LogLevel.Debug));

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder
            .UseLoggerFactory(Logger)
            .EnableSensitiveDataLogging()
            .UseSqlite("DataSource=C:\\Stuff\\Test.db");
    }

    public IQueryable<Item> Query(Guid? tenantId, string tag = null)
    {
        return Items
            .Include(a => a.Tags)
            .Where(a => tag == null || a.Tags.Any(at => at.TenantId == tenantId && at.Tag == tag));
    }
    
    public DbSet<Item> Items { get; set; }
}

public class Program
{
    public static async Task Main()
    {
        var tenantId = Guid.NewGuid();
        using (var context = new BloggingContext())
        {
            context.Database.EnsureDeleted();
            context.Database.EnsureCreated();

            context.Add(new Item
            {
                Tags = new List<ItemTag>
                {
                    new ItemTag {Tag = "X", TenantId = tenantId}
                }
            });

            context.SaveChanges();
        }

        using (var context = new BloggingContext())
        {
            var results = context.Query(tenantId, "X").ToList();
            Console.WriteLine(results.Count);
        }
    }
}

ajcvickers avatar Jan 21 '20 23:01 ajcvickers

@ajcvickers thanks for your prompt response. Your example helped me to find what causes the issue. Guids in my DB were entered manually and represented in lowercase. As per https://tools.ietf.org/html/rfc4122#section-3 and https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-X.667-201210-I!!PDF-E&type=items, GUIDs are meant to be output as lower case characters. .NET's Guid.ToString() follows that and outputs lower case characters. SQLite provider violates that by persisting them in upper case and converting query parameters to upper case. Is there a reason to do that? Is there a setting that I could use to avoid converting all my Guids to upper case when putting them into a test DB?

Leon99 avatar Jan 22 '20 01:01 Leon99

Hi, a few ideas

  1. You could add an extra normalization column containing GUID in upper-case while also having a column with lowercase one (wasted space in db).

  2. Use EF raw query with upper operator (https://www.techonthenet.com/sqlite/functions/upper.php). (could be efficient)

  3. Do conversion in your app, if you have a large number of such guids in string type, you can use unsafe way as last resort, this rutine will do conversion to uppercase without any allocation (rewriting original string).

    public unsafe string ToUpperCaseInvariant(string s)
    {
        var handle = System.Runtime.InteropServices.GCHandle.Alloc(s, System.Runtime.InteropServices.GCHandleType.Pinned);
        try
        {
            var strPtr = handle.AddrOfPinnedObject();
            char* str = (char*)strPtr.ToPointer();

            // To uppercase by flipping bits (only lowercase strings will be converted)
            for (int i = 0; i < s.Length; i++)
            {
                if ((str[i] > 96) && (str[i] < 123)) str[i] ^= (char)0x20;
            }

            return s;
        }
        finally
        {
            handle.Free();
        }
    }

xtremertx avatar Jan 22 '20 09:01 xtremertx

Thanks @xtremertx, but none of those workarounds will work for me - it's a testing project and it's supposed to test the code/DB structure that works with Sql Server.

Leon99 avatar Jan 22 '20 10:01 Leon99

Can you change the column collation to NOCASE?

CREATE TABLE ItemTag (
    TenantId TEXT NOT NULL COLLATE NOCASE
);

bricelam avatar Jan 23 '20 19:01 bricelam

Looks like SQLite 3.31.0 adds some UUID functions that output lower-case strings. They also add a standard BLOB format; I wonder if it's compatible with Guid.ToByteArray()

We should re-consider our defaults (again) in 5.0

bricelam avatar Jan 23 '20 19:01 bricelam

@bricelam I guess I could try the NOCASE collation, but I'm afraid that would incur a performance penalty as case-insensitive search usually slower than case-sensitive.

Leon99 avatar Jan 23 '20 23:01 Leon99

Notes from team discussion:

  • We should support the new SQLite GUIDs in ADO.NET and EF Core
  • We should investigate mitigations for any breaking change. For example:
    • Auto-conversion of old values using a UDF?
    • A tool/command that will update a database to the new format?
    • Configuration of which mapping to use, probably at the EF level?

ajcvickers avatar Jan 24 '20 20:01 ajcvickers

Correction: SQLite 3.31.0 added an optional extension with UUID functions. It is not enabled by default in SQLite.

bricelam avatar Sep 11 '20 19:09 bricelam

We recently updated from EF Core 2.X to 3.1 (3.1.9 as of now) and I ran into an issue I think is related to this because the suggestion to use COLLATE NOCASE here worked for me also.

I was updating a table that had the primary key set as a GUID in my entity and a TEXT field in the database. Whenever I performed an update on the entity I kept getting:

Database operation expected to affect 1 row(s) but actually affected 0 row(s). Data may have been modified or deleted since entities were loaded. See http://go.microsoft.com/fwlink/?LinkId=527962 for information on understanding and handling optimistic concurrency exceptions.

So, I used this exception on the SaveChanges()

catch (DbUpdateConcurrencyException ex) { foreach (var entry in ex.Entries) { if (entry.Entity is vwActivity_Sync) { var proposedValues = entry.CurrentValues; var databaseValues = entry.GetDatabaseValues();

What I discovered is that the databaseValues was null which if what I read was correct, is because it did not find the entity in the database, even though I used an EF query to retrieve the entity to start with, but based on a different, non-key field, which also happened to be a GUID and the query worked, without the queried column being set to COLLATE NOCASE.

So, based on the suggestion above I changed the key field GUID column to be COLLATE NOCASE and the exception was no longer thrown, and the update occurred as expected.

I am not 100% sure why this was a problem in my case, but I am glad I ran across this issue and the COLLATE NOCASE resolved my issue.

@ajcvickers or @bricelam, I gather from the entry above that this issue was not addressed in EF 5.0 but is slated to be looked at for 6.0? Have you decided what will be done yet? In the meantime, should any key GUID field and/or GUID we may query on be configured as COLLATE NOCASE? Wondering what best / required practice is here before we start changing things.

wjvii avatar Nov 09 '20 01:11 wjvii

@wjvii Were you hitting this breaking change in 3.1? Or did your database actually have lower-case Guid values? If so, do you know how they were inserted?

bricelam avatar Nov 09 '20 21:11 bricelam

To mitigate future breaking changes in this area, I'd recommend converting all the database values to lowercase and using a value converter.

UPDATE MyTable
SET GuidColumn = lower(GuidColumn);
modelBuilder
    .Entity<MyEntity>()
    .Property(e => e.GuidProperty)
    .HasConversion(
        g => g.ToString(),
        s => new Guid(s));

bricelam avatar Nov 09 '20 22:11 bricelam

@bricelam Using the DB Browser for SQLite it shows all the GUID values in the database as upper case. We did change all the values from BLOB to TEXT but in my case, this is a new database (no data before I started testing with it) and EF was inserting the values into the TEXT columns.

wjvii avatar Nov 09 '20 22:11 wjvii

Interesting, COLLATE NOCASE shouldn't be required for that.

bricelam avatar Nov 09 '20 22:11 bricelam

Could you provide us a simplified repro (in a new issue) so we can investigate?

bricelam avatar Nov 09 '20 22:11 bricelam

@bricelam Let me clarify, after I had the data inserted, I went to update a few records via EF and that is when I ran into the issue where it kept saying it was expecting to update a record but updated none. That is when I checked things out with the DbUpdateConcurrencyException and discovered that GetDatabaseValues returned null, until I did the COLLATE NOCASE.

Happy to try and create a simplified case. Might be a few days but I will update here with the results.

wjvii avatar Nov 09 '20 22:11 wjvii

@bricelam I wanted to give you a quick update. I am still working to create a simplified repro but haven't yet figured out the combination that causes the issue. I have tried quite a few of the combinations I thought would re-create it but something else less obvious in our production code must be contributing. I will continue to try and track it down.

wjvii avatar Nov 13 '20 01:11 wjvii

@bricelam I would also like to add, if you create a Guid in SQLite, it will be created without the dashes also. It's still a valid Guid and you can still parse it, but it causes errors with EF and isn't recognized as a Guid.

I had to create extensions methods in the API to bypass this.

I experienced this creating records in SQLite and using Microsoft Azure Offline Sync trying to parse into the api. All Guids created in SQLite would error, until I figured out SQLite represents Guids as lowercase, no dashes.

nick5454 avatar May 26 '21 20:05 nick5454

I encountered this today. I have await dbSet.FirstOrDefaultAsync(u => u.Id.ToString() == userId); which does not work, but await dbSet.AsEnumerable().FirstOrDefaultAsync(u => u.Id.ToString() == userId); does work, because of the reasons above.

I run Microsoft.EntityFrameworkCore.Sqlite 6.0.5

delasource avatar May 22 '22 15:05 delasource

I encountered this today. I have await dbSet.FirstOrDefaultAsync(u => u.Id.ToString() == userId); which does not work, but await dbSet.AsEnumerable().FirstOrDefaultAsync(u => u.Id.ToString() == userId); does work, because of the reasons above.

I run Microsoft.EntityFrameworkCore.Sqlite 6.0.5

What I did is u => u.Id == Guid.Parse(Id) and that works

nick5454 avatar May 23 '22 13:05 nick5454

I encountered this today. I have await dbSet.FirstOrDefaultAsync(u => u.Id.ToString() == userId); which does not work, but await dbSet.AsEnumerable().FirstOrDefaultAsync(u => u.Id.ToString() == userId); does work, because of the reasons above.

I run Microsoft.EntityFrameworkCore.Sqlite 6.0.5

But wouldn't the .AsEnumerable has a performance impact?

michaelmittermair avatar Oct 18 '22 19:10 michaelmittermair

I encountered this today. I have await dbSet.FirstOrDefaultAsync(u => u.Id.ToString() == userId); which does not work, but await dbSet.AsEnumerable().FirstOrDefaultAsync(u => u.Id.ToString() == userId); does work, because of the reasons above. I run Microsoft.EntityFrameworkCore.Sqlite 6.0.5

What I did is u => u.Id == Guid.Parse(Id) and that works

I have a paramater id (type of Guid) and using this is failing: var dbPackage = Context.WarehousePackage.AsNoTracking() .FirstOrDefault(m => m.Id == id);

But this is working: var dbPackage = Context.WarehousePackage.AsNoTracking() .FirstOrDefault(m => m.Id == Guid.Parse(id.ToString()));

What's the difference between the id as GUID and the id.ToString and than parsed to a Guid again?

michaelmittermair avatar Oct 18 '22 19:10 michaelmittermair

I encountered this today. I have await dbSet.FirstOrDefaultAsync(u => u.Id.ToString() == userId); which does not work, but await dbSet.AsEnumerable().FirstOrDefaultAsync(u => u.Id.ToString() == userId); does work, because of the reasons above. I run Microsoft.EntityFrameworkCore.Sqlite 6.0.5

What I did is u => u.Id == Guid.Parse(Id) and that works

I have a paramater id (type of Guid) and using this is failing: var dbPackage = Context.WarehousePackage.AsNoTracking() .FirstOrDefault(m => m.Id == id);

But this is working: var dbPackage = Context.WarehousePackage.AsNoTracking() .FirstOrDefault(m => m.Id == Guid.Parse(id.ToString()));

What's the difference between the id as GUID and the id.ToString and than parsed to a Guid again?

This has been a while but what I found is that sqlite represents guids differently ( no dashes ) and EF uses dashes even though both are valid guids internally.

@michaelmittermair this also may explain the AsEnumerable() working, because the code may translate it to a EF represented Guid. However, I have not tested this in a very long time.

nick5454 avatar Oct 19 '22 13:10 nick5454

Comparing two Guid types is handled fine. I personally like to have Guid.TryParse on the outside of that line and have invalid formats checked beforehand. (Guid.Parse will throw)

Yes, AsEnumerable (even though it supposedly "streams" the data) is an impact on performance on large result sets, which you obviously have, since you need to query all elements of your DbSet. (according to documentation)

delasource avatar Oct 19 '22 13:10 delasource

@ajcvickers thanks for your prompt response. Your example helped me to find what causes the issue. Guids in my DB were entered manually and represented in lowercase. As per https://tools.ietf.org/html/rfc4122#section-3 and https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-X.667-201210-I!!PDF-E&type=items, GUIDs are meant to be output as lower case characters. .NET's Guid.ToString() follows that and outputs lower case characters. SQLite provider violates that by persisting them in upper case and converting query parameters to upper case. Is there a reason to do that? Is there a setting that I could use to avoid converting all my Guids to upper case when putting them into a test DB?

Thanks, this helped me track down the same problem

rburnham-def avatar May 16 '24 03:05 rburnham-def