LiteDB icon indicating copy to clipboard operation
LiteDB copied to clipboard

[BUG] Complex queries on nested documents only ever return first result set, even after subsequent queries with different criteria

Open micahmo opened this issue 4 years ago • 9 comments

Version Which LiteDB version/OS/.NET framework version are you using. 5.0.9/Windows 10/.NET Framework 4.8

Describe the bug Complex queries on nested documents only ever return first result set, even after subsequent queries with different criteria.

I was experiencing the issue in my project, and decided to distill it down to the minimum code needed to reproduce. It is pasted below. In the PerformTest() method, the second test fails. It returns the result of the first query always. (Even if I close and re-open the database.)

Code to Reproduce

public class Parent
{
	[BsonId]
	public int Id { get; set; }

	[BsonRef("child")]
	public List<Child> Children { get; set; }
}

public class Child
{
	[BsonId]
	public int Id { get; set; }

	public string Name { get; set; }
}

void Main()
{
	ResetTest();
	InitializeTest();
	PerformTest();
}

private static string DatabasePath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "Test.db");

private void ResetTest()
{
	string databasePath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "Test.db");
	if (File.Exists(databasePath))
	{
		File.Delete(databasePath);
	}
}

private void InitializeTest()
{
	LiteDatabase database = new LiteDatabase(DatabasePath);
	ILiteCollection<Parent> parentCollection = database.GetCollection<Parent>("parent");
	ILiteCollection<Child> childCollection = database.GetCollection<Child>("child");

	Child firstChild = new Child { Name = "TestA" };
	Child secondChild = new Child { Name = "TestB" };
	Child thirdChild = new Child { Name = "Third" };

	childCollection.Insert(firstChild);
	childCollection.Insert(secondChild);
	childCollection.Insert(thirdChild);

	Parent firstParent = new Parent { Id = 1, Children = new List<Child> { firstChild, secondChild } };
	Parent secondParent = new Parent { Id = 2, Children = new List<Child> { thirdChild } };

	parentCollection.Insert(firstParent);
	parentCollection.Insert(secondParent);
	
	database.Dispose();
}

private void PerformTest()
{
	LiteDatabase database = new LiteDatabase(DatabasePath);
	ILiteCollection<Parent> parentCollection = database.GetCollection<Parent>("parent").Include(p => p.Children);

	ILiteQueryable<Parent> query = parentCollection.Query().Where(p => p.Children.Where(c => c.Name.Contains("Test")).Any());
	Debug.Assert(query.First().Id == 1, $"Did not find correct ID of parent where at least one child's name contains 'Test'. ID is {query.First().Id}.");
	
	query = parentCollection.Query().Where(p => p.Children.Where(c => c.Name.Contains("Third")).Any());
	Debug.Assert(query.First().Id == 2, $"Did not find correct ID of parent where at least one child's name contains 'Third'. ID is {query.First().Id}.");
	
	database.Dispose();
}

Output

Fail: Did not find correct ID of parent where at least one child's name contains 'Third'. ID is 1.

Expected behavior Each query returns the correct results, as a direct query via the Studio application does. Specifically, the second test should pass, where the resulting object's ID is 2, because the second Parent contains the Child with "Third" in the Name.

Screenshots/Stacktrace Inspecting the ILiteQueryable object, I can see that the resulting queries will look something like this.

  • First test: SELECT $ from parent INCLUDE $.Children WHERE COUNT(FILTER($.Children[*]=>@.Name LIKE '%Test%'))>0
  • Second test: SELECT $ from parent INCLUDE $.Children WHERE COUNT(FILTER($.Children[*]=>@.Name LIKE '%Third%'))>0

Running these queries through the Studio application returns the correct results. But the same queries as generated and run via the C# API do not.

2020-12-11 15_23_56-LiteDB Studio (v1 0 2 0)

2020-12-11 15_24_04-LiteDB Studio (v1 0 2 0)

Additional context N/A

micahmo avatar Dec 11 '20 20:12 micahmo

Looks like hard-coding the where clause works as expected. The following test passes.

Parent parent = parentCollection.Find("COUNT(FILTER($.Children[*]=>@.Name LIKE '%Test%'))>0").First();
Debug.Assert(parent.Id == 1, $"Did not find correct ID of parent where at least one child's name contains 'Test'. ID is {parent.Id}.");

parent = parentCollection.Find("COUNT(FILTER($.Children[*]=>@.Name LIKE '%Third%'))>0").First();
Debug.Assert(parent.Id == 2, $"Did not find correct ID of parent where at least one child's name contains 'Third'. ID is {parent.Id}.");

This is a viable workaround for now, but it would be nice to be able to use the ILiteQueryable and the C# query syntax.

micahmo avatar Dec 11 '20 20:12 micahmo

@micahmo

There lots of issues in your code and in your question, but I will try to mention the most important things.

  • First Please edit your Issue, code snippet part in markdown I think you use

``` -your-code-- ``` but you should use ```C# -your-code-- ```

to support syntax highlight

  • Second this is the solution for your issue
  public class Issue1897
    {
        class Parent
        {
            [BsonId]
            public int Id { get; set; }

            [BsonRef("child")]
            public List<Child> Children { get; set; }
        }

        class Child
        {
            [BsonId]
            public int Id { get; set; }

            public string Name { get; set; }
        }
        
        public static void Run()
        {
            LiteDatabase database = new LiteDatabase(new MemoryStream());
            ILiteCollection<Parent> parentCollection = database.GetCollection<Parent>("parent");
            ILiteCollection<Child> childCollection = database.GetCollection<Child>("child");

            Child firstChild = new Child { Name = "TestA" };
            Child secondChild = new Child { Name = "TestB" };
            Child thirdChild = new Child { Name = "Third" };

            childCollection.Insert(firstChild);
            childCollection.Insert(secondChild);
            childCollection.Insert(thirdChild);

            Parent firstParent = new Parent { Id = 1, Children = new List<Child> { firstChild, secondChild } };
            Parent secondParent = new Parent { Id = 2, Children = new List<Child> { thirdChild } };

            parentCollection.Insert(firstParent);
            parentCollection.Insert(secondParent);
            
            var result = parentCollection.Include(p => p.Children).Query().Where(p =>  p.Children[0].Name.Contains("rd")).ToList();
   	    var result2 = parentCollection.Include(p => p.Children).Query().Where(p =>  p.Children.Select(c => c.Name).Any(n => n.Contains("rd"))).ToList();
            
        }
    }

result is the same as result2

So:

  • When you do tests on the logic, please use MemoryStream
  • You must include referenced entities .Include(p => p.Children)
  • p.Children[0] does not mean the first index, it is just a way to access nested objects
  • Please read the docs, I know the docs leak lots of information, but you can also read the UnitTests until we find a solution to enhance the docs
  • Also there are lots of information not all are valid, but take it as a hint, because it is for the old version in LiteDBv4-Wiki

AlBannaTechno avatar Dec 24 '20 12:12 AlBannaTechno

Hi @AlBannaTechno! Thanks for the response!

First Please edit your Issue, code snippet part in markdown

Thanks for the tip! I updated my original comments to include C# syntax highlighting.

Second this is the solution for your issue

Unfortunately, this does not solve the issue. You've just showed two different ways to get the same element (a Parent who has at least one Child whose name contains rd). However, my problem is that I cannot run such queries multiple times with different criteria and get different results.

If I take your exact code and change the last line from...

var result2 = parentCollection.Include(p => p.Children).Query().Where(p => p.Children.Select(c => c.Name).Any(n => n.Contains("rd"))).ToList();

to

var result2 = parentCollection.Include(p => p.Children).Query().Where(p => p.Children.Select(c => c.Name).Any(n => n.Contains("Test"))).ToList();

I would expect result to hold Parent with id 2 (because it has a child called Third), and I would expect result2 to hold Parent with id 1 because it has a child (actually 2) whose name contains Test.

I think you've missed the issue, but I was able to use your code to reproduce it, so it helps to reinforce that it is a real bug. :-)

When you do tests on the logic, please use MemoryStream

I understand that this makes testing much easier (it's non-destructive to the file system). In my case, I was unsure if reading/writing the file could have caused the bug, so I wanted my sample project to reflect my real project exactly. (I know developers hate when a bug has different repro steps than the real issue!) Either way, I used your code with the MemoryStream and reproduced the same bug.

You must include referenced entities .Include(p => p.Children)

If you look at my original sample, you'll notice I did this when first creating the ILiteCollection<Parent> (second line in PerformTest()).

p.Children[0] does not mean the first index, it is just a way to access nested objects

I'm not convinced of this... But it shouldn't matter when using ILiteQueryable. The whole point is to use LINQ-like syntax, so the Where() combined with the Any() should have a similar effect.

Please read the docs

Absolutely! Always good advice! I scoured the docs before posting my issue to ensure that I wasn't missing something.

micahmo avatar Dec 24 '20 16:12 micahmo

@micahmo I can not figure the issue you try to mention, Please look at the next snippet, and explain what the result you expect,

Two Next Tests State Is: Success

 public class Issue1897
    {
        class Parent
        {
            [BsonId]
            public int Id { get; set; }

            [BsonRef("child")]
            public List<Child> Children { get; set; }
        }

        class Child
        {
            [BsonId]
            public int Id { get; set; }

            public string Name { get; set; }
        }
        
        [Fact]
        public void ShouldResolveParentsBasedOnReferencedChildrenCriteriaUsingMemoryStream()
        {
            LiteDatabase database = new LiteDatabase(new MemoryStream());
            ILiteCollection<Parent> parentCollection = database.GetCollection<Parent>("parent");
            ILiteCollection<Child> childCollection = database.GetCollection<Child>("child");

            Child firstChild = new Child { Name = "TestA" };
            Child secondChild = new Child { Name = "TestB" };
            Child thirdChild = new Child { Name = "Third" };

            childCollection.Insert(firstChild);
            childCollection.Insert(secondChild);
            childCollection.Insert(thirdChild);

            Parent firstParent = new Parent { Id = 1, Children = new List<Child> { firstChild, secondChild } };
            Parent secondParent = new Parent { Id = 2, Children = new List<Child> { thirdChild } };

            parentCollection.Insert(firstParent);
            parentCollection.Insert(secondParent);
            
            // var result = parentCollection.Include(p => p.Children).Query().Where(p =>  p.Children[0].Name.Contains("rd")).ToList();
            
            var result2 = parentCollection.Include(p => p.Children).Query().Where(p =>  p.Children.Select(c => c.Name).Any(n => n.Contains("rd"))).ToList();

            result2.Should().HaveCount(1);
            result2[0].Id.Should().Be(2);
            
            var result3 = parentCollection.Include(p => p.Children).Query().Where(p => p.Children.Select(c => c.Name).Any(n => n.Contains("Test"))).ToList();
            result3.Should().HaveCount(1);
            result3[0].Id.Should().Be(1);
        }
        
        [Fact]
        public void ShouldResolveParentsBasedOnReferencedChildrenCriteriaUsingFileSystem()
        {
            LiteDatabase database = new LiteDatabase("./test1.ldb");
            ILiteCollection<Parent> parentCollection = database.GetCollection<Parent>("parent");
            ILiteCollection<Child> childCollection = database.GetCollection<Child>("child");

            Child firstChild = new Child { Name = "TestA" };
            Child secondChild = new Child { Name = "TestB" };
            Child thirdChild = new Child { Name = "Third" };

            childCollection.Insert(firstChild);
            childCollection.Insert(secondChild);
            childCollection.Insert(thirdChild);

            Parent firstParent = new Parent { Id = 1, Children = new List<Child> { firstChild, secondChild } };
            Parent secondParent = new Parent { Id = 2, Children = new List<Child> { thirdChild } };

            parentCollection.Insert(firstParent);
            parentCollection.Insert(secondParent);
            
            // var result = parentCollection.Include(p => p.Children).Query().Where(p =>  p.Children[0].Name.Contains("rd")).ToList();
            
            var result2 = parentCollection.Include(p => p.Children).Query().Where(p =>  p.Children.Select(c => c.Name).Any(n => n.Contains("rd"))).ToList();

            result2.Should().HaveCount(1);
            result2[0].Id.Should().Be(2);
            
            var result3 = parentCollection.Include(p => p.Children).Query().Where(p => p.Children.Select(c => c.Name).Any(n => n.Contains("Test"))).ToList();
            result3.Should().HaveCount(1);
            result3[0].Id.Should().Be(1);
        }
    }

I used

AlBannaTechno avatar Dec 24 '20 16:12 AlBannaTechno

@AlBannaTechno Thanks again for responding! I think you have solved the issue! Your test script produces the correct results. Notice that is has a key difference from my original code.

Your two select statements are as follows:

var result2 = parentCollection.Include(p => p.Children).Query().Where(p => p.Children.Select(c => c.Name).Any(n => n.Contains("rd"))).ToList();
var result3 = parentCollection.Include(p => p.Children).Query().Where(p => p.Children.Select(c => c.Name).Any(n => n.Contains("Test"))).ToList();

Mine were as follows:

var result2 = parentCollection.Include(p => p.Children).Query().Where(p => p.Children.Where(c => c.Name.Contains("rd")).Any()).ToList();
var result3 = parentCollection.Include(p => p.Children).Query().Where(p => p.Children.Where(c => c.Name.Contains("Test")).Any()).ToList();

Yours produces the correct results, while mine always gives the same (first) result for every query, and continues to fail the test.

It looks like the key is to manually select every property that you want to filter on, rather than trying to dig into multiple layers (the c.Name.Contains is what hurt me, apparently).

I should be able to use your solution in my project, so thank you very much!


I think I would still like to keep this issue open, since something that seems like it should work does not. Either it should be unsupported syntax, or it should be fixed. Note that there is already precedence for unsupported syntax. For example, if I change my query to combine the Where and the Any, like this:

result2 = parentCollection.Include(p => p.Children).Query().Where(p => p.Children.Any(c => c.Name.Contains("rd"))).ToList();

I get the following exception from LiteDB.

System.NotSupportedException : Any/All requires simple parameter on left side. Eg: `x.Customers.Select(c => c.Name).Any(n => n.StartsWith('J'))`

So perhaps they could use a similar exception for my scenario, if it truly is not supported. Do you agree?

Thanks again for all your help!

micahmo avatar Dec 24 '20 17:12 micahmo

@micahmo

I should be able to use your solution in my project, so thank you very much!

I am so happy to hear that 😄 ,

just notice, LiteDB does not support everything like LINQ, so if you tried to optimize your query you will get a runtime exception for example you can not replace

p => p.Children.Where(c => c.Name.Contains("rd")).Any()

With

p.Children.Any(c => c.Name.Contains("rd"))

For C#-LINQ it is 100% valid, but for LITE-DB it is not, so this what I am talking about, there is nothing like this mentioned in the Docs, but you can see it in the Unit Tests,


I thought this issues should close, but I tested the issue again, and yah, this is a bug So this issue should not be closed

So we need for a core-team to look at this

@mbdavid @lbnascimento : please take a look at this issue, it seems to be a bug with caching

next is the test to reproduce the issue

  public class Issue1897
    {
        class Parent
        {
            [BsonId]
            public int Id { get; set; }

            [BsonRef("child")]
            public List<Child> Children { get; set; }
        }

        class Child
        {
            [BsonId]
            public int Id { get; set; }

            public string Name { get; set; }
        }

        // passed
        [Fact]
        public void ShouldResolveParentsBasedOnReferencedChildrenCriteriaForOneTime()
        {
            LiteDatabase database = new LiteDatabase(new MemoryStream());
            ILiteCollection<Parent> parentCollection = database.GetCollection<Parent>("parent");
            ILiteCollection<Child> childCollection = database.GetCollection<Child>("child");

            Child firstChild = new Child { Name = "TestA" };
            Child secondChild = new Child { Name = "TestB" };
            Child thirdChild = new Child { Name = "Third" };

            childCollection.Insert(firstChild);
            childCollection.Insert(secondChild);
            childCollection.Insert(thirdChild);

            Parent sterile = new Parent {Id = 1};
            Parent firstParent = new Parent { Id = 2, Children = new List<Child> { firstChild, secondChild } };
            Parent secondParent = new Parent { Id = 3, Children = new List<Child> { thirdChild } };

            
            parentCollection.Insert(sterile);
            parentCollection.Insert(firstParent);
            parentCollection.Insert(secondParent);

            #region 2

            var query2 = parentCollection.Include(p => p.Children).Query()
                .Where(p => p.Children.Where(c => c.Name.Contains("Test")).Any());

            var result2 = query2.ToList();
          
            result2.Should().HaveCount(1);
            result2[0].Id.Should().Be(2);

            #endregion
          
        }
        
        // #Bug : Executing Two Query Prevent Getting The Result From The Last Query
        // failed
        [Fact]
        public void ShouldResolveParentsBasedOnReferencedChildrenCriteriaForMultipleTimes()
        {
            LiteDatabase database = new LiteDatabase(new MemoryStream());
            ILiteCollection<Parent> parentCollection = database.GetCollection<Parent>("parent");
            ILiteCollection<Child> childCollection = database.GetCollection<Child>("child");

            Child firstChild = new Child { Name = "TestA" };
            Child secondChild = new Child { Name = "TestB" };
            Child thirdChild = new Child { Name = "Third" };

            childCollection.Insert(firstChild);
            childCollection.Insert(secondChild);
            childCollection.Insert(thirdChild);

            Parent sterile = new Parent {Id = 1};
            Parent firstParent = new Parent { Id = 2, Children = new List<Child> { firstChild, secondChild } };
            Parent secondParent = new Parent { Id = 3, Children = new List<Child> { thirdChild } };

            
            parentCollection.Insert(sterile);
            parentCollection.Insert(firstParent);
            parentCollection.Insert(secondParent);

            #region 1 : Just Comment This Region And Everything Will Works

            var query1 = parentCollection.Include(p => p.Children).Query()
                .Where(p => p.Children.Any(c => c.Name.Contains("rd")));
            
            // ["COUNT(FILTER($.Children[*]=>@.Name LIKE (\"%\"+@p0+\"%\")))>0"]
            var planDocument2 = query1.GetPlan();
            var result1 = query1.ToList();
            
            result1.Should().HaveCount(1);
            result1[0].Id.Should().Be(3);

            #endregion

            
            #region 2

            var query3 = parentCollection.Include(p => p.Children).Query()
                .Where(p => p.Children.Where(c => c.Name.Contains("Test")).Any());

            // If result2: executed                 ["COUNT(FILTER($.Children[*]=>@.Name LIKE (\"%\"+@p0+\"%\")))>0"]
            // If result2: does not executed :      ["COUNT(FILTER($.Children[*]=>@.Name LIKE (\"%\"+@p0+\"%\")))>0"]
            var planDocument3 = query3.GetPlan();

            var result3 = query3.ToList();
          
            result3.Should().HaveCount(1);
            result3[0].Id.Should().Be(2, because: "Executing Any Query Should Not Affect The Next Query");

            #endregion
         
        }
}

AlBannaTechno avatar Dec 24 '20 18:12 AlBannaTechno

LiteDB does not support everything like LINQ

Yes, I am discovering that! However, as you found, the syntax I tried to use seemed to be supported. I even inspected the ILiteQueryable and I could see that the query was valid/correct. As you said, it seems like there may be a caching issue. I'm glad you could reproduce! Thanks for tagging the core team, and thanks again for your help.

micahmo avatar Dec 24 '20 18:12 micahmo

@AlBannaTechno Thanks again for responding! I think you have solved the issue! Your test script produces the correct results. Notice that is has a key difference from my original code.

Your two select statements are as follows:

var result2 = parentCollection.Include(p => p.Children).Query().Where(p => p.Children.Select(c => c.Name).Any(n => n.Contains("rd"))).ToList();
var result3 = parentCollection.Include(p => p.Children).Query().Where(p => p.Children.Select(c => c.Name).Any(n => n.Contains("Test"))).ToList();

Mine were as follows:

var result2 = parentCollection.Include(p => p.Children).Query().Where(p => p.Children.Where(c => c.Name.Contains("rd")).Any()).ToList();
var result3 = parentCollection.Include(p => p.Children).Query().Where(p => p.Children.Where(c => c.Name.Contains("Test")).Any()).ToList();

Yours produces the correct results, while mine always gives the same (first) result for every query, and continues to fail the test.

It looks like the key is to manually select every property that you want to filter on, rather than trying to dig into multiple layers (the c.Name.Contains is what hurt me, apparently).

I should be able to use your solution in my project, so thank you very much!

I think I would still like to keep this issue open, since something that seems like it should work does not. Either it should be unsupported syntax, or it should be fixed. Note that there is already precedence for unsupported syntax. For example, if I change my query to combine the Where and the Any, like this:

result2 = parentCollection.Include(p => p.Children).Query().Where(p => p.Children.Any(c => c.Name.Contains("rd"))).ToList();

I get the following exception from LiteDB.

System.NotSupportedException : Any/All requires simple parameter on left side. Eg: `x.Customers.Select(c => c.Name).Any(n => n.StartsWith('J'))`

So perhaps they could use a similar exception for my scenario, if it truly is not supported. Do you agree?

Thanks again for all your help!

Just adding my own experience here. This really does seem to be a bug but your solution of selecting the properties (in my case two of them) in the child object and then performing an Any on them seems to have done the trick. I suspect this might have something to do with the fact that normally you'd just call .Any() with the expression you want but as was pointed out earlier in this thread that throws a not supported exception so calling .Where() with the expression and then .Any() on that seemed to be a sneaky work around but ultimately one that bit us for trying to be too clever.

RobbieLD avatar Nov 05 '22 12:11 RobbieLD

hi, I have similar problem, I have same query, that works in studio, however after upgrade from v4 to v5, I get these exception when executing csharp variant (even without null check).

When I execute sql version and deserialize it works. What could be missing? litedb 5.0.17 .net 8.0.1

ILiteCollection<Post> col = repo.db.GetCollection<Post>().Include(p => p.Categories).Include(p=>p.Tags);
var query = col.Query().Where(p => p.Categories!=null &&
                                                     p.Categories.Select(c => c.ID).Any(x=>x ==7)).ToList(); //this line fails

      An unhandled exception has occurred while executing the request.
      System.Reflection.TargetException: Object does not match target type.
         at System.Reflection.MethodInvokerCommon.ValidateInvokeTarget(Object target, MethodBase method)
         at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
         at System.Reflection.PropertyInfo.GetValue(Object obj)
         at LiteDB.LinqExpressionVisitor.VisitConstant(ConstantExpression node)        
         at LiteDB.LinqExpressionVisitor.VisitMember(MemberExpression node)
         at LiteDB.LinqExpressionVisitor.VisitMethodCall(MethodCallExpression node)    
         at LiteDB.LinqExpressionVisitor.VisitMember(MemberExpression node)
         at LiteDB.LinqExpressionVisitor.VisitAsPredicate(Expression expr, Boolean ensurePredicate)
         at LiteDB.LinqExpressionVisitor.VisitEnumerablePredicate(LambdaExpression lambda)
         at LiteDB.LinqExpressionVisitor.ResolvePattern(String pattern, Expression obj, IList`1 args)
         at LiteDB.LinqExpressionVisitor.VisitMethodCall(MethodCallExpression node)    
         at LiteDB.LinqExpressionVisitor.VisitAsPredicate(Expression expr, Boolean ensurePredicate)
         at LiteDB.LinqExpressionVisitor.VisitBinary(BinaryExpression node)
         at System.Linq.Expressions.ExpressionVisitor.VisitLambda[T](Expression`1 node)         at LiteDB.LinqExpressionVisitor.VisitLambda[T](Expression`1 node)
         at LiteDB.LinqExpressionVisitor.Resolve(Boolean predicate)
         at LiteDB.BsonMapper.GetExpression[T,K](Expression`1 predicate)
         at LiteDB.LiteQueryable`1.Where(Expression`1 predicate)
 at .. method ..line 177

Jacknq avatar Feb 06 '24 15:02 Jacknq