apollo-datasource-mongodb

Two queries by nested fields return duplicated result

Open nomshar opened this issue 4 years ago • 6 comments

Suppose the collection contains this document: { "top-field": "abc", "nested-fields": { "field1": "value", "field2": "" } }

I'm sending two queries:

      const itemsA = this.findByFields({ "nested-fields.field1": "value" });
      const itemsB = this.findByFields({ "nested-fields.field2": "value" });

itemsB contains the document, but it should be empty.

As far as I can tell, the issue comes from the orderDocs and getNestedValue functions in cache.js. They correctly process filters like { "nested-fields": { field1: "value" } }, but cannot process dot-notation field names such as "nested-fields.field1".

nomshar, Sep 09 '21 23:09

Thanks for reporting! What version? This may have been fixed in 0.5.2

lorensr, Sep 10 '21 17:09

"_id": "[email protected]"

nomshar, Sep 10 '21 21:09

I'm unable to reproduce. I added a passing test case that I think does what you described:

https://github.com/GraphQLGuide/apollo-datasource-mongodb/blob/08e1c076739867a9e73a707adc66767a7e8525d7/src/tests/datasource.test.js#L132-L136

If you could modify/add a test case that demonstrates the bug, that would be a big help ☺️

lorensr, Sep 13 '21 04:09

Hello, I believe we hit the same problem at my workplace. I was able to reproduce it against version 0.5.4, wrote some rough test cases for it, and have a potential fix.

Please check the diff against my fork here: https://github.com/GraphQLGuide/apollo-datasource-mongodb/compare/master...potpe:apollo-datasource-mongodb:filtering-with-nested-fields

The issue only happens when the underlying DataLoader library batches multiple invocations of the findByFields method together and the fields provided are nested fields.

Consider the following documents in MongoDB:

      {
        name: 'Bob',  
        nested: { field1: 'value1', field2: 'value1' }
      },
      {
        name: 'Charlie',
        nested: { field1: 'value2', field2: 'value2' }
      },
      {
        name: 'Dan',
        nested: { field1: 'value1', field2: 'value2' }
      },

and the following findByFields invocations:

findByFields({ 'nested.field1': 'value1', 'nested.field2': 'value1' })
findByFields({ 'nested.field1': 'value2', 'nested.field2': 'value2' })
findByFields({ 'nested.field1': 'value3', 'nested.field2': 'value3' })

These get batched into a single MongoDB query like this:

filter:  {
      'nested.field1': { '$in': [ 'value1', 'value2', 'value3' ] },
      'nested.field2': { '$in': [ 'value1', 'value2', 'value3' ] }
}
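The batching step can be illustrated with a minimal sketch. `mergeFields` is a hypothetical helper written for this illustration, not the library's actual code; it only shows how three separate fields objects collapse into one `$in` filter.

```javascript
// Hypothetical helper (an assumption, not the library's implementation):
// merge several findByFields arguments into one MongoDB filter.
function mergeFields(fieldsArray) {
  const filter = {};
  for (const fields of fieldsArray) {
    for (const [name, value] of Object.entries(fields)) {
      const values = Array.isArray(value) ? value : [value];
      if (!filter[name]) filter[name] = { $in: [] };
      for (const v of values) {
        // Avoid listing the same value twice for a given field.
        if (!filter[name].$in.includes(v)) filter[name].$in.push(v);
      }
    }
  }
  return filter;
}

const filter = mergeFields([
  { 'nested.field1': 'value1', 'nested.field2': 'value1' },
  { 'nested.field1': 'value2', 'nested.field2': 'value2' },
  { 'nested.field1': 'value3', 'nested.field2': 'value3' },
]);
console.log(filter);
```

Because each field is constrained independently, the merged filter also matches combinations that no single call asked for (field1 'value1' with field2 'value2', for example), so the results have to be split back per call afterwards.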

MongoDB returns all three documents at this stage, because each of them satisfies the merged filter:

results:  [
      {
        name: 'Bob',
        nested: {  field1: 'value1', field2: 'value1'}
      },
      {
        name: 'Charlie',
        nested: { field1: 'value2', field2: 'value2' }
      },
      {
        name: 'Dan',
        nested: { field1: 'value1', field2: 'value2' }
      }
    ]

The orderDocs function then tries to map the docs from the results above back to fieldsArray, which is an array of the original fields objects passed to findByFields:

fieldsArray [
      { 'nested.field1': [ 'value1' ], 'nested.field2': [ 'value1' ] },
      { 'nested.field1': [ 'value2' ], 'nested.field2': [ 'value2' ] },
      { 'nested.field1': [ 'value3' ], 'nested.field2': [ 'value3' ] }
    ]

Two problems occur in the orderDocs function:

a) The call to getNestedValue fails to extract 'value1' from the object { 'nested.field1': [ 'value1' ], 'nested.field2': [ 'value1' ] } when given a fieldName of nested.field1. It returns undefined instead.

As a result, every doc in the results is attached to all 3 items in fieldsArray, which causes the duplication:

orderedDocs:  [
      [
        { name: 'Bob', nested: [Object] },
        { name: 'Charlie', nested: [Object] },
        {  name: 'Dan', nested: [Object] }
      ],
      [
        { name: 'Bob', nested: [Object] },
        { name: 'Charlie', nested: [Object] },
        {  name: 'Dan', nested: [Object] }
      ],
      [
        { name: 'Bob', nested: [Object] },
        { name: 'Charlie', nested: [Object] },
        {  name: 'Dan', nested: [Object] }
      ],
    ]
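Problem (a) can be reproduced in isolation. The sketch below uses `walkPath`, a simplified stand-in for a path-walking lookup in the style of getNestedValue (an assumption, not the library's exact code). It splits the field name on dots and descends key by key, which cannot work when the object's key is the literal string 'nested.field1'.

```javascript
// Simplified stand-in for a path-walking lookup (hypothetical, for illustration):
// split the path on dots and descend one key at a time.
function walkPath(object, path) {
  let current = object;
  for (const key of path.split('.')) {
    if (current === null || typeof current !== 'object' || !(key in current)) {
      return undefined;
    }
    current = current[key];
  }
  return current;
}

const fields = { 'nested.field1': ['value1'], 'nested.field2': ['value1'] };

// The filter object has no 'nested' property, only a key named 'nested.field1'.
const walked = walkPath(fields, 'nested.field1');
// Reading the key literally does find the filter values.
const direct = fields['nested.field1'];

console.log(walked); // undefined
console.log(direct); // [ 'value1' ]
```

If the ordering step treats an undefined filter value as "no constraint on this field", every field is skipped and every doc passes, which would produce the duplicated groups shown above.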

b) Even if the above call returned the correct value, the call const docValue = doc[fieldName]; also fails to retrieve the value from the doc when fieldName is nested.field1. This means that some of the results would not be filtered out correctly.
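Problem (b) is the mirror image of (a): the document really is nested, so the dotted name has to be resolved as a path rather than used as a property name. A minimal sketch, where `resolvePath` is a hypothetical helper introduced only for this illustration:

```javascript
const doc = { name: 'Dan', nested: { field1: 'value1', field2: 'value2' } };

// Bracket access looks for a property literally named 'nested.field1'.
const byBracket = doc['nested.field1'];

// Hypothetical resolver: descend one path segment at a time,
// stopping with undefined if an intermediate value is missing.
function resolvePath(object, path) {
  return path
    .split('.')
    .reduce((current, key) => (current == null ? undefined : current[key]), object);
}

const byPath = resolvePath(doc, 'nested.field1');

console.log(byBracket); // undefined
console.log(byPath); // value1
```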

The branch in my fork tries to fix these 2 issues.

The expected content of orderedDocs is as follows:

orderedDocs:  [
      [
        // matches filter { 'nested.field1': [ 'value1' ], 'nested.field2': [ 'value1' ] }
        { name: 'Bob', nested: { field1: 'value1', field2: 'value1' } }
      ],
      [
        // matches filter { 'nested.field1': [ 'value2' ], 'nested.field2': [ 'value2' ] }
        { name: 'Charlie', nested: { field1: 'value2', field2: 'value2' } }
      ],
      [
        // no document should match filter { 'nested.field1': [ 'value3' ], 'nested.field2': [ 'value3' ] }
      ]
    ]
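One way to arrive at that expected result is sketched below. This is an assumption about what a fix could look like, not the code in the fork or in the library: `orderDocsSketch` and `resolvePath` are hypothetical names, filter values are read by their literal key, and document values are resolved by dot-notation path. It compares values with strict equality, so ObjectId values would need to be normalized (to strings, for example) before this would work for ID fields.

```javascript
// Resolve a dot-notation path such as 'nested.field1' against a document.
function resolvePath(object, path) {
  return path
    .split('.')
    .reduce((current, key) => (current == null ? undefined : current[key]), object);
}

// Hypothetical corrected ordering step: one group of docs per fields object.
function orderDocsSketch(fieldsArray, docs) {
  return fieldsArray.map((fields) =>
    docs.filter((doc) =>
      Object.keys(fields).every((fieldName) => {
        const filterValue = fields[fieldName]; // literal key lookup
        if (typeof filterValue === 'undefined') return true; // no constraint
        const filterValues = Array.isArray(filterValue) ? filterValue : [filterValue];
        const docValue = resolvePath(doc, fieldName); // path lookup
        const docValues = Array.isArray(docValue) ? docValue : [docValue];
        return filterValues.some((value) => docValues.includes(value));
      })
    )
  );
}

const docs = [
  { name: 'Bob', nested: { field1: 'value1', field2: 'value1' } },
  { name: 'Charlie', nested: { field1: 'value2', field2: 'value2' } },
  { name: 'Dan', nested: { field1: 'value1', field2: 'value2' } },
];
const fieldsArray = [
  { 'nested.field1': ['value1'], 'nested.field2': ['value1'] },
  { 'nested.field1': ['value2'], 'nested.field2': ['value2'] },
  { 'nested.field1': ['value3'], 'nested.field2': ['value3'] },
];

const orderedDocs = orderDocsSketch(fieldsArray, docs);
console.log(orderedDocs.map((group) => group.map((doc) => doc.name)));
// [ [ 'Bob' ], [ 'Charlie' ], [] ]
```

Dan is returned by the merged query but belongs to none of the three groups, because no single invocation asked for field1 'value1' together with field2 'value2'.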

Hopefully the above makes sense, but let me know if anything needs further clarification.

potpe, Oct 10 '22 18:10

Hi @lorensr !

Could you please look at the above findings and the proposed fix? If the fix makes sense to you, I can create a pull request.

Thank you!

potpe, Oct 24 '22 15:10

A PR would be great, thanks!

lorensr, Oct 28 '22 23:10