
SPARQL Update produces "BigdataValue not available" exception on integers

Open smalyshev opened this issue 5 years ago • 20 comments

This SPARQL statement:

prefix mediawiki: <https://www.mediawiki.org/ontology#>
# Changes
DELETE {
?category ?x ?y
} INSERT {

<https://es.wikipedia.org/wiki/foo1> a mediawiki:Category ;
        mediawiki:subcategories 0 .

} WHERE {
   VALUES ?category {
     <https://es.wikipedia.org/wiki/foo2>
   }
};
# Changes
DELETE {
?category ?x ?y
} INSERT {

<https://es.wikipedia.org/wiki/foo3> a mediawiki:Category ;
        mediawiki:subcategories 0 .

} WHERE {
   VALUES ?category {
     <https://es.wikipedia.org/wiki/foo4>
   }
};

Produces an exception:

Caused by: java.lang.AssertionError: BigdataValue not available: ConstantNode(XSDInteger(0)), term.iv=XSDInteger(0)
        at com.bigdata.rdf.sparql.ast.eval.ASTConstructIterator.getValue(ASTConstructIterator.java:905)
        at com.bigdata.rdf.sparql.ast.eval.ASTConstructIterator.makeStatement(ASTConstructIterator.java:831)
        at com.bigdata.rdf.sparql.ast.eval.ASTConstructIterator.<init>(ASTConstructIterator.java:268)
        at com.bigdata.rdf.sparql.ast.eval.AST2BOpUpdate.convertDeleteInsert(AST2BOpUpdate.java:918)
        at com.bigdata.rdf.sparql.ast.eval.AST2BOpUpdate.convertUpdateSwitch(AST2BOpUpdate.java:443)
        at com.bigdata.rdf.sparql.ast.eval.AST2BOpUpdate.convertUpdate(AST2BOpUpdate.java:293)
        ... 10 more

Looks like some bug in how DELETE/INSERT parses integers?

smalyshev avatar Aug 03 '18 20:08 smalyshev

Note that this only happens if both statements have the same integer value. If the statements have different values, everything works fine.
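For instance, a variant of the request above in which only the second integer differs (a sketch; everything else unchanged) completes without the exception:

prefix mediawiki: <https://www.mediawiki.org/ontology#>
DELETE { ?category ?x ?y } INSERT {
  <https://es.wikipedia.org/wiki/foo1> a mediawiki:Category ;
          mediawiki:subcategories 0 .
} WHERE { VALUES ?category { <https://es.wikipedia.org/wiki/foo2> } };
DELETE { ?category ?x ?y } INSERT {
  <https://es.wikipedia.org/wiki/foo3> a mediawiki:Category ;
          mediawiki:subcategories 1 .
} WHERE { VALUES ?category { <https://es.wikipedia.org/wiki/foo4> } };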

smalyshev avatar Aug 03 '18 20:08 smalyshev

Interestingly enough, this only happens with combined DELETE/INSERT clauses; if written as separate DELETE and INSERT clauses, the bug does not seem to happen.
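For illustration, each operation could be rewritten as a separate DELETE and INSERT (a sketch of that workaround; since the inserted triples are ground, INSERT DATA applies):

prefix mediawiki: <https://www.mediawiki.org/ontology#>
DELETE {
  ?category ?x ?y
} WHERE {
  VALUES ?category { <https://es.wikipedia.org/wiki/foo2> }
};
INSERT DATA {
  <https://es.wikipedia.org/wiki/foo1> a mediawiki:Category ;
          mediawiki:subcategories 0 .
};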

smalyshev avatar Aug 03 '18 21:08 smalyshev

Some investigation on this. The first (global update) resolution creates proper values for both 0 constants. After the first update has run, ASTDeferredIVResolution.resolveUpdate is executed again to resolve new values, and it then goes into this code in ASTDeferredIVResolution:

    private void defer(final BigdataValue value, final Handler handler) {
        if (value == null)
            return;
        if (value.getValueFactory() == vf && value.isRealIV()) {
            /*
             * We have a BigdataValue that belongs to the correct namespace and
             * which has already been resolved to a real IV.
             */
            if (value.getIV().needsMaterialization()) {
                value.getIV().setValue(value);
            }
            handler.handle(value.getIV());
            return;
        }

For literals, needsMaterialization() is always false, so handler.handle() is called on value.getIV(), and the IV in this case has cache=null. When handler.handle() creates a new Constant from this IV (not sure why; it should be completely fine with the old one), that Constant inherits cache=null. Since the value is then considered resolved, cache=null persists until the time comes to use it in ASTConstructIterator.makeStatement, at which point getValue() fails: iv.cache is null, so iv.hasValue() is false and getValue() returns null.
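To illustrate the failure mode, here is a minimal sketch using simplified stand-in types (not the real Blazegraph classes); it mirrors only the cache behavior described above:

    // Stand-in for an inline IV with a value cache; shows how skipping
    // setValue() leaves the cache null and later trips the assertion.
    public class IvCacheDemo {

        static class InlineIV {
            private Object cache;                            // cached BigdataValue, may be null
            boolean needsMaterialization() { return false; } // always false for inline literals
            void setValue(final Object v) { cache = v; }
            boolean hasValue() { return cache != null; }
        }

        public static void main(final String[] args) {
            final InlineIV iv = new InlineIV();              // fresh IV, cache == null

            // defer(): needsMaterialization() is false for literals, so
            // setValue() is never called and the handler receives an IV
            // whose cache is still null.
            if (iv.needsMaterialization()) {
                iv.setValue("\"0\"^^xsd:integer");           // not reached for literals
            }

            // Later, makeStatement() expects the cached value to be present:
            if (!iv.hasValue()) {
                // Mirrors the AssertionError from ASTConstructIterator.getValue().
                throw new AssertionError("BigdataValue not available: XSDInteger(0)");
            }
        }
    }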

I am not sure yet how the wrong IV gets there; the first materialization cycle (before the update) places the correct IV, but by the time execution reaches the second loop, the wrong one is in place.

smalyshev avatar Aug 03 '18 23:08 smalyshev

It looks like the IV is reset in context.conn.flush(), which eventually calls LexiconRelation.addTerms on all terms mentioned in the update; getInlineIV(v) inside addTerms seems to reset the IV. I am not sure this is the intended effect. It looks a bit strange to modify values in that place if there's already an IV inside, but the comment mentions it, so maybe it's OK. But in this case, shouldn't it also do iv.setValue(value) in createInlineIV()?

smalyshev avatar Aug 04 '18 00:08 smalyshev

Stas,

There have occasionally been some bugs related to the BigdataValue <=> IV mutual references. Copying Mike & Michael, who might recognize this offhand.

Mike & Michael - note that there are several messages which follow in this thread.

Bryan

thompsonbry avatar Aug 04 '18 16:08 thompsonbry

Stas,

I will discuss this with Michael. The general pattern of ASTDeferredIVResolution is that fully inline values get used first to provide a faithful capture of the lexical forms. Then we go through a resolution process. If dictionary forms are discovered, they are then cached.

After each operation in a SPARQL UPDATE request, we go through a re-discovery process in case any new items have been added to the dictionary. This is in AST2BOpUpdate - see https://github.com/blazegraph/database/blob/master/bigdata-core/bigdata-rdf/src/java/com/bigdata/rdf/sparql/ast/eval/AST2BOpUpdate.java#L281, which is where this happens.

There might be something special about the DELETE/INSERT/WHERE clause handling. However, since DELETE is processed before INSERT, I cannot see how new terms could become defined.

Can you replicate the bug using DELETE DATA; INSERT DATA in a single SPARQL UPDATE request? If so, this would suggest the problem is restricted to the ASTDeferredIVResolution logic.
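A request of that shape might look like this (a sketch reusing the IRIs from the report, not a confirmed reproducer; both operations share the integer literal 0):

prefix mediawiki: <https://www.mediawiki.org/ontology#>
DELETE DATA {
  <https://es.wikipedia.org/wiki/foo1> mediawiki:subcategories 0 .
};
INSERT DATA {
  <https://es.wikipedia.org/wiki/foo3> mediawiki:subcategories 0 .
};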

Copying Igor who originally wrote the ASTDeferredIVResolution code. He might have some insight to offer.

Thanks, Bryan

thompsonbry avatar Aug 04 '18 16:08 thompsonbry

The issue does not reproduce with DELETE DATA; INSERT DATA in a single SPARQL UPDATE request, but it does reproduce with several INSERT statements in a single update; adding DELETE statements or DELETE clauses to the DELETE+INSERT statements is not important to reproducing the problem. The reason for resetting the IV in some code paths might be the need to ensure that these IVs are resolved against the proper namespaces. I'm looking into this.
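For reference, a minimal update of the shape that reproduces (a sketch based on the IRIs from the original report; both INSERT statements share the integer literal 0):

prefix mediawiki: <https://www.mediawiki.org/ontology#>
INSERT {
  <https://es.wikipedia.org/wiki/foo1> mediawiki:subcategories 0 .
} WHERE {};
INSERT {
  <https://es.wikipedia.org/wiki/foo3> mediawiki:subcategories 0 .
} WHERE {};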

igor-kim avatar Aug 04 '18 17:08 igor-kim

The issue arises when executing any INSERT (or DELETE+INSERT) clause, starting with the second statement, that uses a literal constant from the first statement. It happens because, after execution of the first statement, the Constant is assigned an XSDIntegerIV which has a cached BigdataValue, and that BigdataValue is assigned a different instance of XSDIntegerIV (probably to avoid a circular reference, or it might actually be an IV resolved against the store in some cases) which does not have a cached BigdataValue assigned.

A potential resolution might be to assign the cached BigdataValue for the IV, as Stas suggested. This could be fixed in com.bigdata.rdf.sparql.ast.eval.ASTDeferredIVResolution.fillInIV(AbstractTripleStore, BOp), which handles ConstantNode BOps.

    if (bop instanceof ConstantNode) {
        final BigdataValue value = ((ConstantNode) bop).getValue();
        if (value != null) {
            /*
             * Even if iv is already filled in we should try to resolve it
             * against triplestore, as previously resolved IV may be
             * inlined, but expected to be term from lexicon relation on
             * evaluation.
             */
            defer(value, new Handler() {
                @Override
                public void handle(final IV newIV) {
                    newIV.setValue(value); // Fix for Git Issue 100
                    ((ConstantNode) bop).setArg(0, new Constant(newIV));
                }
            });
        }
        return;
    }

I'm also attaching a test case which reproduces the original issue and runs properly with the above fix. But this must be put through all the test suites to ensure it does not break anything due to the (potentially) circular reference. Test_Ticket_Git100.java.txt
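The attached test case is not reproduced here. As a rough standalone check, the failing update can be posted to a running server over the SPARQL 1.1 protocol; this sketch assumes a local server at the default endpoint URL, which may need adjusting:

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    // Posts the two-statement update to a SPARQL endpoint; an HTTP 500 whose
    // body mentions "BigdataValue not available" indicates the bug.
    public class Git100Repro {
        public static void main(final String[] args) throws Exception {
            final String update =
                "prefix mediawiki: <https://www.mediawiki.org/ontology#>\n"
                + "INSERT { <https://es.wikipedia.org/wiki/foo1> mediawiki:subcategories 0 . } WHERE {};\n"
                + "INSERT { <https://es.wikipedia.org/wiki/foo3> mediawiki:subcategories 0 . } WHERE {};";

            // Assumed endpoint; adjust namespace and port to your deployment.
            final URL url = new URL("http://localhost:9999/blazegraph/namespace/kb/sparql");
            final HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/sparql-update");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(update.getBytes(StandardCharsets.UTF_8));
            }
            System.out.println("HTTP " + conn.getResponseCode());
        }
    }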

igor-kim avatar Aug 04 '18 19:08 igor-kim

A circular reference per se might not be a big problem: when I look in the debugger at the values in the first pass (before the bug happens), the circular reference is already there; the value refers to the IV, and the IV refers back to the same value. It looks like the resetting of the IV happens in context.conn.flush() and may be an unintended effect, as the code doesn't seem to rely on it being reset to a value different from before. I might be misunderstanding deeper aspects of it, of course.

smalyshev avatar Aug 05 '18 06:08 smalyshev

I tried it with the fix above and I am getting this error when running the WDQS test suite:

Caused by: java.lang.RuntimeException: Value does not belong to this namespace: value="+0000-03-13T00:00:00Z"
        at com.bigdata.rdf.lexicon.LexiconRelation.addTerms(LexiconRelation.java:1770)
        at com.bigdata.bop.rdf.join.MockTermResolverOp.handleChunk(MockTermResolverOp.java:376)
        at com.bigdata.bop.rdf.join.MockTermResolverOp.access$000(MockTermResolverOp.java:74)
        at com.bigdata.bop.rdf.join.MockTermResolverOp$ChunkTask.call(MockTermResolverOp.java:221)
        at com.bigdata.bop.rdf.join.MockTermResolverOp$ChunkTask.call(MockTermResolverOp.java:175)

Looks like there's some issue with this way of fixing it...

smalyshev avatar Aug 06 '18 23:08 smalyshev

Any other ideas on fixing this? I am not sure I understand enough of how namespaces/IVs work to propose anything on my own.

smalyshev avatar Aug 27 '18 18:08 smalyshev

Michael, did you get a chance to look at this yet?

Bryan

thompsonbry avatar Aug 27 '18 21:08 thompsonbry

ping?

smalyshev avatar Sep 21 '18 22:09 smalyshev

Stas, do you have some time to talk about this Saturday a.m.? Maybe a Google Hangout? Bryan

thompsonbry avatar Sep 22 '18 01:09 thompsonbry

I tried this patch from Bryan:

--- a/bigdata-core/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/LexiconRelation.java
+++ b/bigdata-core/bigdata-rdf/src/java/com/bigdata/rdf/lexicon/LexiconRelation.java
@@ -3351,7 +3351,12 @@ public class LexiconRelation extends AbstractRelation<BigdataValue>
      */
     @SuppressWarnings("rawtypes")
     final public IV getInlineIV(final Value value) {
-        
+        if (value instanceof BigdataValue) {
+            BigdataValue bv = (BigdataValue)value;
+            if(bv.isRealIV()) {
+                return bv.getIV();
+            }
+        }
         return getLexiconConfiguration().createInlineIV(value);
 
     }

and the problem is gone; I see no failures so far. That might be a solution for the issue? (The effect of the patch is that getInlineIV() returns the already-resolved IV instead of minting a fresh inline one, so the IV that carries the cached BigdataValue is no longer replaced.)

smalyshev avatar Sep 25 '18 06:09 smalyshev

Stas,

Great.

Did you also try the build from the head of the release branch? It looked like you were using old code.

Bryan

thompsonbry avatar Sep 25 '18 14:09 thompsonbry

Hi!

Yeah, I figured that out. The build was actually fine, but the Eclipse GUI picked up the wrong files for display. I verified it on a correct build from 2.1.5RC, though I didn't run the whole Blazegraph test suite on that.

smalyshev avatar Sep 25 '18 19:09 smalyshev

Ok. So, do you have a full test suite pass with that change?

Bryan

thompsonbry avatar Sep 25 '18 20:09 thompsonbry

See also: https://jira.blazegraph.com/browse/BLZG-9157

smalyshev avatar Mar 06 '19 22:03 smalyshev

This issue still occurs. Has there been any fix for this in newer versions? We are currently still running version 2.1.4. I noticed that the PRs for this that were referenced in JIRA were closed but not merged.

shawnhind avatar May 12 '20 16:05 shawnhind