virtuoso-opensource icon indicating copy to clipboard operation
virtuoso-opensource copied to clipboard

The length of generated SQL text has exceeded 10000 lines of code + transactions

Open rapw3k opened this issue 4 years ago • 3 comments

Hi

Our python application (using rdflib) produces multiple triples, which should be loaded into virtuoso.

The triples are grouped in graphs, and thus each graph can have multiple triples.

One transaction is composed of several graphs, hence, if all the triples were loaded successfully the transaction is a success, otherwise, we should rollback the loaded triples.

  1. First issue is related to loading the triples.

    We try to load triples multiple triples at once (using the addN method), instead of one by one which is quite inefficient, but virtuoso throws the error below. This error is misleading though, because the 10000 refers to characters, not lines or triples, and thus it is very small limit. In this situation, only if the set of triples have in total less than 10000 characters the operation will succeed. So, is there a better way to do this ?

    10:40:40 ERRS_0 37000 SQ074 Line 6921: SP031: SPARQL: Internal error: The length of generated SQL text has exceeded 10000 lines of code

  2. we plan to test the graph protocol endpoint, which seems a good alternative, but we wonder if there is a better choice. Additionally, while testing the examples in http://vos.openlinksw.com/owiki/wiki/VOS/VirtGraphProtocolCURLExamples the curl did not work properly. We get the error below. See command line below too. Is there something wrong with the curl command? If we use the web UI it works fine...

    curl --digest --user user1:pass1 --verbose --url "http://virtuoso-sparql-endpoint-rohub.apps.paas-dev.psnc.pl/sparql-graph-crud-auth?graph-uri=https://w3id.org/rohub2020/test/b0863a5f-7c59-4baf-bf61-b29467c99866yyy123" -T ./book1.ttl

    The PUT request for graph <https://w3id.org/rohub2020/test/b0863a5f-7c59-4baf-bf61-b29467c99866yyy123> is rejected: the submitted resource is of unsupported type text/plain

  3. We could not find a way to manage transactions easily in Python with rdflib, we tried using the option of auto_commit to off and then make the commit after the whole transaction is finished, but we got other error, that transaction was too long too, so we had to make an internal account of the triples and remove if them if there was an error during transaction, but is there another way of handling transactions ?

rapw3k avatar Nov 12 '20 17:11 rapw3k

The "SP031: SPARQL: Internal error: The length of generated SQL text has exceeded 10000 lines of code" means that when the SPARQL INSERT queries are transformed to SQL for execution in Virtuoso, they are too large and exceeding a Virtuoso limit of 10000 for the size of a SQL query, and thus need to be reduced in size.

More fundamentally, this is an issue of Transaction Semantics Fidelity, i.e., how much control do you have over transactions? By default, all SPARUL statements are committed following execution, or all are rolled back. A Virtuoso SPARQL Endpoint provides basic autocommit transaction functionality.

You need a semantically rich transaction layer when you want more specific conditional behavior provided by isolation levels, transaction blocks, and rollbacks or commits. These are supported and available in Virtuoso via one of the SQL Data Access interfaces, i.e., ODBC, JDBC, ADO.Net, Jena, Sesame, etc.

See:

HughWilliams avatar Nov 14 '20 16:11 HughWilliams

Thank you for your input @HughWilliams. What ini setting defines the "Virtuoso limit of 10000 for the size of a SQL query"?

MPParsley avatar Jan 09 '23 12:01 MPParsley

There is no INI file setting for controlling the 10000 line count for a SQL query.

This is a setting in the source code #define SSG_MAX_ALLOWED_LINE_COUNT 10000 in the libsrc/Wi/sparql2sql.h file (line 1022), but this has not been tested, thus it is not recommended changing this setting as it may have side effects, and could either run out of MaxMemPoolSize or the generated query does not run in any reasonable time limit or cause other issues ...

HughWilliams avatar Jan 09 '23 13:01 HughWilliams