rdf4j
Transactions delete statements when Adding/Removing Graphs
Current Behavior
I have a graph in a repository that contains only statements whose subject is the same as the graph IRI (this occurs with both the Memory and Native stores):
<https://mobi.com/records#someRecord> {
<https://mobi.com/records#someRecord> <http://purl.org/dc/terms/title> "asdf";
<http://purl.org/dc/terms/issued> "2022-04-11T13:11:15.855-06:00"^^<http://www.w3.org/2001/XMLSchema#dateTime>;
<http://purl.org/dc/terms/modified> "2022-04-11T13:11:15.88-06:00"^^<http://www.w3.org/2001/XMLSchema#dateTime>;
<http://mobi.com/ontologies/ontology-editor#ontologyIRI> <https://mobi.com/ontologies/Asdf> .
}
- Start a transaction
- Remove the graph
- Load in an updated graph with a change of
<https://mobi.com/records#someRecord> <http://mobi.com/ontologies/ontology-editor#ontologyIRI> <https://mobi.com/ontologies/Qwerty>
- Do a getStatements with the subject IRI
- Load the results into a Model. This, combined with the transaction, causes the issue
- Commit
- Retrieve the graph
Result:
- Mid-transaction, the getStatements model contains both the deleted statement and the added statement
- The final graph contains only the statement
<https://mobi.com/records#someRecord> <http://mobi.com/ontologies/ontology-editor#ontologyIRI> <https://mobi.com/ontologies/Qwerty>
Expected Behavior
- Graph should contain all added statements:
<https://mobi.com/records#someRecord> {
<https://mobi.com/records#someRecord> <http://purl.org/dc/terms/title> "asdf";
<http://purl.org/dc/terms/issued> "2022-04-11T13:11:15.855-06:00"^^<http://www.w3.org/2001/XMLSchema#dateTime>;
<http://purl.org/dc/terms/modified> "2022-04-11T13:11:15.88-06:00"^^<http://www.w3.org/2001/XMLSchema#dateTime>;
<http://mobi.com/ontologies/ontology-editor#ontologyIRI> <https://mobi.com/ontologies/Qwerty> .
}
- Retrieving statements from the graph within a transaction shouldn't affect the end state of the graph.
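The expected semantics above can be sketched with a small plain-Java model. This is only an illustration, not RDF4J code: ToyQuadStore, quad, and runScenario are hypothetical names, literals are simplified to plain strings, and the updated graph is assumed to contain all four statements, as shown in the expected result. The model buffers changes in a working copy, so a read by subject inside the transaction cannot alter what gets committed.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Toy quad store with buffered transactions. A model of the expected behavior only, NOT RDF4J. */
class ToyQuadStore {
    private final Set<List<String>> committed = new LinkedHashSet<>();
    private Set<List<String>> working; // non-null only while a transaction is open

    /** A quad is modeled as [subject, predicate, object, graph]. */
    static List<String> quad(String s, String p, String o, String g) {
        return List.of(s, p, o, g);
    }

    void begin() {
        working = new LinkedHashSet<>(committed);
    }

    void add(List<String> q) {
        view().add(q);
    }

    void clearGraph(String graph) {
        view().removeIf(q -> q.get(3).equals(graph));
    }

    /** Read-only lookup by subject; it copies matches and never mutates the store. */
    Set<List<String>> bySubject(String subject) {
        return view().stream().filter(q -> q.get(0).equals(subject)).collect(Collectors.toSet());
    }

    /** Read-only lookup by graph. */
    Set<List<String>> byGraph(String graph) {
        return view().stream().filter(q -> q.get(3).equals(graph)).collect(Collectors.toSet());
    }

    void commit() {
        if (working == null) {
            throw new IllegalStateException("no open transaction");
        }
        committed.clear();
        committed.addAll(working);
        working = null;
    }

    private Set<List<String>> view() {
        return working != null ? working : committed;
    }

    /** Replays the steps from the report; returns {mid-transaction size, size after commit}. */
    static int[] runScenario() {
        String rec = "https://mobi.com/records#someRecord";
        String dct = "http://purl.org/dc/terms/";
        String ontologyIri = "http://mobi.com/ontologies/ontology-editor#ontologyIRI";

        ToyQuadStore store = new ToyQuadStore();
        store.add(quad(rec, dct + "title", "asdf", rec));
        store.add(quad(rec, dct + "issued", "2022-04-11T13:11:15.855-06:00", rec));
        store.add(quad(rec, dct + "modified", "2022-04-11T13:11:15.88-06:00", rec));
        store.add(quad(rec, ontologyIri, "https://mobi.com/ontologies/Asdf", rec));

        store.begin();
        store.clearGraph(rec); // remove the graph
        store.add(quad(rec, dct + "title", "asdf", rec)); // load the updated graph
        store.add(quad(rec, dct + "issued", "2022-04-11T13:11:15.855-06:00", rec));
        store.add(quad(rec, dct + "modified", "2022-04-11T13:11:15.88-06:00", rec));
        store.add(quad(rec, ontologyIri, "https://mobi.com/ontologies/Qwerty", rec));
        int midTransaction = store.bySubject(rec).size(); // read by subject inside the transaction
        store.commit();
        int afterCommit = store.byGraph(rec).size();
        return new int[] { midTransaction, afterCommit };
    }

    public static void main(String[] args) {
        int[] sizes = runScenario();
        System.out.println("mid-transaction: " + sizes[0] + ", after commit: " + sizes[1]);
    }
}
```

In this model both sizes come out as 4. The report instead describes a mid-transaction read that contains both the deleted and the added statement, and a final graph that holds only the changed ontologyIRI statement.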
Steps To Reproduce
Clone the branch bug/transaction_retrieval (https://github.com/daltontc/rdf4jTest/tree/bug/transaction_retrieval) and run Main.
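import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.UUID;

import org.eclipse.rdf4j.model.IRI;
import org.eclipse.rdf4j.model.Model;
import org.eclipse.rdf4j.model.Resource;
import org.eclipse.rdf4j.model.Statement;
import org.eclipse.rdf4j.model.ValueFactory;
import org.eclipse.rdf4j.model.impl.SimpleValueFactory;
import org.eclipse.rdf4j.query.QueryResults;
import org.eclipse.rdf4j.repository.Repository;
import org.eclipse.rdf4j.repository.RepositoryConnection;
import org.eclipse.rdf4j.repository.RepositoryResult;
import org.eclipse.rdf4j.repository.sail.SailRepository;
import org.eclipse.rdf4j.rio.RDFFormat;
import org.eclipse.rdf4j.rio.Rio;
import org.eclipse.rdf4j.sail.memory.MemoryStore;
import org.eclipse.rdf4j.sail.nativerdf.NativeStore;

public class Main {
    static ValueFactory vf = SimpleValueFactory.getInstance();
    static IRI recordId = vf.createIRI("https://mobi.com/records#someRecord");

    public static void main(String[] args) throws IOException {
        File repoDir = new File("target/datadir/" + UUID.randomUUID());
        // Pass nativeStore to SailRepository instead to reproduce with the Native store
        NativeStore nativeStore = new NativeStore(repoDir);
        MemoryStore memoryStore = new MemoryStore();
        Repository repo = new SailRepository(memoryStore);
        InputStream stream = Main.class.getResourceAsStream("/record_def_original.trig");
        // Load the original graph and print how many statements have the record as subject
        try (RepositoryConnection conn = repo.getConnection()) {
            conn.add(stream, RDFFormat.TRIG);
            RepositoryResult<Statement> stmts = conn.getStatements(recordId, null, null);
            Model model = QueryResults.asModel(stmts);
            stmts.close();
            System.out.println(model.size());
        }
        try (RepositoryConnection conn = repo.getConnection()) {
            conn.begin(); // Occurs with any transaction level > NONE
            conn.remove((Resource) null, null, null, recordId);
            // Clear produces the same result
            // conn.clear(recordId);
            conn.add(Rio.parse(Main.class.getResourceAsStream("/record_def_change.trig"), RDFFormat.TRIG));
            // Retrieval by graph provides expected result
            // RepositoryResult<Statement> stmts = conn.getStatements(null, null, null, recordId);
            RepositoryResult<Statement> stmts = conn.getStatements(recordId, null, null);
            Model model = QueryResults.asModel(stmts);
            System.out.println(model.size());
            stmts.close();
            conn.commit(); // Same behavior if moved below last retrieval
            RepositoryResult<Statement> recordGraph = conn.getStatements(null, null, null, recordId);
            Model resultFinal = QueryResults.asModel(recordGraph);
            recordGraph.close();
            System.out.println(resultFinal.size());
        }
        repoDir.delete();
    }
}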
Version
3.7.6
Are you interested in contributing a solution yourself?
No response
Anything else?
No response
What I am noticing is that it has something to do with the interaction of:
- Transaction start
- Graph removal
- Same graph addition
- Queries against graph after addition (causes removal)
In the case below, any field that I query for in the TupleQuery, and that should exist in the updated graph, ends up being removed from the repository.
public static void main(String[] args) throws IOException {
    File repoDir = new File("target/datadir/" + UUID.randomUUID());
    NativeStore nativeStore = new NativeStore();
    MemoryStore memoryStore = new MemoryStore();
    Repository repo = new SailRepository(memoryStore);
    InputStream stream = Main.class.getResourceAsStream("/record_def_original.trig");
    try (RepositoryConnection conn = repo.getConnection()) {
        conn.add(stream, RDFFormat.TRIG);
        RepositoryResult<Statement> stmts = conn.getStatements(recordId, null, null);
        Model model = QueryResults.asModel(stmts);
        stmts.close();
        System.out.println(model.size());
    }
    try (RepositoryConnection conn = repo.getConnection()) {
        conn.begin(); // Occurs with any transaction level > NONE
        conn.remove((Resource) null, null, null, recordId);
        // Clear produces the same result
        // conn.clear(recordId);
        conn.add(Rio.parse(Main.class.getResourceAsStream("/record_def_change.trig"), RDFFormat.TRIG));
        // ********************************************************************************************************
        // TODO: NEWLY ADDED QUERY
        TupleQuery query = conn.prepareTupleQuery(
                "PREFIX dct: <http://purl.org/dc/terms/>\n" +
                "\n" +
                "SELECT *\n" +
                "WHERE {\n" +
                " ?record dct:issued ?issued;\n" +
                " dct:modified ?modified .\n" +
                "}");
        TupleQueryResult result = query.evaluate();
        if (result.hasNext()) {
            System.out.println(result.next().getBinding("issued"));
        }
        result.close();
        conn.commit(); // Same behavior if moved below last retrieval
        // ********************************************************************************************************
        RepositoryResult<Statement> recordGraph = conn.getStatements(null, null, null, recordId);
        Model resultFinal = QueryResults.asModel(recordGraph);
        recordGraph.close();
        System.out.println(resultFinal.size());
        resultFinal.forEach(System.out::println);
    }
    repoDir.delete();
}
I have run a quick verification and I seem to be able to reproduce the problem. Have you been able to run variants? For example, is the problem something that only occurs when the graph name and the subject IRI are identical?
I tested a couple of variants and was only seeing this removal behavior when the graph and subject IRI are identical. It didn't occur for other subject IRIs in the graph. Nor did it occur for statements whose subject or predicate were the graph name and were queried for those subject/predicates.
@jeenbroekstra does this bug also apply to 4.0.0-M3? Is it something that can/should be fixed before 4.0.0?
Things work as expected if I retrieve all the statements and remove them as I iterate on them.
Change the conn.clear(recordId) / conn.remove((Resource) null, null, null, recordId) call to conn.getStatements(null, null, null, recordId).forEach(conn::remove);
@jeenbroekstra does this bug also apply to 4.0.0-M3? Is it something that can/should be fixed before 4.0.0?
I think so, yes. Though it seems sufficiently like a corner case that it's not necessarily a blocker.