virtuoso-opensource

Query engine does not correctly compare string literals from two graphs populated differently

Open E-Babkin opened this issue 1 year ago • 4 comments

Inspired by the following Stack Overflow discussion: https://stackoverflow.com/questions/78768869/inside-virtuoso-named-graphs-can-rdf-string-literals-be-represented-in-different?noredirect=1#comment139010223_78768869

Requested details:

Version: 07.20.3240

A test case:

  1. Create a simple TTL file with one triple whose object is a string literal.

    @prefix m: <http://example.str.com> .
    m:ID1111 m:reflect "ROAM-1234".
    
  2. Load that TTL file into the graph urn:model1 using the Virtuoso web console.

  3. Load the same TTL file into the graph urn:model2 using the Java Jena library.

  4. Run the following SPARQL query:

    PREFIX m: <http://example.str.com>
    select DISTINCT ?s ?s2 ?o
    WHERE { 
        GRAPH <urn:model1>
        {
            ?s m:reflect ?o.
        }
        GRAPH <urn:model2>
        {
            ?s2 m:reflect ?o.
        }
    }
    

A non-empty result is expected; however, the query returns an empty result.

A small modification (adding a FILTER clause) gives the correct answer:

    PREFIX m: <http://example.str.com>
    select DISTINCT ?s ?s2 ?o
    WHERE { 
        GRAPH <urn:model1>
        {
            ?s m:reflect ?o.
        }
        GRAPH <urn:model2>
        {
            ?s2 m:reflect ?o2.
        }

     FILTER(str(?o) = str(?o2))
    }
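
One way to check whether the two loaders stored the literal differently is to list the datatype and language tag of ?o in each graph. This is only a minimal diagnostic sketch, not a confirmed explanation of the behavior; it assumes the m: prefix and the graph names from the test case above.

    PREFIX m: <http://example.str.com>
    # List each stored literal together with its datatype and language tag,
    # one row per graph, so the two representations can be compared.
    SELECT ?g ?o (datatype(?o) AS ?dt) (lang(?o) AS ?lang)
    WHERE {
        GRAPH ?g { ?s m:reflect ?o . }
        FILTER(?g IN (<urn:model1>, <urn:model2>))
    }

If the two rows report different ?dt values (for example, a plain literal in one graph and an xsd:string-typed literal in the other), that would be consistent with the join on ?o returning nothing while the str() comparison succeeds.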

E-Babkin, Aug 08 '24 14:08

What is the Virtuoso version in use?

Do you have a test case for recreating?

HughWilliams, Aug 08 '24 16:08

Details were added to the main text.

E-Babkin, Aug 09 '24 09:08

How exactly are you loading the data with Jena? Can you provide a runnable program that can be used for loading the data, with the specific method being used?

HughWilliams, Aug 09 '24 11:08

Here is a fragment of the actual Java code from the Spring app.

import java.io.InputStream;

import java.util.ArrayList;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;

import org.apache.commons.lang3.RandomStringUtils;

import org.apache.jena.graph.Node;
import org.apache.jena.graph.NodeFactory;
import org.apache.jena.graph.Triple;
import org.apache.jena.query.Query;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSet;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.RDFNode;
import org.apache.jena.rdf.model.Statement;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFLanguages;
import org.apache.jena.riot.RDFParser;
import org.apache.jena.riot.system.ErrorHandlerFactory;
import org.apache.jena.riot.system.StreamRDF;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;
import org.springframework.web.multipart.MultipartFile;

import virtuoso.jena.driver.VirtDataset;
import virtuoso.jena.driver.VirtGraph;
import virtuoso.jena.driver.VirtIsolationLevel;
import virtuoso.jena.driver.VirtModel;
import virtuoso.jena.driver.VirtStreamRDF;
import virtuoso.jena.driver.VirtuosoQueryExecution;
import virtuoso.jena.driver.VirtuosoQueryExecutionFactory;
import virtuoso.jena.driver.VirtuosoUpdateFactory;
import virtuoso.jena.driver.VirtuosoUpdateRequest;


@Service
public class VirtuosoDbDataServiceImpl implements DbDataService {

  private static final Lang FILE_TYPE = RDFLanguages.TTL;
  private static final int BATCH_SIZE = 5000;
  private static final boolean IS_USE_AUTO_COMMIT = false;
  private static final VirtIsolationLevel ISOLATION_LEVEL = VirtIsolationLevel.REPEATABLE_READ;
  private static final int CONCURRENCY = VirtGraph.CONCUR_DEFAULT;


  @Value("${app.host}")
  private String dbHost;

  @Value("${app.user}")
  private String user;

  @Value("${app.password}")
  private String password;
  
  
  public void insertDataFromFile(MultipartFile file, String graphName, Boolean isClearGraph)
      throws Exception {
    try {

      VirtDataset virtDataset = new VirtDataset(dbHost, user, password);
      virtDataset.setIsolationLevel(ISOLATION_LEVEL);
      VirtModel virtModel = (VirtModel) virtDataset.getNamedModel(graphName);

      if (isClearGraph) {
        virtModel.removeAll();
      }

      virtModel.setConcurrencyMode(CONCURRENCY);
      StreamRDF writer = virtModel.getStreamRDF(IS_USE_AUTO_COMMIT, BATCH_SIZE,
          new VirtStreamRDF.DeadLockHandler(0));
      InputStream inputStream = file.getInputStream();
      RDFParser parser = RDFParser.create().source(inputStream).lang(FILE_TYPE)
          .errorHandler(ErrorHandlerFactory.errorHandlerWarn).build();

      parser.parse(writer);
      inputStream.close();
      virtDataset.close();
    } catch (Exception e) {
      throw new Exception(e.getMessage());
    }
  }
  
...
}

E-Babkin, Aug 09 '24 12:08