SPARQLRepository should use Java 9 Cleaner to shutDown() dependentClient

Open jjkoehorst opened this issue 2 years ago • 5 comments

Current Behavior

When running many queries, the thread count keeps increasing. Either I am forgetting to close something, or something is leaking in the backend.

    @Test
    void queryThread() {
        while (true) {
            ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
            int threadCount = threadBean.getThreadCount();
            System.out.println("Number of threads: " + threadCount);

            // Execute a SPARQL query to obtain genome information such as completeness etc...
            RepositoryConnection conn = new SPARQLRepository("http://nvme1.wurnet.nl:7200/repositories/gca").getConnection();
            String query = "select distinct ?s ?p ?o where { ?s ?p ?o } LIMIT 1";
            TupleQuery tupleQuery = conn.prepareTupleQuery(query);
            TupleQueryResult tupleQueryResult = tupleQuery.evaluate();
            for (BindingSet bindings : tupleQueryResult) {}
            tupleQueryResult.close();
            tupleQuery.clearBindings();
            conn.close();
        }
    }

Expected Behavior

The thread count should stay low, but it increases by two on every iteration:

Number of threads: 12
Number of threads: 14
Number of threads: 16
Number of threads: 18
Number of threads: 20

Steps To Reproduce

Run the test shown under Current Behavior.

Using: implementation 'org.eclipse.rdf4j:rdf4j-runtime:4.3.2'

Version

4.3.2

Are you interested in contributing a solution yourself?

Sure

Anything else?

No

jjkoehorst, Jul 11 '23 07:07

You need to call .shutDown() on the SPARQLRepository.

SPARQLRepository sparqlRepository = new SPARQLRepository(".....");

...

sparqlRepository.shutDown();

hmottestad, Jul 11 '23 13:07

Clear!

Number of threads: 10
Number of threads: 10
Number of threads: 10
Number of threads: 10

Thanks!

jjkoehorst, Jul 11 '23 13:07

I believe this is technically a bug, since the SPARQLRepository shouldn't leak resources even if someone forgets to call shutDown().

We should use the Java 9 Cleaner to run the code in shutDownInternal() once the SPARQLRepository object is no longer reachable.
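To make the Cleaner idea concrete, here is a minimal sketch using only the JDK. It is not the rdf4j implementation: `ManagedRepository`, `State`, and the `demo-evictor` thread are hypothetical stand-ins for the repository and the background thread it owns. The point it illustrates is that the cleanup action lives in a separate object that holds no reference back to the repository, so the repository can become unreachable and the Cleaner can run the action.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a repository that owns a background thread.
final class ManagedRepository implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // Cleanup state. It must not reference the ManagedRepository itself,
    // otherwise the repository would never become phantom reachable.
    static final class State implements Runnable {
        private final AtomicBoolean released = new AtomicBoolean(false);
        private final Thread worker;

        State() {
            // Stand-in for a long-lived helper thread such as a connection evictor.
            worker = new Thread(() -> {
                try {
                    Thread.sleep(Long.MAX_VALUE);
                } catch (InterruptedException e) {
                    // Interrupted by the cleanup action: let the thread end.
                }
            }, "demo-evictor");
            worker.setDaemon(true);
            worker.start();
        }

        // Invoked by an explicit shutDown() or by the Cleaner.
        @Override
        public void run() {
            if (released.compareAndSet(false, true)) {
                worker.interrupt();
            }
        }

        boolean isReleased() {
            return released.get();
        }
    }

    private final State state = new State();
    private final Cleaner.Cleanable cleanable;

    ManagedRepository() {
        // Register the cleanup action against this object.
        cleanable = CLEANER.register(this, state);
    }

    State state() {
        return state;
    }

    // Explicit release; Cleanable.clean() runs the action at most once.
    public void shutDown() {
        cleanable.clean();
    }

    @Override
    public void close() {
        shutDown();
    }
}
```

An explicit shutDown() would remain the primary path. The Cleaner only acts as a safety net, and the JDK gives no guarantee about when (or, before JVM exit, whether) the action runs after the object becomes unreachable.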

hmottestad, Jul 11 '23 13:07

The thread "Connection evictor" is the one that sticks around.

It's created by the Apache HTTP Client.
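For anyone who wants to confirm this without a profiler, a rough way to see which thread accumulates is to group the live threads by name. The sketch below uses only the JDK; `ThreadNameHistogram` is a hypothetical helper name, not part of rdf4j.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical diagnostic helper: counts live threads grouped by thread name.
final class ThreadNameHistogram {
    static Map<String, Integer> countByName() {
        Map<String, Integer> counts = new TreeMap<>();
        // getAllStackTraces() returns one entry per live thread.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getName(), 1, Integer::sum);
        }
        return counts;
    }
}
```

Printing `ThreadNameHistogram.countByName()` inside the loop from the original test should show the count for "Connection evictor" growing until shutDown() is called on each repository.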

hmottestad, Jul 11 '23 13:07

More than happy to help if I can test / contribute in any way.

jjkoehorst, Jul 11 '23 14:07