Stop/inspect running queries
I'm triggering SPARQL queries against the Virtuoso /sparql endpoint via the Python package SPARQLWrapper:
sparql = SPARQLWrapper(endpoint="http://graph:8890/sparql", returnFormat=return_format, defaultGraph=default_graph_uri)
sparql.setQuery(query)
sparql.setMethod(POST)
Sometimes I have SPARQL queries that have a very long running time and therefore I would like in some cases to check if such queries are still running and eventually stop them. Is there a way to inspect running queries and to stop a specific query while is running?
Background: the queries are triggered from a Docker container running a FastAPI application which invokes the /sparql endpoint in another Docker container. The Virtuoso database is based on Docker image virtuoso/virtuoso-opensource-7:7.2.13-r19-g8273aad-alpine.
Thank you for the support.
Have you tried the status SQL function, probably with the c option, i.e., status('c')?
Doesn't seem to work. I can see running statements like rdfs_rule_set, but when I have a query running it does not show up. Example result:
SQL> status('c');
REPORT
VARCHAR
_______________________________________________________________________________
OpenLink Virtuoso Server
Version 07.20.3240-pthreads for Linux as of Jun 10 2024 (a1fd8195bf)
Started on: 2025-01-31 07:15 GMT+0 (up 00:15)
CPU: 99.90% RSS: 4662MB VSZ: 4702MB PF: 0
Database Status:
File size 1564475392, 190976 pages, 97663 free.
170392 buffers, 89446 used, 2 dirty 0 wired down, repl age 339886 0 w. io 0 w/crsr.
Disk Usage: 877 reads avg 0 msec, 0% r 0% w last 6 s, 210615 writes flush 170.6 MB/s,
28 read ahead, batch = 13. Autocompact 47384 in 31363 out, 33% saved col ac: 110260 in 19% saved.
Gate: 33 2nd in reads, 0 gate write waits, 0 in while read 0 busy scrap.
Log = ../database/virtuoso.trx, 417 bytes
91562 pages have been changed since last backup (in checkpoint state)
Current backup timestamp: 0x0000-0x00-0x00
Last backup date: unknown
Clients: 42 connects, max 5 concurrent
RPC: 604 calls, 1 pending, 4 max until now, 0 queued, 24 burst reads (3%), 0 second 11M large, 1020M max
Checkpoint Remap 1710 pages, 0 mapped back. 1 s atomic time.
DB master 190976 total 97663 free 1710 remap 0 mapped back
temp 768 total 763 free
Lock Status: 0 deadlocks of which 0 2r1w, 32 waits,
Currently 2 threads running 0 threads waiting 0 threads in vdb.
Client 1111:40: Account: dba, 3903 bytes in, 87013 bytes out, 1 stmts.
PID: 1634, OS: unix, Application: unknown, IP#: 127.0.0.1
Transaction status: PENDING, 1 threads.
Locks:
Running Statements:
Time (msec) Text
8 status('c')
33 Rows. -- 20 msec.
When you are using the SPARQL endpoint, you need to run the following command:
SQL> status('exec');
30334 sparql {
#output-format:application/sparql-results+json
SELECT DISTINCT ?p ?abs
WHERE {
?p rdf:type dbo:Person.
?p foaf:name ?name.
?p dbo:abstract ?abs.
?p owl:sameAs ?same.
FILTER langMatches( lang(?name), "en" ).
FILTER langMatches( lang(?abs), "en" ).
FILTER (?same IN (<https://d-nb.info/gnd/1065186630/about>,<https://orcid.org/0000-0001-7714-9602>,<http://viaf.org/viaf/313429028>)).
}
}
15 Rows. -- 2 msec.
SQL>
The number on the first line shows how many milliseconds this query has run for so far.
There is currently no way to kill individual queries, as they can be running in parallel on multiple threads.
The only way to stop these queries is the txn_killall(6) command[^1]; however this will terminate all pending transactions in your database, with a Transaction rolled back due to previous SQL error.
If you are running an open SPARQL endpoint, such as we do with DBpedia, you may want to consider enabling the ANYTIME functionality on it [^2].
[^1]: Virtuoso Online Documentation - txn_killal()
[^2]: Community forum document on Query Scalability
I think the command status('exec') works if I execute some queries via the isql interface for example. In my case I'm firing queries against the Virtuoso exposed /sparql endpoint with a pyodbc connection. In this case, even if I have a long running query currently running, the status('exec') command does not show anything.
If I define the following parameter in my virtuoso.ini file QueryLog = "virtuoso.qrl" I'm then able to see the results of executed queries by executing the following SQL query
SELECT ql_id,
ql_start_dt,
ql_rt_msec,
ql_rt_clocks,
ql_client_ip,
ql_user,
ql_sqlstate,
ql_error,
ql_text
FROM DB.DBA.SYS_QUERY_LOG
WHERE qrl_file = 'virtuoso.qrl' and ql_user='SPARQL';
The problem here is that the query returns results only for terminated queries. In my case I would like to be able to see the SPARQL queries that are still running and eventually kill them by their ql_id. Any idea on how to achieve this? Thank you.