qlever-control
IndexBuilderMain crashes due to incorrect --stxxl-memory parameter format
Issue: IndexBuilderMain crashes due to incorrect --stxxl-memory parameter format
Description
When running the qlever index command using the qlever_control framework, I encountered an issue where the indexing process crashes without providing a clear error message. After detailed logging and investigation, I discovered that the problem is caused by the format of the --stxxl-memory parameter. If the memory value is specified with the "G" suffix (e.g., 5G), the process crashes. The parameter should be provided as a plain number without the "G" (e.g., 5).
Steps to Reproduce
- Set up the QLever environment and prepare the QLeverfile with the following configuration:

    [data]
    NAME = oc_meta
    BASE_URL = https://w3id.org/oc/meta
    DESCRIPTION = OpenCitations Meta stores and delivers bibliographic metadata for all publications involved in the OpenCitations Index.
    TEXT_DESCRIPTION = All literals, search with FILTER CONTAINS(?var, "...")

    [index]
    INPUT_FILES = qlever_input_openalex/*
    CAT_INPUT_FILES = find qlever_input_openalex/ -type f | xargs cat
    SETTINGS_JSON = { "ascii-prefixes-only": false, "num-triples-per-batch": 100000 }
    TEXT_INDEX = from_literals

    [server]
    PORT = 7006
    MEMORY_FOR_QUERIES = 5G
    CACHE_MAX_SIZE = 2G
    TIMEOUT = 30s

    [runtime]
    SYSTEM = docker
    IMAGE = docker.io/adfreiburg/qlever:latest

    [ui]
    UI_CONFIG = oc_meta

- Run the qlever index command.
Expected Behavior
The indexing process should complete successfully without crashing.
Actual Behavior
The process crashes, and the following error message is logged:
Error in command-line argument: bad lexical cast: source type value could not be interpreted as target
Options for IndexBuilderMain:
...
-m [ --stxxl-memory-gb ] arg The amount of memory in GB to use for
sorting during the index build.
Decrease if the index builder runs out
of memory.
...
Investigation and Findings
Detailed logging revealed that the --stxxl-memory parameter should be provided without the "G" suffix. The following command works as expected:
IndexBuilderMain -F ttl -f - -i oc_meta -s oc_meta.settings.json --text-words-from-literals --stxxl-memory 5 | tee oc_meta.index-log.txt
Suggested Fix
Modify the QLeverfile configuration or the script to remove the "G" suffix from the --stxxl-memory parameter, so that the value is passed as a plain number. For example, in the script that builds the index command:
index_cmd = f"{args.cat_input_files} | {args.index_binary} -F ttl -f - -i {args.name} -s {args.name}.settings.json --text-words-from-literals --stxxl-memory {args.stxxl_memory.replace('G', '')} | tee {args.name}.index-log.txt"
Alternatively, update the provided version of the QLeverfile to include the STXXL_MEMORY parameter without the "G" suffix, or handle this within the code to prevent similar issues for other users.
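A slightly stricter variant of the replace('G', '') idea is sketched below. It is only an illustration: the helper name stxxl_memory_to_gb is hypothetical and not part of qlever-control, and it assumes the binary expects a whole number of gigabytes, as the help text above suggests. Unlike a plain string replace, it rejects values such as "5M" instead of silently passing them on.

```python
import re


def stxxl_memory_to_gb(value: str) -> str:
    """Return `value` as a plain number of gigabytes, e.g. "5G" -> "5".

    Hypothetical helper: accepts an optional "G" or "GB" suffix (any case)
    and raises ValueError for anything else, so that a bad setting fails
    early with a readable message.
    """
    match = re.fullmatch(r"\s*(\d+)\s*(?:GB?)?\s*", value, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(
            f"Cannot interpret {value!r} as a whole number of gigabytes"
        )
    return match.group(1)
```

In the command string above, this could stand in for the replace call, i.e. --stxxl-memory {stxxl_memory_to_gb(args.stxxl_memory)}, assuming args.stxxl_memory is a string as in the original line.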
Environment
- QLever Control Version: Latest
- Operating System: Debian 12