Indigo
Segfault while rebuilding index on molfile using bingo.molecule type
Steps to Reproduce
- Use Indigo library (Bingo cartridge). Describe environment. Note: this issue is not 100% reproducible in our environment with a given set of data. It seems to happen roughly 1/5 to 1/8 of the time.
OS: (output from uname -a): Linux bpeqabirdmlapvm02 4.18.0-240.15.1.el8_3.x86_64 #1 SMP Mon Mar 1 17:16:16 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
CentOS Linux release 8.3.2011
32 GB RAM, 8 CPUs
Output from "select version();":
PostgreSQL 12.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4), 64-bit
276 GB available on the filesystem on which PostgreSQL keeps its data and logs.
- Add script or SQL to reproduce the issue See attached file.
Actual behavior During or immediately after the rebuild of the bingo index, the OS records a segfault. The PostgreSQL log records a corrupted double-linked list, then PostgreSQL terminates and restarts.
Attached is an excerpt from /var/log/messages: messages.txt
Attached is an excerpt of the PostgreSQL log. Note that the stored procedure source.uspupsertmolecule executes the CREATE INDEX statement that seems to be failing; the next stored procedure in the pipeline is source.uspupsertlot.
Note the error "corrupted double-linked list".
Expected behavior The bingo index should be rebuilt with no segfaults thrown, and the pipeline should continue on afterwards as normal. No "corrupted double-linked list" should be mentioned in the PostgreSQL log.
Environment details:
- Back-end version Bingo 1.9.0
Attachments Three attachments included.
Additional context
We are in the process of upgrading to the latest version of bingo, but reading through the release notes, I did not see any fixed bugs that looked like the issue we're experiencing. The closest ones had to do with CDX files, and we are parsing molfiles instead.
Fixed in 1.10:
#1068 CDX-loader crash
Fixed in 1.12:
#1126 Segfault when iterating CDX file from USPTO downloads
Fixed in 1.13:
#1139 core dumped when reading CDX file downloaded from USPTO
1. Description:
A segmentation fault (segfault) occurs during or immediately after rebuilding the Bingo index in PostgreSQL. It is followed by a "corrupted double-linked list" error, after which PostgreSQL terminates and restarts. This affects the stability of the database when processing MOL files via the source.uspupsertmolecule stored procedure. One hypothetical cause is memory corruption or a buffer overflow in the Bingo cartridge (version 1.9.0) when handling large MOL files or complex indexing operations.
2. Steps to Reproduce:
- Step 1: Set up the environment on CentOS Linux release 8.3.2011.
- Step 2: Create the regmol.parent_structure table and populate it with 211,000 records, including MOL files (longest 41,000 characters), using the script from script.txt.
- Step 3: Execute the SQL statement CREATE INDEX idx_bingo_bingo_config ON source.parent_structure USING bingo_idx (mol_file COLLATE pg_catalog."default" bingo.molecule) within the source.uspupsertmolecule stored procedure.
- Step 4: Monitor the system logs (/var/log/messages) and PostgreSQL logs (postgresql-log.txt) for segfaults or errors during index creation.
- Note: The issue is not 100% reproducible, occurring approximately 1/5 to 1/8 of the time with the given dataset.
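Because the crash is intermittent, a single clean rebuild says little about whether a change fixed it. The sketch below estimates how many consecutive rebuilds are needed to expect at least one crash with a chosen confidence. It assumes each rebuild crashes independently with a fixed probability, which the report does not confirm; the function name runs_needed is hypothetical.

```python
import math


def runs_needed(crash_rate: float, confidence: float = 0.95) -> int:
    """Smallest n such that 1 - (1 - crash_rate)**n >= confidence.

    Assumption: every index rebuild is an independent trial with the
    same crash probability. This is a simplification, not a measured fact.
    """
    if not 0.0 < crash_rate < 1.0:
        raise ValueError("crash_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - crash_rate)**n <= 1 - confidence for n and round up.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - crash_rate))


if __name__ == "__main__":
    # The two crash rates quoted in the report.
    for rate in (1 / 5, 1 / 8):
        print(f"crash rate {rate:.3f}: {runs_needed(rate)} rebuilds")
```

Under these assumptions, roughly 14 clean rebuilds (at a 1/5 rate) to 23 clean rebuilds (at a 1/8 rate) would be needed before a fix can be called plausible at 95% confidence.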
3. Expected Behavior:
The Bingo index should be rebuilt successfully without throwing segfaults, and the pipeline (including source.uspupsertmolecule and subsequent source.uspupsertlot) should continue normally. The PostgreSQL log should not contain errors such as "corrupted double-linked list."
4. Actual Behavior:
During or after the CREATE INDEX operation, a segfault is recorded in the kernel log (messages.txt), followed by a "corrupted double-linked list" error in the PostgreSQL log (postgresql-log.txt). This leads to PostgreSQL terminating (PID 197922) with signal 6 (Aborted), triggering a restart and recovery mode. The error reproduces intermittently (1/5 to 1/8 of attempts). Logs show:
- Segfault at address c in bingo_postgres.so (stack trace in messages.txt).
- Warnings about terminating connections due to shared memory corruption (postgresql-log.txt).
- Recovery mode activation until the database is ready again.
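To count how often the crash occurs across pipeline runs, the kernel log can be scanned for segfault records. The following is a minimal sketch that assumes the usual Linux kernel message layout ("name[pid]: segfault at ADDR ip ... sp ... error N in MODULE[...]"). The SAMPLE line is invented for illustration and is not copied from the attached messages.txt; the names Segfault and find_segfaults are hypothetical.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class Segfault(NamedTuple):
    process: str
    pid: int
    address: str
    module: str


# Assumed layout of a kernel segfault record; the module part is optional.
SEGFAULT_RE = re.compile(
    r"(?P<process>[\w.\-]+)\[(?P<pid>\d+)\]: segfault at (?P<address>[0-9a-f]+) "
    r"ip [0-9a-f]+ sp [0-9a-f]+ error \d+"
    r"(?: in (?P<module>[^\[\s]+))?"
)


def find_segfaults(lines: Iterable[str]) -> Iterator[Segfault]:
    """Yield one Segfault per log line that matches the assumed layout."""
    for line in lines:
        match = SEGFAULT_RE.search(line)
        if match:
            yield Segfault(
                match["process"],
                int(match["pid"]),
                match["address"],
                match["module"] or "",
            )


# Illustrative line only: timestamp, host, PID, and addresses are placeholders.
SAMPLE = (
    "Jan  1 00:00:00 host kernel: postgres[12345]: segfault at c "
    "ip 00007f0000000000 sp 00007ffd00000000 error 4 "
    "in bingo_postgres.so[7f0000000000+100000]"
)

if __name__ == "__main__":
    for hit in find_segfaults([SAMPLE]):
        print(hit)
```

In practice the function would be fed the lines of /var/log/messages, and the hits filtered on module == "bingo_postgres.so".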
5. Analysis of the Problem:
The problem likely originates from a bug in the Bingo cartridge (version 1.9.0) during index creation, possibly due to improper memory management or handling of large MOL files (up to 41,000 characters). Key observations:
- The segfault in bingo_postgres.so suggests a memory access violation, potentially a buffer overflow or uninitialized pointer.
- The "corrupted double-linked list" indicates heap corruption, which could result from invalid memory writes during MOL file processing or indexing.
- The warning "stereogroup number 8 out of range" hints at a parsing error in stereochemical data, which may trigger the crash.
- Involved modules: Bingo PostgreSQL extension (bingo_postgres.so), PostgreSQL backend.
- Confirmation from a business analyst is needed to validate MOL file content and stereochemistry.
- The issue’s intermittency (1/5 to 1/8) may be tied to memory allocation patterns or specific MOL file characteristics.
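One way to test the two data-related hypotheses above is to pre-screen the MOL files outside the database and see whether the flagged records correlate with crashes. The sketch below makes several assumptions that the report does not confirm: that the files use V3000 enhanced-stereo collections (MDLV30/STERACn, MDLV30/STERELn), that the warning refers to those collection numbers, and that both thresholds are chosen by the caller. The function name screen_molfile is hypothetical.

```python
import re
from typing import List

# V3000 enhanced-stereo collections are named MDLV30/STERACn or MDLV30/STERELn.
STEREO_GROUP_RE = re.compile(r"MDLV30/STE(?:RAC|REL)(\d+)")


def screen_molfile(molfile: str, *, max_chars: int, max_group: int) -> List[str]:
    """Return human-readable reasons why a MOL file looks suspicious.

    Both thresholds are caller-supplied guesses, not limits documented
    for Bingo 1.9.0.
    """
    reasons = []
    if len(molfile) > max_chars:
        reasons.append(f"length {len(molfile)} exceeds {max_chars}")
    for match in STEREO_GROUP_RE.finditer(molfile):
        number = int(match.group(1))
        if number > max_group:
            reasons.append(f"stereo group {number} exceeds {max_group}")
    return reasons


if __name__ == "__main__":
    # Minimal illustrative fragment, not a complete or valid MOL file.
    fragment = (
        "M  V30 BEGIN COLLECTION\n"
        "M  V30 MDLV30/STERAC8 ATOMS=(2 1 3)\n"
        "M  V30 END COLLECTION\n"
    )
    print(screen_molfile(fragment, max_chars=40000, max_group=7))
```

If none of the flagged records coincide with failing rebuilds, the stereogroup warning is more likely a side observation than the trigger.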
6. Suggested Solutions:
- High-level solution: Implement input validation and limit checks for MOL files (e.g., size and stereochemistry) before indexing to prevent memory corruption.
- Technical solution: Update to a newer Bingo version (e.g., 1.10 or later) and apply a patch to handle stereogroup out-of-range errors. Example workaround:

    CREATE OR REPLACE FUNCTION safe_upsertmolecule(varchar, varchar) RETURNS void AS $$
    BEGIN
        -- Check the size of the first argument (the MOL file text) before indexing
        IF LENGTH($1) > 40000 THEN
            RAISE NOTICE 'MOL file too large, skipping index creation';
            RETURN;
        END IF;
        EXECUTE 'CREATE INDEX idx_bingo_bingo_config ON source.parent_structure USING bingo_idx (mol_file COLLATE pg_catalog."default" bingo.molecule)';
    -- Note: this handler traps SQL errors only; it cannot intercept a backend segfault
    EXCEPTION WHEN OTHERS THEN
        RAISE NOTICE 'Index creation failed: %', SQLERRM;
    END;
    $$ LANGUAGE plpgsql;

  Replace source.uspupsertmolecule with this function.
- Documentation improvement: Update Bingo documentation to highlight known issues with large MOL files or stereochemistry in version 1.9.0.