GenomicsDBImport datastore format folder permissions | cause for ERROR: Couldn't create GenomicsDBFeatureReader

Open vidprijatelj opened this issue 1 year ago • 17 comments

Bug Report

Affected tool(s) or class(es)

GenomicsDBImport / GenotypeGVCFs

Affected version(s)

4.3.0.0

Description

When creating a GenomicsDB datastore, the created folder has permissions set to 700 (recursively). As a result, when trying to jointly call genotypes with GenotypeGVCFs, one encounters the error: ERROR: Couldn't create GenomicsDBFeatureReader

Steps to reproduce

  • Create a datastore using GenomicsDBImport, e.g. gatk ... --genomicsdb-workspace-path IWANNAKILLYOU

  • Recursively change the access permissions of the newly created GenomicsDB workspace: chmod 700 -R ./IWANNAKILLYOU

  • Run GenotypeGVCFs: gatk ... --variant gendb://IWANNAKILLYOU

Expected behavior

GenotypeGVCFs should initialize the engine normally and start processing the intervals as expected

Actual behavior

GenotypeGVCFs initializes the engine and throws an error: ERROR: Couldn't create GenomicsDBFeatureReader

Proposed solution

Mention somewhere in the docs that the GenomicsDB datastore should be made readable to other users, i.e., change permissions to at least 744, if not 766. Or just make sure ./IWANNAKILLYOU has proper permissions from the get-go.
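
If you do decide to widen access as proposed here, note that a directory also needs the execute (search) bit before other users can traverse it, which neither 744 nor 766 grants to group or others. A minimal sketch of one way to do this, assuming the goal is read-only access for group and others; `open_read_access` and the demo paths are hypothetical names, not GATK commands, and nothing in this thread confirms that permissions are the root cause:

```shell
# Sketch: make a workspace readable and traversable by group and others
# without making plain data files executable.
# `open_read_access` is a hypothetical helper name, not part of GATK.
open_read_access() {
    # r = read for group/others; X = execute/search only on directories
    # (and on files that are already executable for someone).
    chmod -R go+rX "$1"
}

# Demo on a throwaway directory that mimics an owner-only workspace.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/workspace/chr1"
touch "$demo_dir/workspace/chr1/data.bin"
chmod 700 "$demo_dir/workspace" "$demo_dir/workspace/chr1"
chmod 600 "$demo_dir/workspace/chr1/data.bin"

open_read_access "$demo_dir/workspace"
ls -ld "$demo_dir/workspace/chr1" "$demo_dir/workspace/chr1/data.bin"
```

Using the symbolic `go+rX` form rather than a numeric mode leaves the owner's bits and any special bits on the directories untouched.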

Much obliged

vidprijatelj avatar Mar 02 '23 16:03 vidprijatelj

@vidprijatelj Thanks for the report! Can you check the UMASK value in your shell? You can do this by simply typing the command umask. If it's set to something like 0077, that could explain what you're seeing.

GATK does not, in general, require permissions for users other than the owner of the file/directory, so it's a bit surprising that this is causing issues for you. Could you paste the full stacktrace for the exception you're getting? You may need to set GATK_STACKTRACE_ON_USER_EXCEPTION=true in your environment in order to get GATK to print the stack trace.
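
To see how the umask value would shape the permissions of a freshly created directory, here is a small sketch. It assumes a typical Linux shell; `mode_under_umask` is a hypothetical helper for illustration, and it only creates and removes a scratch directory:

```shell
# Print the mode string of a directory created under the given umask.
# `mode_under_umask` is a hypothetical helper for illustration only.
mode_under_umask() {
    (
        umask "$1"                  # applies only inside this subshell
        scratch="$(mktemp -d)"
        mkdir "$scratch/workspace"
        ls -ld "$scratch/workspace" | cut -c1-10
        rm -rf "$scratch"
    )
}

echo "current umask: $(umask)"
echo "umask 0022 -> $(mode_under_umask 0022)"
echo "umask 0077 -> $(mode_under_umask 0077)"
```

With 0022 a new directory normally comes out as drwxr-xr-x, while 0077 yields drwx------, which is the owner-only pattern described in the report.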

droazen avatar Mar 13 '23 19:03 droazen

@droazen Thanks for the reply! Certainly. umask returns 0022. As such I reckon that is not the issue.

Stack trace at the bottom.

The permissions of the datastore folder are as follows: drwx--S---+ 26 vidprijatelj group 4096 Mar 14 15:29 Vid_database

When changing to 766, the error disappears.

Tue Mar 14 15:37:57 CET 2023
Using GATK jar /appl/tools/versions/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Djava.io.tmpdir=zzz_tmpdir -Xmx128G -DGATK_STACKTRACE_ON_USER_EXCEPTION=true -jar /appl/tools/versions/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar GenotypeGVCFs --reference /data/Scratch/References/ucsc.hg38.fa --variant gendb://Vid_database --output Step05_MultiSampleCalling/Vid.vcf.gz --intervals /data/Scratch/References/hg38_exome_v2.0.2_merged_probes_sorted_validated.annotated.bed --genomicsdb-shared-posixfs-optimizations True --merge-input-intervals
15:37:59.895 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/appl/tools/versions/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
15:38:00.018 INFO  GenotypeGVCFs - ------------------------------------------------------------
15:38:00.018 INFO  GenotypeGVCFs - The Genome Analysis Toolkit (GATK) v4.3.0.0
15:38:00.018 INFO  GenotypeGVCFs - For support and documentation go to https://software.broadinstitute.org/gatk/
15:38:00.018 INFO  GenotypeGVCFs - Executing as user@server
15:38:00.018 INFO  GenotypeGVCFs - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_362-b08
15:38:00.019 INFO  GenotypeGVCFs - Start Date/Time: March 14, 2023 3:37:59 PM CET
15:38:00.019 INFO  GenotypeGVCFs - ------------------------------------------------------------
15:38:00.019 INFO  GenotypeGVCFs - ------------------------------------------------------------
15:38:00.019 INFO  GenotypeGVCFs - HTSJDK Version: 3.0.1
15:38:00.019 INFO  GenotypeGVCFs - Picard Version: 2.27.5
15:38:00.019 INFO  GenotypeGVCFs - Built for Spark Version: 2.4.5
15:38:00.019 INFO  GenotypeGVCFs - HTSJDK Defaults.COMPRESSION_LEVEL : 2
15:38:00.019 INFO  GenotypeGVCFs - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
15:38:00.020 INFO  GenotypeGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
15:38:00.020 INFO  GenotypeGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
15:38:00.020 INFO  GenotypeGVCFs - Deflater: IntelDeflater
15:38:00.020 INFO  GenotypeGVCFs - Inflater: IntelInflater
15:38:00.020 INFO  GenotypeGVCFs - GCS max retries/reopens: 20
15:38:00.020 INFO  GenotypeGVCFs - Requester pays: disabled
15:38:00.020 INFO  GenotypeGVCFs - Initializing engine
15:38:00.590 INFO  GenomicsDBLibLoader - GenomicsDB native library version : 1.4.3-6069e4a
15:38:00.652 INFO  GenotypeGVCFs - Shutting down engine
[March 14, 2023 3:38:00 PM CET] org.broadinstitute.hellbender.tools.walkers.GenotypeGVCFs done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2326265856
***********************************************************************

A USER ERROR has occurred: Couldn't create GenomicsDBFeatureReader

***********************************************************************
org.broadinstitute.hellbender.exceptions.UserException: Couldn't create GenomicsDBFeatureReader
        at org.broadinstitute.hellbender.engine.FeatureDataSource.getGenomicsDBFeatureReader(FeatureDataSource.java:463)
        at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:365)
        at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:319)
        at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:291)
        at org.broadinstitute.hellbender.engine.VariantLocusWalker.initializeDrivingVariants(VariantLocusWalker.java:76)
        at org.broadinstitute.hellbender.engine.VariantWalkerBase.initializeFeatures(VariantWalkerBase.java:67)
        at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:726)
        at org.broadinstitute.hellbender.engine.VariantLocusWalker.onStartup(VariantLocusWalker.java:63)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
        at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: java.io.IOException: GenomicsDB JNI Error: vector::_M_default_append
        at org.genomicsdb.reader.GenomicsDBQueryStream.jniGenomicsDBInit(Native Method)
        at org.genomicsdb.reader.GenomicsDBQueryStream.<init>(GenomicsDBQueryStream.java:209)
        at org.genomicsdb.reader.GenomicsDBQueryStream.<init>(GenomicsDBQueryStream.java:182)
        at org.genomicsdb.reader.GenomicsDBQueryStream.<init>(GenomicsDBQueryStream.java:91)
        at org.genomicsdb.reader.GenomicsDBFeatureReader.generateHeadersForQuery(GenomicsDBFeatureReader.java:200)
        at org.genomicsdb.reader.GenomicsDBFeatureReader.<init>(GenomicsDBFeatureReader.java:85)
        at org.broadinstitute.hellbender.engine.FeatureDataSource.getGenomicsDBFeatureReader(FeatureDataSource.java:460)
        ... 13 more

vidprijatelj avatar Mar 14 '23 14:03 vidprijatelj

@nalinigans / @mlathara , any insight into this GenomicsDB JNI Error: vector::_M_default_append error that apparently is related to permissions on the GenomicsDB directory?

droazen avatar Mar 14 '23 14:03 droazen

With --genomicsdb-shared-posixfs-optimizations, the storage system should only require read access. @droazen, will work towards a fix for this.

nalinigans avatar Mar 14 '23 17:03 nalinigans

@vidprijatelj, I can't reproduce the issue on macOS and CentOS 7. Can you provide us with more information about the system you are on? What is the OS? Are there any access control lists set up?

nalinigans avatar Mar 15 '23 16:03 nalinigans

@nalinigans

me@server:~$ cat  /etc/os-release
NAME="CentOS Stream"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Stream 8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"
me@server:/data/Scratch/Exo-Seq/221108_PracticeVid$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              128
On-line CPU(s) list: 0-127
Thread(s) per core:  2
Core(s) per socket:  32
Socket(s):           2
NUMA node(s):        8
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               1
Model name:          AMD EPYC 7601 32-Core Processor
SNIPPED BELOW TLDR

The project dir

me@server:/data/Scratch/Exo-Seq/221108_PracticeVid$ getfacl .
# file: .
# owner: me
# group: groupours
# flags: -s-
user::rwx
group::rwx
other::rwx

GenomicsDB dir // do note I changed permissions as outlined up top.

me@server:/data/Scratch/Exo-Seq/221108_PracticeVid$ getfacl -d ./Vid_database/
# file: Vid_database/
# owner: me
# group: ourgroup
# flags: -s-

Hopefully this helps.

vidprijatelj avatar Mar 16 '23 15:03 vidprijatelj

Thanks @vidprijatelj. I see the setgid bit set on the workspace (# flags: -s-). That by itself seems to be OK; that is, I am not able to reproduce the issue with it. But it looks like std::vector is not able to resize: Caused by: java.io.IOException: GenomicsDB JNI Error: vector::_M_default_append. What are the permissions on your tmp directory? Does it also have the setgid bit set? Even if the workspace only requires read permissions, GenomicsDB, and probably the underlying standard C++ runtime, may require write access to tmp, and the special bit may be affecting the execution.

Also, can you please confirm that the user creating the workspace and the user reading from the workspace are the same?
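
For reference, the `# flags:` line in getfacl output lists setuid, setgid and sticky in that order, so `-s-` denotes the setgid bit; the capital S in a mode string such as drwx--S--- is the same bit without group execute. A minimal sketch for checking which special bits a directory carries, where `special_bits` is a hypothetical helper name and the demo runs on a scratch directory only:

```shell
# Report which special permission bits are set on a path.
# `special_bits` is a hypothetical helper name used only for this sketch.
special_bits() {
    bits=""
    [ -u "$1" ] && bits="$bits setuid"
    [ -g "$1" ] && bits="$bits setgid"
    [ -k "$1" ] && bits="$bits sticky"
    # Trim the leading space; print "none" when no special bit is set.
    bits="${bits# }"
    echo "${bits:-none}"
}

# Demo: a scratch directory with setgid and owner-only access.
demo_dir="$(mktemp -d)"
chmod 2700 "$demo_dir"
special_bits "$demo_dir"
ls -ld "$demo_dir" | cut -c1-10
```

Running the same helper against the workspace and the tmp directory shows at a glance whether they differ in special bits.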

nalinigans avatar Mar 20 '23 16:03 nalinigans

@nalinigans Hi, apologies for the late reply. The temp dir has the same flag (# flags: -s-, i.e. setgid) as well. Its permissions are broader than the workspace's. Below is the default output, without me playing around or changing anything.

me@server:/data/Scratch/Exo-Seq/221108_PracticeVid$ getfacl ./zzz_tmpdir/
# file: zzz_tmpdir/
# owner: me
# group: groupours
# flags: -s-
user::rwx
group::rwx
other::rwx

The user creating the workspace and the user reading from it are identical.

vidprijatelj avatar Mar 30 '23 14:03 vidprijatelj

Hi all, is this problem solved yet? I have the same error: "A USER ERROR has occurred: Couldn't create GenomicsDBFeatureReader".

CHENG-KH avatar Oct 22 '23 04:10 CHENG-KH

@CHENG-KH, are you also seeing the GenomicsDBImport datastore folder permissions issue? Can you follow https://github.com/broadinstitute/gatk/issues/8233#issuecomment-1466807447 and attach the stack trace please?

nalinigans avatar Oct 22 '23 18:10 nalinigans

@nalinigans Hi, apologies for the late reply.

A USER ERROR has occurred: Couldn't create GenomicsDBFeatureReader
org.broadinstitute.hellbender.exceptions.UserException: Couldn't create GenomicsDBFeatureReader
        at org.broadinstitute.hellbender.engine.FeatureDataSource.getGenomicsDBFeatureReader(FeatureDataSource.java:463)
        at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:365)
        at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:319)
        at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:291)
        at org.broadinstitute.hellbender.engine.VariantLocusWalker.initializeDrivingVariants(VariantLocusWalker.java:76)
        at org.broadinstitute.hellbender.engine.VariantWalkerBase.initializeFeatures(VariantWalkerBase.java:67)
        at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:726)
        at org.broadinstitute.hellbender.engine.VariantLocusWalker.onStartup(VariantLocusWalker.java:63)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
        at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: java.io.IOException: GenomicsDB JNI Error: Broad combine GVCFs exception : No sample/CallSet name specified in JSON file/Protobuf object for TileDB row 1381
        at org.genomicsdb.reader.GenomicsDBQueryStream.jniGenomicsDBInit(Native Method)
        at org.genomicsdb.reader.GenomicsDBQueryStream.<init>(GenomicsDBQueryStream.java:209)
        at org.genomicsdb.reader.GenomicsDBQueryStream.<init>(GenomicsDBQueryStream.java:182)
        at org.genomicsdb.reader.GenomicsDBQueryStream.<init>(GenomicsDBQueryStream.java:91)
        at org.genomicsdb.reader.GenomicsDBFeatureReader.generateHeadersForQuery(GenomicsDBFeatureReader.java:200)
        at org.genomicsdb.reader.GenomicsDBFeatureReader.<init>(GenomicsDBFeatureReader.java:85)
        at org.broadinstitute.hellbender.engine.FeatureDataSource.getGenomicsDBFeatureReader(FeatureDataSource.java:460)

CHENG-KH avatar Oct 27 '23 07:10 CHENG-KH

Any update on this? I just ran into a form of this problem in the context of some pipeline unit tests. I have a task that runs the following:

gatk GenomicsDBImport \
        --sample-name-map ${sample_map} \
        --genomicsdb-workspace-path ${cohort_name}_gdb \
        --genomicsdb-shared-posixfs-optimizations \
        -L ${interval_list}

gatk GenotypeGVCFs \
        -R ${ref_fasta} \
        -V gendb://${cohort_name}_gdb \
        -O ${cohort_name}.joint.vcf \
        -L ${interval_list} 

Which runs fine, but if I re-run the test suite the system complains it can't delete the gdb workspace. I have to manually sudo rm which is gross. I can work around this by adding either chmod 777 -R ${cohort_name}_gdb or rm -r ${cohort_name}_gdb as a cleanup step, but that seems gross too.
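
Before reaching for sudo or chmod 777, it can help to check who actually owns the files in the workspace. The sketch below assumes (this is an assumption, not something established above) that the deletion fails because some entries belong to a different user; `foreign_owned`, `safe_cleanup` and the demo paths are hypothetical names:

```shell
# List entries under a directory that are NOT owned by the current user.
# `foreign_owned` is a hypothetical helper; it only reads metadata.
foreign_owned() {
    find "$1" ! -user "$(id -u)" -print
}

# Remove a workspace only when everything in it belongs to the current
# user; otherwise report the offending paths and return non-zero.
safe_cleanup() {
    offenders="$(foreign_owned "$1")"
    if [ -n "$offenders" ]; then
        echo "entries not owned by uid $(id -u):" >&2
        echo "$offenders" >&2
        return 1
    fi
    rm -rf -- "$1"
}

# Demo on a throwaway directory standing in for a GenomicsDB workspace.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/cohort_gdb/chr1"
touch "$demo_dir/cohort_gdb/chr1/data.bin"
safe_cleanup "$demo_dir/cohort_gdb" && echo "workspace removed"
```

If the helper reports foreign-owned entries, the problem lies with how the files were created rather than with the cleanup step itself.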

My use case is just a toy example for training purposes, but I worry about what this could mean for a production environment.

Am I missing something?

vdauwera avatar Apr 25 '24 20:04 vdauwera

Is the task using docker as execution environment? If so how is the user and group set for that?

gokalpcelik avatar Apr 25 '24 20:04 gokalpcelik

Yes it is. Honestly not sure on the u/g config, as an end user I'd really rather not have to care about that 😅 This is the only tool causing this kind of issue so it's got to be the tool itself, no?

vdauwera avatar Apr 25 '24 20:04 vdauwera

I believe I've had this issue before, but with different tools as well. If you are on Nextflow, below is a config option in the docker scope:

docker.fixOwnership
Fix ownership of files created by the docker container.

There is also another option in the same scope that could be set if there is only a single user:

docker.runOptions
This attribute can be used to provide any extra command line options supported by the docker run command. See the [Docker documentation](https://docs.docker.com/engine/reference/run/) for details.

This one lets you pass the -u parameter to docker run directly.

If neither of them is set in your Nextflow config, I would suggest trying these options first. If that doesn't help, we can escalate this with the team.
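
As a rough sketch of what the first option could look like in practice, the snippet below writes a small Nextflow config fragment to a scratch file. The helper name `write_docker_config` and the file location are hypothetical; the commented-out runOptions value is one common way to pass -u, shown as an assumption rather than a recommendation from this thread:

```shell
# Write a minimal Nextflow config fragment that turns on ownership fixing.
# `write_docker_config` is a hypothetical helper used only to keep this
# sketch self-contained; in a real pipeline you would edit nextflow.config.
write_docker_config() {
    cat > "$1" <<'EOF'
docker {
    enabled = true
    // Fix ownership of files created by the docker container.
    fixOwnership = true
    // Alternative for single-user setups: pass -u to `docker run`.
    // runOptions = '-u $(id -u):$(id -g)'
}
EOF
}

config_file="$(mktemp)"
write_docker_config "$config_file"
cat "$config_file"
```

Only one of the two settings should be needed; which one fits depends on how the pipeline and its users are set up.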

gokalpcelik avatar Apr 25 '24 20:04 gokalpcelik

Oh interesting, thank you. Yes this is a nextflow pipeline. Thanks for the tip! Will report back.

vdauwera avatar Apr 25 '24 20:04 vdauwera

That worked! TIL. Thank you very much!!

vdauwera avatar Apr 25 '24 21:04 vdauwera