
Merging vcf files error with glnexus:v1.2.7

Open poddarharsh15 opened this issue 9 months ago • 5 comments

Have you checked the FAQ?

Describe the issue: Merging vcf files error.

Setup

  • Operating system: working on cluster
  • DeepVariant version: latest
  • Installation method (Docker):
  • Type of data: (GIAB AshkenazimTrio [HG002,HG003,HG004] analysis.)

Steps to reproduce:

  • Command:
udocker run \
-v "${PWD}/output":"/output" \
quay.io/mlin/glnexus:v1.2.7 \
/usr/local/bin/glnexus_cli \
--config DeepVariant_unfiltered \
/output/HG002.g.vcf.gz \
/output/HG003.g.vcf.gz \
/output/HG004.g.vcf.gz \
| udocker run -i google/deepvariant:deeptrio-"${BIN_VERSION}" \
  bcftools view - \
| udocker run -i google/deepvariant:deeptrio-"${BIN_VERSION}" \
  bgzip -c > output/HG002_trio_merged.vcf.gz
  • Error trace: (if applicable)

Num BCF records read 118736378 query hits 14552613
[E::bgzf_read_block] Invalid BGZF header at offset 265038798
[E::bgzf_read] Read block operation failed with error 2 after 0 of 32 bytes
[E::bgzf_read] Read block operation failed with error 3 after 0 of 32 bytes
Error: BCF read err

Screenshot from 2024-05-06 15-00-29

poddarharsh15 avatar May 06 '24 13:05 poddarharsh15

This looks like a GLnexus question. Could you please post it on the GLnexus page? Also, from the log output it looks like GLnexus completed successfully.

akolesnikov avatar May 06 '24 16:05 akolesnikov

Hi @poddarharsh15

Actually, can you go back in your log and confirm that DeepTrio runs actually finish correctly?

If I remember correctly, our run_deeptrio one-step script might continue to run the following steps even when previous steps failed.

pichuan avatar May 06 '24 17:05 pichuan

And, to follow up on @akolesnikov's point: if you have gotten to this point, it would seem that these files should be complete?

/output/HG002.g.vcf.gz \
/output/HG003.g.vcf.gz \
/output/HG004.g.vcf.gz \

If you can examine those files and confirm, that would be great. (Or look at the log, as I mentioned before. But given that you have the files, checking them directly might be easier :))
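One quick way to examine the files is sketched below. It assumes the gVCFs are BGZF-compressed and checks only that each one ends with the 28-byte BGZF end-of-file block described in the SAM/BAM specification; a missing marker usually means the file was truncated. The helper name `has_bgzf_eof` is made up for this example, and passing this check does not prove the rest of the file is intact.

```shell
# Expected 28-byte BGZF end-of-file block, written as hex
# (value taken from the SAM/BAM specification).
BGZF_EOF_HEX="1f8b08040000000000ff0600424302001b0003000000000000000000"

# Hypothetical helper: succeeds if the file ends with the BGZF EOF block.
has_bgzf_eof() {
  # Dump the last 28 bytes as one continuous hex string.
  tail_hex=$(tail -c 28 "$1" | od -An -v -tx1 | tr -d ' \n')
  [ "$tail_hex" = "$BGZF_EOF_HEX" ]
}

# Paths match the host-side output directory used in the command above.
for f in output/HG002.g.vcf.gz output/HG003.g.vcf.gz output/HG004.g.vcf.gz; do
  if [ ! -f "$f" ]; then
    echo "$f: not found"
  elif has_bgzf_eof "$f"; then
    echo "$f: BGZF EOF marker present"
  else
    echo "$f: BGZF EOF marker missing (file may be truncated)"
  fi
done
```

If a file reports a missing marker, re-running the step that produced it is probably the first thing to try.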

pichuan avatar May 06 '24 18:05 pichuan

Hi @pichuan, @akolesnikov,

I'm new to DeepTrio and couldn't locate the log files, but I have intermediate results showing that DeepTrio ran successfully without errors. Additionally, I successfully benchmarked the .vcf files generated by DeepTrio. I've attached screenshots for reference. Your assistance is greatly appreciated. Thank you

finished log Screenshot from 2024-05-07 09-52-02 Screenshot from 2024-05-07 09-52-32

Benchmark Screenshot from 2024-05-07 09-52-59

poddarharsh15 avatar May 07 '24 07:05 poddarharsh15

Hi @poddarharsh15, it seems you're certain that the DeepTrio run finished correctly. In that case, I agree with @akolesnikov's original assessment that this is likely an issue with the downstream GLnexus step, which we can't directly support.

One suggestion: if you need to check your run more closely, try breaking it down and running just this part first:

udocker run \
-v "${PWD}/output":"/output" \
quay.io/mlin/glnexus:v1.2.7 \
/usr/local/bin/glnexus_cli \
--config DeepVariant_unfiltered \
/output/HG002.g.vcf.gz \
/output/HG003.g.vcf.gz \
/output/HG004.g.vcf.gz

before piping to the next step. That might help you identify which errors are coming from that step.
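A minimal sketch of that idea follows. It assumes glnexus_cli writes its merged BCF to standard output (which the original pipeline relies on) and that the exit status from `udocker run` reflects the container command's exit status. The function name `run_glnexus_only` and the output file name are made up for this example.

```shell
# Hypothetical wrapper: run only the GLnexus step and save its output
# to a file instead of piping it into bcftools and bgzip.
run_glnexus_only() {
  out_bcf="$1"
  udocker run \
    -v "${PWD}/output":"/output" \
    quay.io/mlin/glnexus:v1.2.7 \
    /usr/local/bin/glnexus_cli \
    --config DeepVariant_unfiltered \
    /output/HG002.g.vcf.gz \
    /output/HG003.g.vcf.gz \
    /output/HG004.g.vcf.gz \
    > "$out_bcf"
  status=$?
  if [ "$status" -ne 0 ]; then
    echo "GLnexus step exited with status $status" >&2
    return "$status"
  fi
  # An empty file suggests nothing reached standard output.
  if [ ! -s "$out_bcf" ]; then
    echo "GLnexus step produced an empty file: $out_bcf" >&2
    return 1
  fi
  echo "GLnexus output saved to $out_bcf"
}
```

Calling `run_glnexus_only output/HG002_trio_merged.bcf` keeps the GLnexus log messages and exit status separate from the later steps, so a failure can be attributed to one stage. If this step succeeds, the saved file can then be passed to the bcftools and bgzip steps on its own.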

pichuan avatar May 07 '24 17:05 pichuan

Closing the issue. Feel free to reopen as needed.

akolesnikov avatar May 15 '24 19:05 akolesnikov