panic: parsing time during smoove call

Open redincla opened this issue 2 years ago • 6 comments

Hi,

I'm hitting a time-parsing panic when running smoove call on individual WGS CRAM samples (the same happens with the converted BAMs). I'm running it on a population cohort:

singularity run /dcsrsoft/singularity/containers/smoove-0.2.7.sif smoove call   --outdir ./ --exclude /data/exclude.cnvnator_100bp.GRCh38.20170403.bed   --name GVA_VZV_01 --fasta /data/Homo_sapiens_assembly38.fasta   -p 1 --genotype /users/GVA_VZV_01.bam
WARNING: group: unknown groupid 124638
[smoove] 2022/02/21 11:26:26 starting with version 0.2.7
panic: parsing time "201123-01-01T010000+0100" as "2006-01-02T150405": cannot parse "23-01-01T010000+0100" as "-": line 3368: "@RG\tID:GVA_VZV_01.HLCLYDSXY.1\tSM:GVA_VZV_01\tLB:NGS000002043\tPL:illumina\tPU:HLCLYDSXY.1.CCTTCACC+GGAGCGTC\tCN:H2030GC\tDT:201123-01-01T01:00:00+0100"

goroutine 1 [running]:
github.com/brentp/smoove/lumpy.check(...)
	/home/brentp/go/go/src/github.com/brentp/smoove/lumpy/lumpy.go:54
github.com/brentp/smoove/lumpy.lumpy_filter_cmd(0x7ffd8dcf4319, 0xad, 0x7ffd8dcf4202, 0x2, 0x7ffd8dcf42a6, 0x62, 0x0, 0x0, 0x0, 0x0, ...)
	/home/brentp/go/go/src/github.com/brentp/smoove/lumpy/lumpy.go:83 +0x105b
github.com/brentp/smoove/lumpy.Lumpy(0x7ffd8dcf4293, 0xa, 0x7ffd8dcf42a6, 0x62, 0x7ffd8dcf4202, 0x2, 0xc0001447a0, 0x1, 0x1, 0xc000178140, ...)
	/home/brentp/go/go/src/github.com/brentp/smoove/lumpy/lumpy.go:119 +0x12b
github.com/brentp/smoove/lumpy.Main()
	/home/brentp/go/go/src/github.com/brentp/smoove/lumpy/lumpy.go:335 +0x261
main.main()
	/home/brentp/go/go/src/github.com/brentp/smoove/cmd/smoove/smoove.go:121 +0x1ce

Any idea how to fix this? It does create two files, GVA_VZV_01.disc.bam.tmp.bam and GVA_VZV_01.split.bam.tmp.bam, but I guess these are truncated.

Thanks a lot for your help,

Claire

redincla avatar Feb 21 '22 10:02 redincla

what aligner generated those bam files?

brentp avatar Feb 21 '22 10:02 brentp

Hi again,

I used bwa-mem. So this could be an error from the alignment step?

redincla avatar Feb 21 '22 10:02 redincla

Hi, see the linked issue above. It's an invalid date that can't be parsed by biogo/hts (and therefore by smoove). Maybe you can check the date command on the host machine where bwa mem was run?

brentp avatar Feb 21 '22 11:02 brentp
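For readers who hit the same panic, here is a minimal Python sketch of one way to locate and neutralize unparseable DT values in the @RG header lines. It is not the fix used in this thread (the reporter rewrote the DT values instead); it simply drops any DT tag that fails to parse, which is permitted because DT is an optional @RG tag in the SAM specification. The function names and the ASSUMED_DT_FORMATS tuple are illustrative assumptions; the set of layouts biogo/hts actually accepts is not listed in this thread.

```python
from datetime import datetime

# Assumption: these layouts stand in for "acceptable" dates. The formats that
# biogo/hts really accepts are not given in the thread; adjust as needed.
ASSUMED_DT_FORMATS = ("%Y-%m-%dT%H:%M:%S%z", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%d")


def dt_is_valid(value):
    """Return True if value parses with any of the assumed layouts."""
    for fmt in ASSUMED_DT_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def drop_bad_dt(header_text):
    """Return SAM header text with unparseable DT tags removed from @RG lines."""
    fixed = []
    for line in header_text.splitlines():
        if line.startswith("@RG"):
            fields = line.split("\t")
            # Keep every field except DT tags whose value does not parse.
            fields = [
                f for f in fields
                if not (f.startswith("DT:") and not dt_is_valid(f[3:]))
            ]
            line = "\t".join(fields)
        fixed.append(line)
    return "\n".join(fixed) + "\n"
```

One possible workflow, assuming samtools is available: dump the header with samtools view -H, pass the text through drop_bad_dt, and write it back with samtools reheader. The value from the panic above, 201123-01-01T01:00:00+0100, is rejected by this check because the year field has six digits.
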

Hi Brent,

Thanks a lot for the comment. I added an extra step that rewrites all DT fields in the BAM header into formats that biogo/hts accepts. smoove now gets to the next step and generates the *disc.bam/bam.cso/bam.orig.bam and *split.bam/split.bam.orig.bam outputs, but then I get the following:

singularity run /dcsrsoft/singularity/containers/smoove-0.2.7.sif smoove call \
  --outdir ./ --exclude /data/exclude.cnvnator_100bp.GRCh38.20170403.bed \
  --name GVA_VZV_01 --fasta /data/Homo_sapiens_assembly38.fasta \
  -p 1 --genotype GVA_VZV_01.bam
WARNING: group: unknown groupid 124638
[smoove] 2022/02/22 08:47:09 starting with version 0.2.7
[smoove] 2022/02/22 08:47:09 calculating bam stats for 1 bams
[smoove]: ([E]lumpy-filter) 2022/02/22 08:47:09 /bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
[smoove] 2022/02/22 08:47:50 done calculating bam stats
[smoove]:([E]lumpy-filter) 2022/02/22 09:26:06 [lumpy_filter] extracted splits and discordants from 648546193 total aligned reads
[smoove]:2022/02/22 09:26:10 finished process: lumpy-filter (set -eu; lumpy_filter -f /scratch) in user-time:15m44.996517s system-time:4m50.279452s
bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
2022/02/22 09:26:52 couldn't get region from line

The last error message, couldn't get region from line, seems to stop the run before the genotyping step, as I have no *genotyped.vcf.gz output. Any idea how I could fix this?

Thanks a ton, and sorry for bothering again,

Claire

redincla avatar Feb 23 '22 11:02 redincla

Hi, I don't know why this is happening, but you can get around it by using the --noextrafilters flag to smoove call.

For some reason, it's getting empty lines in some BED output. If you can share a quantized.bed.gz file from a failed run, that might help to understand what's happening.

brentp avatar Feb 23 '22 11:02 brentp

Thanks a ton! I figured out that the exclusion BED file I used was corrupted (in case that's helpful for others). Now everything runs smoothly. Have a great day,

Claire

redincla avatar Feb 23 '22 14:02 redincla
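
Since the root cause here turned out to be a corrupted exclusion BED file, a quick sanity check of such a file before running smoove call may save time. The Python sketch below is only an illustration: find_bad_bed_lines is a hypothetical helper, and the checks it performs (empty lines, fewer than three tab-separated columns, non-integer or reversed coordinates) are assumptions about what commonly breaks BED parsing, not a description of what smoove itself validates.

```python
def find_bad_bed_lines(lines):
    """Yield (line_number, reason) for lines that do not look like BED records."""
    for n, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        # Comment and header lines are allowed in BED files; skip them.
        if line.startswith(("#", "track", "browser")):
            continue
        if not line.strip():
            yield n, "empty line"
            continue
        cols = line.split("\t")
        if len(cols) < 3:
            yield n, "fewer than 3 tab-separated columns"
            continue
        try:
            start, end = int(cols[1]), int(cols[2])
        except ValueError:
            yield n, "non-integer start or end"
            continue
        if start < 0 or end < start:
            yield n, "start/end out of order"
```

To use it, pass any iterable of text lines, for example an open file handle (or gzip.open(path, "rt") for a .bed.gz file), and print whatever it yields. An empty result only means the file passed these basic checks.
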