fd:12: hPutBuf: resource vanished (Broken pipe)

Open uloeber opened this issue 2 years ago • 0 comments

Hi Luis, I am currently getting an error that I cannot trace back, and our IT department doesn't know what the issue is either. I am running the job on one of our huge nodes (2 TB), so memory should not be an issue. I get the following error (I only removed some personal information from the paths):


Exiting after fatal error:
An unhandled error occurred (this should not happen)!

        If you can reproduce this issue, please run your script
        with the --trace flag and report a bug (including the script and the trace) at
                https://github.com/ngless-toolkit/ngless/issues

The error message was: `fd:12: hPutBuf: resource vanished (Broken pipe)`)
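
A note on the message itself: `resource vanished (Broken pipe)` is how the Haskell runtime reports the OS error EPIPE, which is raised when a process writes to a pipe whose reading end has already been closed. Which process was on the other end of `fd:12` is not visible from the output here; since the trace stops right after the SAM->BAM conversion starts, one guess (an assumption, not something the trace confirms) is that the program reading the stream exited early. The Python sketch below is not NGLess code; it only reproduces the underlying OS error on a POSIX system, and the function name is made up for illustration.

```python
import errno
import os


def write_after_reader_closed() -> int:
    """Write to a pipe whose read end is closed; return the errno that is raised.

    Hypothetical helper for illustration only (assumes a POSIX system).
    """
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the reader "vanishes" before any data is written
    try:
        # Python ignores SIGPIPE by default, so the failed write surfaces
        # as a BrokenPipeError instead of killing the process.
        os.write(write_fd, b"@HD\tVN:1.6\n")
    except BrokenPipeError as err:
        return err.errno
    finally:
        os.close(write_fd)
    return 0


if __name__ == "__main__":
    code = write_after_reader_closed()
    print(code == errno.EPIPE, os.strerror(code))
```

The point of the sketch is that the error is raised on the writing side, while the actual cause usually lies with whatever happened to the reader.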

Here are the last lines from the trace log:
[main] CMD:/bactopia-20201013/anaconda3/envs/ngless/bin/bwa mem -t 1 -K 100000000 -p -a /NGLESSmodules/Modules/gmgc.ngm/1.0/cached/gmgc:no-rare.fna.splits_250000m.0-bwa-0.7.17.fna -
[main] Real time: 79285.296 sec; CPU: 79095.875 sec

[Wed 22-06-2022 12:52:27] Line 16: Success
[Wed 22-06-2022 12:52:27] Line 16: Mapped readset stats (/NGLESSmodules/Modules/gmgc.ngm/1.0/cached/gmgc:no-rare.fna.splits_250000m.0.fna):
[Wed 22-06-2022 12:52:27] Line 16: Total reads: 21950778
[Wed 22-06-2022 12:52:27] Line 16: Total reads aligned: 13303553 [60.61%]
[Wed 22-06-2022 12:52:27] Line 16: Total reads Unique map: 7767800 [35.39%]
[Wed 22-06-2022 12:52:27] Line 16: Total reads Non-Unique map: 5535753 [25.22%]
[Wed 22-06-2022 12:52:27] Line 17: Running garbage collection.
[Wed 22-06-2022 12:52:27] Line 17: Interpreting [interpretIO]: gmgc_mapped_post = select(Lookup 'gmgc_mapped' as NGLMappedReadSet)using {Block {blockVariable = Variable "mr", blockBody = Sequence [mr = (Lookup 'mr' as NGLMappedRead).MethodName {unwrapMethodName = "filter"}( Nothing; min_match_size=45; min_identity_pc=95; action={drop} ),if [UOpNot((Lookup 'mr' as NGLMappedRead).MethodName {unwrapMethodName = "flag"}( Just {mapped} ))] then {Sequence [discard]} else {Sequence []}]}}
[Wed 22-06-2022 12:52:27] Line 17: Interpreting [assignment]: select(Lookup 'gmgc_mapped' as NGLMappedReadSet)using {Block {blockVariable = Variable "mr", blockBody = Sequence [mr = (Lookup 'mr' as NGLMappedRead).MethodName {unwrapMethodName = "filter"}( Nothing; min_match_size=45; min_identity_pc=95; action={drop} ),if [UOpNot((Lookup 'mr' as NGLMappedRead).MethodName {unwrapMethodName = "flag"}( Just {mapped} ))] then {Sequence [discard]} else {Sequence []}]}}
[Wed 22-06-2022 12:52:27] Line 17: Executing blocked select on file /temp/mapped_gmgc:no-rare.sam6057-0.zstd
[Wed 22-06-2022 12:52:27] Line 17: Created & opened temporary file //temp/block_selected_mapped_gmgc:no-rare.sam6057-1.zstd
[Wed 22-06-2022 13:00:19] Line 21: Running garbage collection.
[Wed 22-06-2022 13:00:19] Line 21: Interpreting [interpretIO]: temp$4 = mapstats(Lookup 'gmgc_mapped_post' as NGLMappedReadSet)
[Wed 22-06-2022 13:00:19] Line 21: Interpreting [assignment]: mapstats(Lookup 'gmgc_mapped_post' as NGLMappedReadSet)
[Wed 22-06-2022 13:00:19] Line 21: Computing mapstats on File /temp/block_selected_mapped_gmgc:no-rare.sam6057-1.zstd
[Wed 22-06-2022 13:03:03] Line 21: Created & opened temporary file /temp/sam_stats_block_selected_mapped_gmgc:no-rare6057-2.stats
[Wed 22-06-2022 13:03:03] Line 21: Running garbage collection.
[Wed 22-06-2022 13:03:03] Line 21: Interpreting [interpretIO]: __check_ofile(BinaryOp("gmgc" -BOpPathAppend- BinaryOp(Lookup 'RESULTS' as NGLString -BOpAdd- "gmgc_norare.stats.txt")); original_lno=21)
[Wed 22-06-2022 13:03:03] Line 21: Interpreting [executing module function: '__check_ofile']: NGOString "gmgc/D16gmgc_norare.stats.txt"
[Wed 22-06-2022 13:03:03] Line 21: Running garbage collection.
[Wed 22-06-2022 13:03:03] Line 21: Interpreting [interpretIO]: write(Lookup 'temp$4' as NGLCounts; __can_move=True; __hash="7b8566ff57fbce4d04ada723875d32b8"; ofile=BinaryOp("gmgc" -BOpPathAppend- BinaryOp(Lookup 'RESULTS' as NGLString -BOpAdd- "gmgc_norare.stats.txt")))
[Wed 22-06-2022 13:03:03] Line 21: Interpreting [write]: NGOCounts File /temp/sam_stats_block_selected_mapped_gmgc:no-rare6057-2.stats
[Wed 22-06-2022 13:03:03] Line 21: Writing counts to: gmgc/D16gmgc_norare.stats.txt
[Wed 22-06-2022 13:03:03] Line 22: Running garbage collection.
[Wed 22-06-2022 13:03:03] Line 22: Interpreting [interpretIO]: __check_ofile(BinaryOp("gmgc" -BOpPathAppend- BinaryOp(Lookup 'RESULTS' as NGLString -BOpAdd- ".gmgc_norare.bam")); original_lno=22)
[Wed 22-06-2022 13:03:03] Line 22: Interpreting [executing module function: '__check_ofile']: NGOString "gmgc/D16.gmgc_norare.bam"
[Wed 22-06-2022 13:03:03] Line 22: Running garbage collection.
[Wed 22-06-2022 13:03:03] Line 22: Interpreting [interpretIO]: write(Lookup 'gmgc_mapped_post' as NGLMappedReadSet; __can_move=True; __hash="dcb070e78c288900aed230a2e77b322c"; ofile=BinaryOp("gmgc" -BOpPathAppend- BinaryOp(Lookup 'RESULTS' as NGLString -BOpAdd- ".gmgc_norare.bam")))
[Wed 22-06-2022 13:03:03] Line 22: Interpreting [write]: NGOMappedReadSet {nglgroupName = "preprocessed/D16filtered_HG_HS_qc.pair.1.fq.gz", nglSamFile = File /temp/block_selected_mapped_gmgc:no-rare.sam6057-1.zstd, nglReference = Just "gmgc:no-rare"}
[Wed 22-06-2022 13:03:03] Line 22: Created & opened temporary file /temp/converted_block_selected_mapped_gmgc:no-rare6057-3.bam
[Wed 22-06-2022 13:03:03] Line 22: SAM->BAM Conversion start ('/block_selected_mapped_gmgc:no-rare.sam6057-1.zstd' -> '/temp/converted_block_selected_mapped_gmgc:no-rare6057-3.bam')

And this is the script:

ngless "1.4"
import "gmgc" version "1.0"
import "parallel" version "1.0"
import "samtools" version "1.0"

samples = readlines('samplelist.txt')
current = lock1(samples)
input = paired ("preprocessed"</>current + "filtered_HG_HS_qc.pair.1.fq.gz","preprocessed"</>current + "filtered_HG_HS_qc.pair.2.fq.gz")

#different result outputdirs not necessary anymore, since all results will be "collected" in one output with the collect function
RESULTS = current


##GMGC
##"How counts are adjusted in the presence of multiple annotations is defined by the multiple argument. Generally, for obtaining gene abundances, distribution of multiple mappers is the best (using multiple={dist1}), while for functional annotations, you want to count them all (using multiple={all1}). This implies that the functional annotations will sum to a higher value than the number of reads. This may seem strange at first, but it is the intended behaviour."
gmgc_mapped = map (input, reference='gmgc:no-rare',mode_all=True,block_size_megabases=250000)
gmgc_mapped_post = select(gmgc_mapped) using |mr|:
    mr = mr.filter(min_match_size=45, min_identity_pc=95, action={drop})
    if not mr.flag({mapped}):
        discard
write(mapstats(gmgc_mapped_post),ofile="gmgc"</>RESULTS+'gmgc_norare.stats.txt')
write(gmgc_mapped_post,ofile="gmgc"</>RESULTS+'.gmgc_norare.bam')

Thanks in advance! Ulrike

uloeber, Jun 22 '22 17:06