[BUG]: permanentFail during job blast

Open egonozer opened this issue 1 year ago • 4 comments

Describe the bug

I am running fcs-adaptor.sh v0.4.0 using Singularity on several genome assemblies generated with SPAdes v3.15.5. Nearly all of them run without error, but for some I consistently get a permanentFail error right after the [job blast] $ vecscreen \ ... command is run. I was getting the same error with fcs-adaptor v0.2.3, and upgrading to v0.4.0 didn't help. When I modified the fcs-adaptor.sh script to include the --debug flag, the last directory in the debug folder contained a vecscreen.log file with the following error message:

91501/000/0000/P  CF98656D495AB571 0003/0003 2023-06-23T09:25:28.132666 quser31         UNK_CLIENT      UNK_SESSION              vecscreen Error: VECSCREEN "vecscreen_app.cpp", line 149: CVecScreenApp::Run() --- Blast Error: Near line 0, the local id is too long.  Its length is 51 but the maximum allowed local id length is 50.  Please find and correct all local ids that are too long.

Looking at the file in the debug folder that was used as input for vecscreen, split_fasta.fna, here are the first few sequence headers:

>lcl|NODE_1_length_1090171_cov_39.792984__1_500000
>lcl|NODE_1_length_1090171_cov_39.792984__499501_999500
>lcl|NODE_1_length_1090171_cov_39.792984__999001_1090171
>lcl|NODE_2_length_887790_cov_41.364514__1_500000
>lcl|NODE_2_length_887790_cov_41.364514__499501_887790
>lcl|NODE_3_length_557874_cov_41.529717__1_500000
>lcl|NODE_3_length_557874_cov_41.529717__499501_557874
>NODE_4_length_463023_cov_41.874365
>NODE_5_length_417001_cov_40.711275
>NODE_6_length_369796_cov_39.096108

So it looks like the fasta_split application is generating some sequence IDs that are too long for vecscreen. Is this something that can be modified in fcs-adaptor? I would really rather not have to rewrite all my input files with shorter sequence identifiers before running them through fcs-adaptor.
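For reference, the length check can be reproduced with a short script. This is only a minimal sketch, not part of fcs-adaptor: the function name `too_long_ids` is hypothetical, the 50-character limit is taken from the vecscreen error message above, and it assumes the `lcl|` prefix is not counted toward that limit (which is consistent with the reported length of 51).

```python
MAX_LOCAL_ID = 50  # limit quoted in the vecscreen error message


def too_long_ids(fasta_lines, limit=MAX_LOCAL_ID):
    """Return (id, length) for every FASTA header whose ID exceeds `limit`."""
    hits = []
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        fields = line[1:].split()
        if not fields:
            continue  # empty header
        seq_id = fields[0]
        # Assumption: the "lcl|" prefix does not count toward the limit.
        if seq_id.startswith("lcl|"):
            seq_id = seq_id[len("lcl|"):]
        if len(seq_id) > limit:
            hits.append((seq_id, len(seq_id)))
    return hits


# Headers copied from split_fasta.fna as listed above.
observed_headers = [
    ">lcl|NODE_1_length_1090171_cov_39.792984__1_500000",
    ">lcl|NODE_1_length_1090171_cov_39.792984__499501_999500",
    ">lcl|NODE_1_length_1090171_cov_39.792984__999001_1090171",
    ">lcl|NODE_2_length_887790_cov_41.364514__1_500000",
    ">lcl|NODE_2_length_887790_cov_41.364514__499501_887790",
    ">lcl|NODE_3_length_557874_cov_41.529717__1_500000",
    ">lcl|NODE_3_length_557874_cov_41.529717__499501_557874",
    ">NODE_4_length_463023_cov_41.874365",
    ">NODE_5_length_417001_cov_40.711275",
    ">NODE_6_length_369796_cov_39.096108",
]

for seq_id, length in too_long_ids(observed_headers):
    print(length, seq_id)
```

To scan a whole file instead of a hand-copied list, the same function can be given an open file handle, for example `too_long_ids(open("split_fasta.fna"))`.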

Thanks!

To Reproduce

./run_fcsadaptor.sh --fasta-input ../spades/1148/contigs.fasta --output-dir test --prok --container-engine singularity --image fcsadaptor_0.4.0/fcs-adaptor.sif

Software versions (please complete the following information):

  • RHEL 7.9 (Maipo)
  • Singularity version 3.8.1
  • Docker or Singularity FCS image version:
$ singularity inspect fcsadaptor_0.4.0/fcs-adaptor.sif
org.label-schema.build-date: Friday_18_November_2022_19:10:40_UTC
org.label-schema.schema-version: 1.0
org.label-schema.usage.singularity.deffile.bootstrap: docker
org.label-schema.usage.singularity.deffile.from: us-east4-docker.pkg.dev/ncbi-seqplus-rodr-build-res/ncbi-cgr/fcs/av_screen_x:develop-latest
org.label-schema.usage.singularity.version: 3.4.0-1

Log Files

Attached: debug.y6aphyfv.tar.gz

egonozer, Jun 23 '23 14:06