issue with parsnp/1.5.4
Hi, I'm currently trying to use parsnp v1.5.4 on a Slurm cluster (with raxml v8.2.12, PhiPack/1.0, harvest-tools/1.3, FastTree/2.1.11), and I keep getting an error when running parsnp on my 13 bacterial genomes. Any idea how to fix this? Here is what I have in my terminal window:
srun -c 4 parsnp -d test_parsnp/ -r! -o test_parsnp/output_parsnp -v -c -p 32 -P 128000
|--Parsnp 1.5.4--|
For detailed documentation please see --> http://harvest.readthedocs.org/en/latest
13:56:05 - INFO -
SETTINGS:
|-refgenome: autopick
|-genomes:
test_parsnp/EERA844_lgtfilt.fasta
test_parsnp/Ec046_lgtfilt.fasta
...12 more file(s)...
test_parsnp/EERA890_lgtfilt.fasta
test_parsnp/Ec125_lgtfilt.fasta
|-aligner: muscle
|-outdir: test_parsnp/output_parsnp
|-OS: Linux
|-threads: 32
13:56:05 - INFO - <<Parsnp started>>
13:56:05 - INFO - No genbank file provided for reference annotations, skipping..
13:56:05 - DEBUG - Sorting reference replicons
13:56:05 - DEBUG - Writing .ini file
13:56:05 - INFO - Running Parsnp multi-MUM search and libMUSCLE aligner...
13:56:05 - DEBUG - /opt/gensoft/exe/parsnp/1.5.4/bin/parsnp_core test_parsnp/output_parsnp/parsnpAligner.ini
14:03:22 - CRITICAL - The following command failed:
>>$ /opt/gensoft/exe/parsnp/1.5.4/bin/parsnp_core test_parsnp/output_parsnp/parsnpAligner.ini
Please veryify input data and restart Parsnp.
If the problem persists please contact the Parsnp development team.
STDOUT:
0
Ec046_lgtfilt.fasta.ref,Len:5089403,GC:50.7909 ... Finished processing input sequences, elapsed time: 3 seconds
compressed suffix graph construction elapsed time: 0 seconds
MUM anchor search elapsed time: 10 seconds
compressed suffix graph construction elapsed time: 0 seconds
... Finished recursive MUM search, elapsed time: 1 seconds
Finished filtering spurious matches, elapsed time: 0 seconds
LCBs created, elapsed time: 0 seconds
STDERR:
parsnpAligner:: rapid whole genome SNP typing
ParSNP: Preparing to construct global multiple alignment framework
Preparing to verify and process input sequences...
Searching for initial MUM anchors...
Constructing compressed suffix graph...
Performing initial search for exact matches in the sequences...
...
Performing recursive MUM search between MUM anchors...
Filtering spurious matches...
Creating and verifying final LCBs...
Writing output files & aligning LCBs...
*** ERROR *** TreeFromSeqVect_UPGMA, CLUSTER_6 not supported
*** ERROR *** TreeFromSeqVect_UPGMA, CLUSTER_6 not supported
*** ERROR *** TreeFromSeqVect_UPGMA, CLUSTER_6 not supported
srun: error: task 0: Exited with exit code 2
Hello @lanorvege, I have the same issue. Have you solved it?
Valery
Update: the issue was solved by setting -p to a value much lower than the number of CPUs on the node. The node has 94 CPUs; I set 30 CPUs for parsnp rather than all of them.
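For anyone wanting to script that workaround, here is a minimal Python sketch that picks a value for -p that never exceeds the Slurm allocation or a chosen cap. It is only an illustration: `pick_threads`, `build_command` and the default cap of 30 are hypothetical (30 is just the value reported to work above, not a documented parsnp limit), and the parsnp flags are the ones already used in this thread. Note that the original command requested `srun -c 4` while passing `-p 32`; whether that mismatch contributed to the error is not established here.

```python
import os


def pick_threads(cap=30, env=None):
    """Return a thread count for parsnp's -p that stays within the allocation.

    `cap` is a hypothetical upper bound (30 is the value reported to work in
    this thread, not a documented limit).
    """
    env = os.environ if env is None else env
    # Slurm exports SLURM_CPUS_PER_TASK when -c/--cpus-per-task is given.
    allocated = env.get("SLURM_CPUS_PER_TASK")
    if allocated and allocated.isdigit() and int(allocated) > 0:
        available = int(allocated)
    else:
        # Outside Slurm, fall back to the number of CPUs the OS reports.
        available = os.cpu_count() or 1
    return max(1, min(cap, available))


def build_command(ref, genomes_dir, out_dir, threads):
    """Assemble the parsnp call using only flags that appear in this thread."""
    return ["parsnp", "-r", ref, "-d", genomes_dir, "-o", out_dir,
            "-c", "-p", str(threads)]


if __name__ == "__main__":
    cmd = build_command("ref.fasta", "genomes_dir", "output_dir", pick_threads())
    print(" ".join(cmd))
```

The printed command can be passed to `srun` as usual; with `srun -c 4` the sketch would choose `-p 4`, and on a 94-CPU allocation it would stop at the cap.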
@valery-shap to clarify, you were observing the
*** ERROR *** TreeFromSeqVect_UPGMA, CLUSTER_6 not supported
issue as well, but resolved it by lowering the number of threads used?
No, I don't have this issue now; the run finished successfully. It seems that all of my problems (except one) were caused by this. The errors were:
- With a reference of 5 contigs (1 chromosome and 4 plasmids) I got:
Traceback (most recent call last):
  File "../bin/parsnp", line 1328, in <module>
    if block_spos < chr_spos:
TypeError: '<' not supported between instances of 'int' and 'list'
I changed the reference to a single contig containing only the chromosome and got a different error:
mkdir: cannot create directory ‘../blocks/’: File exists
10 seqs, max length 59, avg length 59
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
and also like this:
mkdir: cannot create directory ‘/blocks/’: File exists
10 seqs, max length 127, avg length 127
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
*** buffer overflow detected ***: /miniconda3/envs/parsnp/bin/bin/parsnp_core terminated
and there were variants with the CLUSTER_6 error too:
mkdir: cannot create directory ‘/blocks/’: File exists
10 seqs, max length 59, avg length 59
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
Alignment not completed, cannot save.
*** ERROR *** TreeFromSeqVect_UPGMA, CLUSTER_6 not supported
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
*** buffer overflow detected ***: /home/miniconda3/envs/parsnp/bin/bin/parsnp_core terminated
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
*** WARNING *** Assuming DNA (see -seqtype option), invalid letters found:
I found the files containing K, M and Y symbols and removed them, but I still got this error.
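A quick way to locate such files is to count every sequence character that is not A, C, G or T. The Python sketch below is only an illustration of that check: `non_acgt_counts`, `scan_directory` and the `*.fasta` pattern are hypothetical names and defaults, and the script reports symbols without deciding whether parsnp can handle them.

```python
from collections import Counter
from pathlib import Path


def non_acgt_counts(fasta_path, allowed="ACGT"):
    """Count sequence characters outside `allowed` in one FASTA file."""
    allowed_set = set(allowed.upper())
    counts = Counter()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                continue  # header lines are not sequence
            for ch in line.strip().upper():
                if ch not in allowed_set:
                    counts[ch] += 1
    return counts


def scan_directory(genomes_dir, pattern="*.fasta"):
    """Map each file name to its non-ACGT character counts (clean files skipped)."""
    report = {}
    for path in sorted(Path(genomes_dir).glob(pattern)):
        counts = non_acgt_counts(path)
        if counts:
            report[path.name] = dict(counts)
    return report


if __name__ == "__main__":
    # "genomes_dir" is a placeholder for the directory passed to parsnp -d.
    for name, counts in scan_directory("genomes_dir").items():
        print(name, counts)
```

Passing `allowed="ACGTN"` would ignore N and list only the other ambiguity codes.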
Then I made a test directory with 5 clean genomes and ran parsnp without Slurm on another server, directly in the terminal, with 30 CPUs,
and it worked!
Then I added the genome containing N symbols and it worked too.
Then I set 70 CPUs (all the CPUs of this server) and got the error:
*** ERROR *** TreeFromSeqVect_UPGMA, CLUSTER_6 not supported
and I saw the same error on the Slurm server when I set -p to all the CPUs of the node.
Now I have the final output folder with all output files, using the changed reference with one chromosome and including all genomes (with K, Y, W symbols). I can see them in the parsnp log file (the Len and GC lines) but cannot find them on the tree. It all looks very strange. I used parsnp nearly a year ago with a hundred plasmids (only sequences of one specific plasmid) on a laptop and it worked excellently!
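The single-chromosome reference mentioned above can be produced from a multi-contig FASTA with a few lines of Python. This is a rough sketch, not part of parsnp: `read_fasta` and `write_longest_record` are hypothetical helpers, and the code assumes the chromosome is the longest record in the file, which should be checked against the headers before use.

```python
def read_fasta(path):
    """Parse a FASTA file into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            elif line.strip():
                chunks.append(line.strip())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def write_longest_record(src, dst, width=60):
    """Write only the longest record of `src` to `dst`; return (header, length).

    Assumption: the longest record is the chromosome.
    """
    records = read_fasta(src)
    if not records:
        raise ValueError(f"no FASTA records found in {src}")
    header, seq = max(records, key=lambda rec: len(rec[1]))
    with open(dst, "w") as out:
        out.write(f">{header}\n")
        # Wrap the sequence at `width` characters per line.
        for i in range(0, len(seq), width):
            out.write(seq[i:i + width] + "\n")
    return header, len(seq)
```

For example, `write_longest_record("ref.fasta", "ref_chromosome.fasta")` (both file names are placeholders) writes a one-record file that can then be passed to `parsnp -r`.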
In the beginning I also had this issue on the Slurm server with 750 GB of RAM:
*** MAX MEMORY 4 MB EXCEEDED***
Memory allocated so far 16004 MB, physical RAM 680 MB
Use -maxmb <n> option to increase limit, where <n> is in MB.
There is no such flag. The command throughout was:
parsnp -r ref.fasta -d genomes_dir -o output_dir -c -x -p <different values>
I tried setting -P too, but in the end it worked without it. The parsnp version is 1.5.6, installed from conda.
Valery