cdhit
Problem with maximum allowed sequence length
Hello, I previously ran cd-hit-est on an HPC cluster to dereplicate some contigs with no problem:
cd-hit-est -i 10k_seqs.fasta -o cdhit_outdir/contigs_cd-hit -c 0.95 -n 8,9 -G 0 -aS 0.8 -g 1 -d 0 -M 100000 -T 16
In that run the maximum contig length was 400k.
I then ran cd-hit-est on a second set of contigs on the same HPC with the same parameters, but I first got a warning message from cd-hit-est, and then Slurm canceled the job:
Warning: Some seqs are too long, please rebuild the program with make parameter MAX_SEQ=new-maximum-length (e.g. make MAX_SEQ=10000000)
Not fatal, but may affect results !!
slurm/var/spool/job2125359/slurm_script: line 16: 28678 Segmentation fault cd-hit-est -i ${dir}/10k_seqs.fasta -o ${dir}/cdhit_outdir/contigs_cd-hit -c 0.95 -n 8,9 -G 0 -aS 0.8 -g 1 -d 0 -M 100000 -T 16
The warning message suggests rebuilding the program to allow longer sequences, but I do not have permission to rebuild it (and asking for it may take too long). Is there a way to change the maximum allowed sequence length without rebuilding the program, or via a conda installation?
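In case it helps with the diagnosis, below is a minimal sketch in plain Python (it does not call cd-hit) for finding the longest contig in a FASTA file, which would indicate how large MAX_SEQ needs to be for the second dataset. The function name longest_sequence is only illustrative, and the sketch assumes an ordinary FASTA layout with ">" header lines.

```python
def longest_sequence(fasta_lines):
    """Return (header, length) of the longest record in an iterable of FASTA lines.

    Returns (None, 0) if no non-empty record is found.
    """
    best_name, best_len = None, 0
    name, length = None, 0
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts: close out the previous one first.
            if name is not None and length > best_len:
                best_name, best_len = name, length
            name, length = line[1:], 0
        elif name is not None:
            # Sequence lines may be wrapped, so lengths are summed per record.
            length += len(line)
    # Close out the final record.
    if name is not None and length > best_len:
        best_name, best_len = name, length
    return best_name, best_len
```

It can be applied to an open file handle, for example by opening 10k_seqs.fasta and passing the handle to longest_sequence; the returned length is what the MAX_SEQ value named in the warning would have to exceed.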
best regards,
Valentín.