DiTing
DiTing bucket id error
Hi there, I have been attempting to run DiTing on some metatranscriptome and metagenome datasets, but I am getting an error I do not understand.
I have installed DiTing and the relevant dependencies within a conda environment, and I am running it with the following commands:
conda activate dting-env
/data/SBCS-EyiceLab/SP/DiTing/diting.py -r /data/SBCS-EyiceLab/SP/Baltic_Metatranscriptome/Raw -o Run01 -m 128 -n 8
I have attached the log files (see below), but I believe the line where it fails is "b'FATAL sequence/io/edge/edge_io_meta.h: 62 - Invalid format: bucket id not matched!'". I do not know how to interpret this. Can anyone advise how I can resolve this issue?
Thank you for any help you can give!!
Hi @SamPrudence, I'm not sure what happened here, but it looks like there is something wrong with the IDs of your fastq reads. Can you assemble the reads with standalone megahit? If you can, DiTing can use your assembled contigs for the rest of the analysis.
Hi @SilentGene, thanks for your help. I have tried running megahit standalone and it still doesn't work, so I assume there must be an issue with my data formatting?
Hi, just coming back with a quick update: I managed to get megahit running on my files (I had not been running megahit with the settings for interleaved files). I am now attempting to run DiTing on the assembled contigs and will report back if it works.
Hi, another quick update. I ran DiTing on the megahit-assembled contigs, but I am now getting a different error message: "AttributeError: 'NoneType' object has no attribute 'groups'" (see error file below). Do you know how I might get around this?
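For readers who hit the same problem, here is a minimal sketch of a heuristic check for whether a FASTQ file is interleaved, that is, whether consecutive records carry the same read name. It assumes plain 4-line FASTQ records and mate names that are either identical or differ only by a trailing /1 or /2; the function names are hypothetical and not part of DiTing or megahit.

```python
import gzip
import itertools


def read_names(path, limit=1000):
    """Yield up to `limit` read names from a FASTQ file (plain or gzipped)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        # Assumes 4-line records, so headers sit on lines 0, 4, 8, ...
        for header in itertools.islice(handle, 0, limit * 4, 4):
            if not header.strip():
                continue  # tolerate a trailing blank line
            # Drop the leading '@' and any description after whitespace.
            yield header.strip()[1:].split()[0]


def base_name(name):
    """Strip a trailing /1 or /2 mate suffix, if present."""
    if name.endswith("/1") or name.endswith("/2"):
        return name[:-2]
    return name


def looks_interleaved(path, limit=1000):
    """Heuristic: True if records 0/1, 2/3, ... share a read name.

    `limit` should be even so that the sampled records form whole pairs.
    """
    names = [base_name(n) for n in read_names(path, limit)]
    if len(names) < 2 or len(names) % 2:
        return False
    return all(a == b for a, b in zip(names[0::2], names[1::2]))
```

If the check returns True, standalone megahit expects the file via its `--12` option; separate mate files go to `-1` and `-2`, and single-end reads to `-r`.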
Thanks again for all your help
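As general background on this message: in Python it typically appears when `re.match` (or `re.search`) finds no match and returns `None`, and the code then calls `.groups()` on that result. The sketch below reproduces the pattern; the regular expression and function names are hypothetical and chosen only for illustration, and DiTing's own parsing code may look different.

```python
import re

# Hypothetical pattern for illustration only; not taken from DiTing's source.
READ_NAME = re.compile(r"^(.+)_([12])\.fq$")


def parse_unsafe(filename):
    # re.match returns None when the name does not fit the pattern,
    # so .groups() raises AttributeError on non-matching names.
    return READ_NAME.match(filename).groups()


def parse_safe(filename):
    match = READ_NAME.match(filename)
    if match is None:
        raise ValueError(
            f"{filename!r} does not look like <sample>_1.fq or <sample>_2.fq"
        )
    return match.groups()
```

Under this assumed pattern, a name such as `final-contigs.fa` fails in `parse_unsafe` with the same kind of AttributeError, while `parse_safe` reports which file name could not be interpreted.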
Hi @SamPrudence, could you tell me the file names of your reads?
Hi @SilentGene, I am running just one file here (I am still running megahit on the rest) but the file name is "final-contigs.fa"
Hi Sam, did you manage to figure out what happened there?
No matter how many samples you are running, you always need the corresponding read files (fastq files). If your assembly is named 'final-contigs.fa', make sure your reads are named accordingly, e.g. 'final-contigs_1.fq' and 'final-contigs_2.fq'.
Hi @SilentGene, sorry, just getting back to this. Thanks for clarifying that. I have now set up another run providing both files as you suggest, and it is running as we speak. I will let you know if I have any further issues!
Hi @SilentGene, I have been playing around with this for a while, providing both files, and I still can't get it to run. When I provide the file path plus the file name, I get a "not a file path" error. However, when I provide just the file path, it says "Cannot find the corresponding reads for final-contigs". It fails whether I provide a fastq or a .gz file for the raw reads. Any idea what I am doing wrong here?
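The naming rule described above can be checked before launching a run. This is a minimal sketch that assumes only the `_1.fq` and `_2.fq` suffixes from the example in this reply; other suffixes DiTing may accept are not covered, and the helper names are hypothetical.

```python
from pathlib import Path

# Assumed suffixes, taken from the example above ('_1.fq' and '_2.fq').
READ_SUFFIXES = ("_1.fq", "_2.fq")


def expected_reads(assembly, reads_dir):
    """Return the read file paths that would match the assembly's base name."""
    # Path.stem drops only the last extension: 'final-contigs.fa' -> 'final-contigs'.
    sample = Path(assembly).stem
    return [Path(reads_dir) / f"{sample}{suffix}" for suffix in READ_SUFFIXES]


def missing_reads(assembly, reads_dir):
    """Return the expected read files that do not exist in reads_dir."""
    return [p for p in expected_reads(assembly, reads_dir) if not p.is_file()]
```

For an assembly named `final-contigs.fa`, an empty result from `missing_reads` means both `final-contigs_1.fq` and `final-contigs_2.fq` were found in the reads directory.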
Hi @SamPrudence, please provide the command line that you ran, along with the files you have and their directories, so I can see what happened there. Thanks!
Hi @SilentGene, sure, no problem. Thank you!
When I get the "Cannot find the corresponding reads for final-contigs" error, I am using the commands below. The versions which yield the "not a directory" errors also include the specific file names (H2D5S.fastq.gz and final-contigs.fa).
conda activate dting-env
diting.py -r Baltic_Metatranscriptome/Raw/H2D5S -a megahit-H2D5S/final-contigs -o diting-H2D5S