How many threads and memories required at training stage?
Thank you for your development. I am using Nanosim to simulate ONT data, I use 32 threads and 256GB memory to run training stage, but it reported out of memory error. The command is
read_analysis.py genome \
-i ZJYY_ont_filter.fq.gz \
-rg nd.asm.fasta \
-o ${home_dir}/01-data/ONT/${species}_training \
--fastq \
-t 32
The ZJYY_ont_filter.fq.gz dataset stat is
file format type num_seqs sum_len min_len avg_len max_len
ZJYY_ont_filter.fq.gz FASTQ DNA 1,544,988 43,308,647,713 2,000 28,031.7 246,468
And when I run the command without --fastq parameter, the training step could be finished.
Hi @yaoxkkkkk,
The amount of memory required will really depend on the dataset that you are training on.
On my end, training using --fastq with the HG002 ONT dataset used for the latest pre-trained models required around 263 GB of RAM - so that could be why you are seeing those errors.
If you want to use --fastq, some other options could be to use our pre-trained model, or try training using a subset of your reads.
Thank you for your interest in NanoSim! Lauren