Help train
Hello, I have a problem using bonito to build my own methylation detection model on microalgae data. I use the following workflow:
$ git clone https://github.com/nanoporetech/bonito.git # or fork first and clone that
$ cd bonito
$ python3 -m venv venv3
$ source venv3/bin/activate
(venv3) $ pip install --upgrade pip
(venv3) $ pip install -r requirements.txt
(venv3) $ python setup.py develop
bonito basecaller [email protected] --reference ../PrymneGenomeV1.fasta ../fast5/ --device cuda:0 > basecalls.bam
Thank you in advance. For info, here is my configuration:
@simonbrd you have set --device cpu, not the GPU, in your screenshot.
Yes, that's true, sorry... But I still get an error with the GPUs.
The default batch size for the fast model is 1536, which uses about 10GB of GPU memory, so try reducing the batch size to fit your GPU memory capacity:
bonito basecaller [email protected] --batchsize 512 ...
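As a rough way to pick a starting value, the sketch below scales the figure quoted above (batch size 1536 for roughly 10GB) linearly to the memory available on the GPU. The helper name `estimate_batchsize` is hypothetical, and linear scaling is only an assumption; actual memory use depends on the model and chunk size, so treat the result as an upper starting point and lower it if the out-of-memory error persists.

```shell
# Hypothetical helper: scale the quoted default (1536 for ~10GB, taken here
# as 10240 MiB) linearly to the GPU memory passed in MiB.
# Linear scaling is an assumption, not something bonito guarantees.
estimate_batchsize() {
    mem_mib="$1"
    echo $(( 1536 * mem_mib / 10240 ))
}

# Example use (commented out because it needs an NVIDIA GPU):
# free_mib=$(nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits | head -n 1)
# estimate_batchsize "$free_mib"
```

The estimate can then be rounded down and passed to `--batchsize`.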
Thank you very much, it works!
Hello, I have a new problem, can you help me?
For info, I ran this beforehand:
bonito basecaller [email protected] --batchsize 200 --reference ../PrymneGenomeV1.fasta ../fast5/ --device cuda:0 > data/train/basecalls_sans_ctc.bam
@simonbrd to save data in a training format you need to add --save-ctc when basecalling. The error message you are getting is because no training data is present in data/train.
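A minimal sketch of that step, reusing the command from this thread with `--save-ctc` added. `MODEL` is a placeholder for the model name used above, and the helper `missing_ctc_files` is hypothetical. It assumes the training arrays are named `chunks.npy`, `references.npy` and `reference_lengths.npy` and are written next to the redirected output file; check your bonito version if the names differ.

```shell
# Basecall with --save-ctc so training data is written alongside the output
# (commented out because it needs bonito, a GPU and the input data):
# bonito basecaller "$MODEL" --save-ctc --batchsize 200 \
#     --reference ../PrymneGenomeV1.fasta ../fast5/ --device cuda:0 \
#     > data/train/basecalls.bam

# Hypothetical helper: print the expected training files that are missing
# from a directory. The three file names are an assumption (see above).
missing_ctc_files() {
    dir="$1"
    for f in chunks.npy references.npy reference_lengths.npy; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}

# Example use: prints nothing when the directory is ready for bonito train.
# missing_ctc_files data/train
```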
OK, thank you. But I don't understand: in the basecaller output I only have 2 files and no training data for bonito train?