Help train

Open simonbrd opened this issue 2 years ago • 7 comments

Hello, I have a problem using bonito to build my own methylation detection model on microalgae data. I use the following workflow:

$ git clone https://github.com/nanoporetech/bonito.git  # or fork first and clone that
$ cd bonito
$ python3 -m venv venv3
$ source venv3/bin/activate
(venv3) $ pip install --upgrade pip
(venv3) $ pip install -r requirements.txt
(venv3) $ python setup.py develop

bonito basecaller [email protected] --reference ../PrymneGenomeV1.fasta ../fast5/ --device cuda:0 > basecalls.bam

image

Thank you in advance. For info, here is my configuration: image

simonbrd avatar Mar 17 '22 13:03 simonbrd

@simonbrd you have set --device cpu, not the GPU, in your screenshot.

iiSeymour avatar Mar 17 '22 14:03 iiSeymour

Yes, that's true, sorry... But I still have an error with the GPUs:

bonito cuda

simonbrd avatar Mar 17 '22 14:03 simonbrd

The default batch size for the fast model is 1536, which uses about 10 GB of GPU memory, so try reducing the batch size to fit your GPU memory capacity: bonito basecaller [email protected] --batchsize 512 ....
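
A rough way to pick a starting value, as a sketch only: assuming GPU memory use grows approximately linearly with batch size (an assumption, not something bonito guarantees), the reference point above (1536 for about 10 GB) can be scaled to the memory available. The function name `estimate_batchsize` and the rounding to a multiple of 32 are illustrative choices, not part of bonito.

```shell
#!/bin/sh
# Hypothetical helper: scale the reference point from this thread
# (batch size 1536 for roughly 10 GB, taken here as 10240 MiB)
# to a given amount of GPU memory.
# ASSUMPTION: memory use grows roughly linearly with batch size.
estimate_batchsize() {
    mem_mib=$1                              # usable GPU memory in MiB
    raw=$(( 1536 * mem_mib / 10240 ))       # linear scaling, integer division
    rounded=$(( raw / 32 * 32 ))            # round down to a multiple of 32 (arbitrary)
    if [ "$rounded" -lt 32 ]; then
        rounded=32                          # never suggest less than 32
    fi
    echo "$rounded"
}

# Total GPU memory in MiB can be read with, for example:
#   nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits
estimate_batchsize 4096    # prints 608
```

Treat the result as a first guess for `--batchsize` and lower it further if the out-of-memory error persists.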

iiSeymour avatar Mar 17 '22 14:03 iiSeymour

Thank you very much, it works!

simonbrd avatar Mar 17 '22 15:03 simonbrd

Hello, I have a new problem, can you help me? train_bonito

For info, I ran this before: bonito basecaller [email protected] --batchsize 200 --reference ../PrymneGenomeV1.fasta ../fast5/ --device cuda:0 > data/train/basecalls_sans_ctc.bam

simonbrd avatar Mar 21 '22 09:03 simonbrd

@simonbrd to save data in a training format you need to add --save-ctc when basecalling. The error message you are getting is because no training data is present in data/train.
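
After rerunning the basecaller with --save-ctc, a quick check of the training directory can confirm whether the data was written. The sketch below assumes the training arrays are named `chunks.npy`, `references.npy` and `reference_lengths.npy`; these names are an assumption about bonito's training format and should be confirmed against your installed version. The helper name `check_ctc_dir` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report which training arrays are missing from a directory.
# ASSUMPTION: the file names below match bonito's training data format;
# verify them for the version you have installed.
check_ctc_dir() {
    dir=$1
    missing=0
    for f in chunks.npy references.npy reference_lengths.npy; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"    # 0 if all files are present, 1 otherwise
}

# Example: check the directory that will be passed to `bonito train --directory`.
if check_ctc_dir data/train; then
    echo "training data found"
else
    echo "no complete training data; rerun the basecaller with --save-ctc"
fi
```

If the files are missing, check the directory that the basecaller output was redirected into, since that is where the training arrays would be expected to appear.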

iiSeymour avatar Mar 22 '22 15:03 iiSeymour

OK, thank you. But I don't understand: in the basecaller results I only have 2 files, and no training data for bonito train? bonito res restrain

simonbrd avatar Mar 23 '22 15:03 simonbrd