
getting pre trained models

Open colindaven opened this issue 4 years ago • 7 comments

If the model download fails, is there a wget or similar command to work around this?

If this directory only contains configs then please run bonito download --models.

Thanks

colindaven, Nov 16 '20 15:11

It would be good to know more about why bonito download --models is failing, but sure, the URLs are all here.

$ wget -q https://nanoporetech.box.com/shared/static/uetgwsnb8yfqvuyoka8p09mxilgskqc7.zip # [email protected]
$ unzip uetgwsnb8yfqvuyoka8p09mxilgskqc7.zip
Archive:  uetgwsnb8yfqvuyoka8p09mxilgskqc7.zip
   creating: [email protected]/
  inflating: [email protected]/config.toml  
  inflating: [email protected]/weights_1.tar  
$ ls -1 [email protected]/
config.toml
weights_1.tar
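For anyone scripting the manual workaround, the unzip-and-check step above can be sketched in Python. This is a minimal sketch, not part of bonito: extract_model is a hypothetical helper name, and the rule that a usable model directory holds config.toml plus at least one weights_*.tar file is an assumption drawn only from the listing above.

```python
import zipfile
from pathlib import Path


def extract_model(zip_path, dest_dir):
    """Unpack a model archive and return the directories that look like models.

    Hypothetical helper, not a bonito API.
    """
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Assumption: a usable model directory contains config.toml and at
    # least one weights_*.tar file, as in the unzip listing above.
    return sorted(
        d for d in dest.iterdir()
        if d.is_dir()
        and (d / "config.toml").is_file()
        and any(d.glob("weights_*.tar"))
    )
```

Calling extract_model("uetgwsnb8yfqvuyoka8p09mxilgskqc7.zip", "/tmp") after the wget step would return the extracted model directory, or an empty list if the archive did not contain the expected files.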

If you don't store the models under the bonito models directory, make sure to use the full path when basecalling, for example:

$ bonito basecaller /tmp/[email protected] reads > out.fasta

iiSeymour, Nov 16 '20 16:11

Same issue as above.

I ran bonito download --models, and that seems to run fine:

[downloading models]
[downloaded [email protected]]
[downloaded [email protected]]
[downloaded [email protected]]
[downloaded [email protected]]
[skipping dna_r9.4.1.zip]

But when I try to run something with the models, e.g. bonito evaluate dna_r9.4.1, I get:

  • loading data
Traceback (most recent call last):
  File "/home/miniconda3/bin/bonito", line 11, in <module>
    load_entry_point('ont-bonito==0.3.1', 'console_scripts', 'bonito')()
  File "/home/miniconda3/lib/python3.8/site-packages/bonito/__init__.py", line 39, in main
    args.func(args)
  File "/home/miniconda3/lib/python3.8/site-packages/bonito/cli/evaluate.py", line 25, in main
    *load_data(
  File "/home/miniconda3/lib/python3.8/site-packages/bonito/util.py", line 223, in load_data
    chunks = np.load(os.path.join(directory, "chunks.npy"))
  File "/home/miniconda3/lib/python3.8/site-packages/numpy/lib/npyio.py", line 428, in load
    fid = open(os_fspath(file), "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/home/miniconda3/lib/python3.8/site-packages/bonito/data/dna_r9.4.1/chunks.npy'

pwh124, Nov 17 '20 03:11

Hi @pwh124, I think the "skipping" output is the problem, but I'm not sure why it is occurring. You can try:

bonito download --models -f

My output (I will try the manual download next, thanks @iiSeymour):

..... /bonito/bonito/build/lib/bonito/models$ cat README.md
# Bonito Pre-trained Models

If this directory only contains `configs` then please run `bonito download --models`.
rcug@hpc-rc09:/mnt/ngsnfs/tools/bonito/bonito/build/lib/bonito/models$ bonito download --models
[downloading models]
[downloaded [email protected]]
[downloaded [email protected]]
[downloaded [email protected]]
[skipping dna_r9.4.1.zip]
...../bonito/bonito/build/lib/bonito/models$ ls
configs  README.md
..../bonito/bonito/build/lib/bonito/models$ bonito download --models -f
[downloading models]
[downloaded [email protected]]
[downloaded [email protected]]
[downloaded [email protected]]
[downloaded dna_r9.4.1.zip]
....../bonito/bonito/build/lib/bonito/models$ ll
total 16
drwxrwxr-x 3 rcug rcug 4096 Oct 28 17:07 ./
drwxrwxr-x 7 rcug rcug 4096 Oct 28 17:07 ../
drwxrwxr-x 2 rcug rcug 4096 Oct 28 17:07 configs/
-rw-rw-r-- 1 rcug rcug  115 Oct 28 17:00 README.md
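The check done by eye with ls above can also be scripted. The following is a rough sketch only: downloaded_models is a hypothetical helper, and it assumes that a downloaded model shows up as a sub-directory containing a weights_*.tar file, next to the bundled configs directory.

```python
from pathlib import Path


def downloaded_models(models_dir):
    """Names of sub-directories that hold weights, ignoring the bundled configs.

    Hypothetical helper, not a bonito API.
    """
    root = Path(models_dir)
    # Assumption: each downloaded model is a directory with weights_*.tar inside.
    return sorted(
        d.name for d in root.iterdir()
        if d.is_dir() and d.name != "configs" and any(d.glob("weights_*.tar"))
    )
```

An empty list for a given directory corresponds to the situation shown in the listing, where only configs/ and README.md are present and the downloads must have landed somewhere else.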

colindaven, Nov 17 '20 09:11

@colindaven are you working under a virtualenv? Can you check which installation your bonito command is associated with by running which bonito? If you ran python setup.py develop under /mnt/ngsnfs/tools/bonito/, I would expect to find your models in /mnt/ngsnfs/tools/bonito/bonito/models (not in the build dir). Also, note that your checkout appears to be at v0.3.0, not v0.3.1, so you might want to git pull and run the download again to get the latest model.

The download command is successful so you should be able to basecall just fine.
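To see which installation is active from inside Python, something like the sketch below may help. It is only an illustration: locate_install is a hypothetical helper, and it assumes the models live in a models sub-directory of the installed package, which matches the paths quoted in this thread but is not guaranteed for every install.

```python
import importlib.util
import shutil
from pathlib import Path


def locate_install(package="bonito"):
    """Return (executable path, expected models directory) for this environment.

    Hypothetical helper; either element is None when it cannot be found.
    """
    exe = shutil.which(package)  # same lookup that `which bonito` performs
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return exe, None
    # Assumption: models sit in <package dir>/models.
    return exe, Path(spec.origin).parent / "models"
```

If the two results point into different environments (for example a system-wide executable and a virtualenv package), the download and the basecaller are probably not using the same install.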

@pwh124 you will want to run bonito download --models -f to make sure dna_r9.4.1 is updated to the @v3.1 model; otherwise it will point at [email protected]. The issue you are seeing with bonito evaluate dna_r9.4.1 occurs because you haven't downloaded a copy of the training data (bonito download --training), which bonito evaluate uses. You should still be able to basecall your own reads just fine.
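A quick pre-flight check before running evaluate could look like the sketch below. missing_training_files is a hypothetical helper, and chunks.npy is the only file name confirmed by the traceback in this thread; any other required files would have to be added by the caller.

```python
import os


def missing_training_files(directory, required=("chunks.npy",)):
    """Return the required file names that are absent from `directory`.

    Hypothetical helper; only chunks.npy is known from the traceback above.
    """
    return [
        name for name in required
        if not os.path.isfile(os.path.join(directory, name))
    ]
```

A non-empty result for the data directory named in the FileNotFoundError suggests that the training data has not been downloaded yet.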

iiSeymour, Nov 17 '20 10:11

For me, the models seem to download to this path, where bonito_env is the directory of my virtual env:

bonito_env/lib/python3.6/site-packages/bonito/models

Hope that helps!

wharvey31, Apr 21 '21 23:04

@iiSeymour About the pre-trained models, are these trained only on the 66k reads that are also provided in this repo?

marcpaga, Jun 24 '21 13:06

@marcpaga no, the 66K R9.4.1 reads are mainly provided as a baseline for anybody interested in method development.

iiSeymour, Jun 24 '21 13:06