Genomic-ULMFiT

Getting a NameError: name 'BaseTokenizer' is not defined

Open schlogl2017 opened this issue 3 years ago • 7 comments

I can't run your script because the utils module gives this error in a Jupyter notebook. Any tips to make it work?

Thank you

NameError                                 Traceback (most recent call last)
<ipython-input-7-70b698022c71> in <module>
     19     return (df_t, df_v)
     20 
---> 21 class GenomicTokenizer(BaseTokenizer):
     22     def __init__(self, lang='en', ngram=5, stride=2):
     23         self.lang = lang

NameError: name 'BaseTokenizer' is not defined

schlogl2017 avatar Mar 19 '21 20:03 schlogl2017

I changed the import to: from fastai.text.all import *. Now the error has changed to:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-9-70b698022c71> in <module>
     39         pass
     40 
---> 41 class GenomicVocab(Vocab):
     42     def __init__(self, itos):
     43         self.itos = itos

NameError: name 'Vocab' is not defined

schlogl2017 avatar Mar 19 '21 20:03 schlogl2017

I am facing these exact same issues. Could you find any solution? Did it work for you?

sourajyoti-datta avatar Mar 22 '21 10:03 sourajyoti-datta

No, sir! If I find something I will update you. Be safe!

schlogl2017 avatar Mar 22 '21 13:03 schlogl2017

I am facing these exact same issues. Could you find any solution? Did it work for you?

I found this; maybe it helps:

I'm running it without issues in Colab. Just start your notebook with:

!pip3 install fastai==1.0.61
!pip install biopython

Then clone the repo and you're ready to go!

Also:

import sys
sys.path.append('path to cloned repo')

schlogl2017 avatar Mar 22 '21 13:03 schlogl2017
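
The steps above can be sketched as a small setup cell. This is a minimal sketch, not code from the repository: the helper names is_fastai_v1, installed_fastai_version, and add_repo_to_path are hypothetical, and it assumes the two NameErrors come from a newer fastai install, since the only fix reported in this thread is pinning fastai==1.0.61.

```python
import sys
from importlib import metadata


def is_fastai_v1(version: str) -> bool:
    """Return True if a version string such as '1.0.61' is in the 1.x series."""
    return version.split(".")[0] == "1"


def installed_fastai_version():
    """Return the installed fastai version string, or None if fastai is missing."""
    try:
        return metadata.version("fastai")
    except metadata.PackageNotFoundError:
        return None


def add_repo_to_path(repo_dir: str) -> None:
    """Append the cloned repo directory to sys.path once, so its utils can be imported."""
    if repo_dir not in sys.path:
        sys.path.append(repo_dir)


# Warn early instead of failing later with a NameError inside the notebook.
version = installed_fastai_version()
if version is None or not is_fastai_v1(version):
    print(f"fastai version found: {version}; this thread reports success with fastai==1.0.61")
```

After this cell, calling add_repo_to_path('path to cloned repo') with the real location of your clone should make the repository's utils importable.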

Thanks. This issue seems resolved. I hope the creator also adds a requirements.txt file to the repository; that would make it complete.

sourajyoti-datta avatar Mar 22 '21 13:03 sourajyoti-datta

While training the GLM language model, I am getting a memory error. My system has 32 GB of RAM. Are you facing any such issues?

I can run this notebook first: Human Genome LM 0 Data Processing https://github.com/kheyer/Genomic-ULMFiT/blob/master/Mammals/Human/Genomic%20Language%20Models/Human%20Genome%20LM%200%20Data%20Processing.ipynb

But I get a memory error in this one: Human Genome LM 5 3-mer Stride 1 Language Model https://github.com/kheyer/Genomic-ULMFiT/blob/master/Mammals/Human/Genomic%20Language%20Models/Human%20Genome%20LM%205%203-mer%20Stride%201%20Language%20Model.ipynb

sourajyoti-datta avatar Mar 22 '21 23:03 sourajyoti-datta
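
For a rough sense of how the ngram and stride settings affect the amount of tokenized data, here is a sketch of k-mer tokenization. It is not the repository's GenomicTokenizer: the function kmer_tokens is hypothetical and only borrows the ngram and stride parameter names from the traceback above. Whether token count explains the memory error in the 3-mer stride 1 notebook is an unverified guess.

```python
def kmer_tokens(sequence: str, ngram: int = 5, stride: int = 2) -> list:
    """Split a sequence into k-mers of length `ngram`, stepping `stride` bases each time.

    Trailing bases that cannot fill a complete k-mer are dropped.
    """
    last_start = len(sequence) - ngram
    return [sequence[i:i + ngram] for i in range(0, last_start + 1, stride)]


seq = "ATGCGTACGTTAGC"  # toy 14-base sequence, not real genome data

print(kmer_tokens(seq, ngram=5, stride=2))
print(len(kmer_tokens(seq, ngram=3, stride=1)), "tokens with 3-mer stride 1")
print(len(kmer_tokens(seq, ngram=5, stride=2)), "tokens with 5-mer stride 2")
```

With stride 1 the number of tokens is close to the number of bases, so the same genome yields roughly twice as many tokens as with stride 2.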

I'm having the same problem. Have you managed to run it successfully? If so, can you give me the versions of the various packages?

tzhu-bio avatar Feb 09 '23 12:02 tzhu-bio
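
One way to report package versions when answering a question like this is a short cell using the standard library. This is only a sketch: the helper package_versions is hypothetical, and the package list is an assumption (fastai and biopython are named in this thread; torch is included because fastai depends on it).

```python
from importlib import metadata


def package_versions(names):
    """Map each distribution name to its installed version, or None if it is absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per package so the output can be pasted into the issue.
for name, version in package_versions(["fastai", "torch", "biopython"]).items():
    print(f"{name}: {version or 'not installed'}")
```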