fails on 'processor = Wav2Vec2ProcessorWithLM.from_pretrained("patrickvonplaten/wav2vec2-base-100h-with-lm")'

Open TzurV opened this issue 3 years ago • 6 comments

The following step in the 'Boosting Wav2Vec2 with n-grams in 🤗 Transformers' Colab example fails:

from transformers import Wav2Vec2ProcessorWithLM
processor = Wav2Vec2ProcessorWithLM.from_pretrained("patrickvonplaten/wav2vec2-base-100h-with-lm")

with the following error:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-20-f5f27bdf189a> in <module>()
      1 from transformers import Wav2Vec2ProcessorWithLM
      2 
----> 3 processor = Wav2Vec2ProcessorWithLM.from_pretrained("patrickvonplaten/wav2vec2-base-100h-with-lm")

1 frames
/usr/local/lib/python3.7/dist-packages/transformers/file_utils.py in requires_backends(obj, backends)
    820     name = obj.__name__ if hasattr(obj, "__name__") else obj.__class__.__name__
    821     if not all(BACKENDS_MAPPING[backend][0]() for backend in backends):
--> 822         raise ImportError("".join([BACKENDS_MAPPING[backend][1].format(name) for backend in backends]))
    823 
    824 

ImportError: 
Wav2Vec2ProcessorWithLM requires the pyctcdecode library but it was not found in your environment. You can install it with pip:
`pip install pyctcdecode`


---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------

This happens even though pyctcdecode and kenlm were both installed successfully.

TzurV, Jan 24 '22 18:01

I can confirm this issue exists. It throws the missing-library error despite the library being installed. Any update here? Thanks!

malcolmmurdock, Apr 11 '22 17:04

cc @patrickvonplaten

osanseviero, Apr 11 '22 20:04

After playing around, I've found that in the notebook if you run code line 9 before you ever run code line 4, you don't get the error.

i.e. if you run !pip install https://github.com/kpu/kenlm/archive/master.zip pyctcdecode at the top of the notebook instead of where it's currently run in the tutorial, you can import Wav2Vec2ProcessorWithLM without issue.
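One way to catch this ordering problem early is a pre-flight check in the very first cell, before anything from Transformers is imported. The following is only a minimal sketch: the helper name `missing_packages` is hypothetical, and it assumes the import names are `pyctcdecode` and `kenlm`.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that cannot currently be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Run this before the first `import transformers` in the notebook.
missing = missing_packages(["pyctcdecode", "kenlm"])
if missing:
    print("Install these before importing transformers:", ", ".join(missing))
else:
    print("All decoder dependencies found; safe to import transformers.")
```

If the check reports anything missing, run the !pip install cell first and only then import Wav2Vec2ProcessorWithLM.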

My guess is that, in the current state of the notebook, import_utils gets loaded the first time you hit Transformers (on notebook line 4), and the library-availability booleans don't get refreshed after pyctcdecode is installed further along in the notebook.

So I'm not sure whether this should be categorized as a true bug, or whether the !pip install should simply be moved to the top of the "Boosting Wav2Vec2 with n-grams in 🤗 Transformers" colab...

Edit: caveat that I'm not positive the issue is import_utils getting loaded early then not refreshed, but it seems logical since if you manually run the library verification code in import_utils after installing pyctcdecode you don't get the error...
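If that guess is right, the behaviour can be reproduced without Transformers at all. The sketch below is an illustration of the suspected mechanism, not the library's actual code: `fake_ctc_decoder_demo` is a made-up package name standing in for pyctcdecode, and `simulate_install` is a hypothetical stand-in for `!pip install` that just drops a module onto `sys.path`.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

PACKAGE = "fake_ctc_decoder_demo"  # hypothetical stand-in for pyctcdecode


def is_available(name):
    """Return True if `name` can be imported right now."""
    return importlib.util.find_spec(name) is not None


# Evaluated once, like a module-level flag computed at import time.
_CACHED_AVAILABLE = is_available(PACKAGE)


def simulate_install(name):
    """Stand-in for `pip install`: write a module file and put it on sys.path."""
    target = Path(tempfile.mkdtemp())
    (target / f"{name}.py").write_text("VERSION = '0.0'\n")
    sys.path.insert(0, str(target))
    importlib.invalidate_caches()  # let the import system notice the new file


simulate_install(PACKAGE)

stale = _CACHED_AVAILABLE      # still the value from before the "install"
fresh = is_available(PACKAGE)  # re-checked after the "install"
print(f"cached flag: {stale}, fresh check: {fresh}")
```

The cached flag stays False while the fresh check returns True, which would match both observations above: the error persists after installing, and re-running the verification code by hand makes it go away. In a notebook, restarting the runtime after installing, or installing before the first Transformers import, avoids relying on the stale value.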

malcolmmurdock, Apr 11 '22 20:04

Looking into it

patrickvonplaten, Apr 13 '22 10:04

Thanks a lot for the hint @captainnurple!

I corrected the Google Colab; it should work now 🙂

patrickvonplaten, Apr 18 '22 14:04

Thanks @patrickvonplaten! I was still getting errors so I just submitted a PR to the notebook that seems to fix them. Have a look if it's helpful!

malcolmmurdock, Apr 26 '22 23:04