
Can't load Whisper Model

Open renatobrusarosco opened this issue 1 year ago • 4 comments

Hi there,

I don't have much programming knowledge and I'm struggling with your tool. I tried to use it in a Jupyter notebook (Anaconda). I installed all the required libraries, but when I attempted to run the final code, I encountered the same error message multiple times:

```
Python >= 3.10
Using cache found in C:\Users\renat/.cache\torch\hub\snakers4_silero-vad_master
Using Demucs
Using standard Whisper
LOADING: medium GPU:0 BS: 2
100%|█████████████████████████████████████| 1.42G/1.42G [00:25<00:00, 59.9MiB/s]
Can't load Whisper model: STD/medium
```

Below you can find more information:

```
RuntimeError                              Traceback (most recent call last)
File ~\Downloads\Jupyter\WhisperHallu\transcribeHallu.py:110, in loadModel(gpu, modelSize)
    109     print("LOADING: "+modelSize+" GPU:"+gpu+" BS: "+str(beam_size))
--> 110     model = whisper.load_model(modelSize,device=torch.device("cuda:"+gpu)) #May be "cpu"
    111 elif whisperFound == "SM4T":

File ~\AppData\Roaming\Python\Python311\site-packages\whisper\__init__.py:146, in load_model(name, device, download_root, in_memory)
    143 with (
    144     io.BytesIO(checkpoint_file) if in_memory else open(checkpoint_file, "rb")
    145 ) as fp:
--> 146     checkpoint = torch.load(fp, map_location=device)
    147 del checkpoint_file

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:1014, in load(f, map_location, pickle_module, weights_only, mmap, **pickle_load_args)
   1013     raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None
-> 1014 return _load(opened_zipfile,
   1015              map_location,
   1016              pickle_module,
   1017              overall_storage=overall_storage,
   1018              **pickle_load_args)
   1019 if mmap:

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:1422, in _load(zip_file, map_location, pickle_module, pickle_file, overall_storage, **pickle_load_args)
   1421 unpickler.persistent_load = persistent_load
-> 1422 result = unpickler.load()
   1424 torch._utils._validate_loaded_sparse_tensors()

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:1392, in _load.<locals>.persistent_load(saved_id)
   1391 nbytes = numel * torch._utils._element_size(dtype)
-> 1392 typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
   1394 return typed_storage

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:1366, in _load.<locals>.load_tensor(dtype, numel, key, location)
   1363 # TODO: Once we decide to break serialization FC, we can
   1364 # stop wrapping with TypedStorage
   1365 typed_storage = torch.storage.TypedStorage(
-> 1366     wrap_storage=restore_location(storage, location),
   1367     dtype=dtype,
   1368     _internal=True)
   1370 if typed_storage._data_ptr() != 0:

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:1299, in _get_restore_location.<locals>.restore_location(storage, location)
   1298 def restore_location(storage, location):
-> 1299     return default_restore_location(storage, str(map_location))

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:381, in default_restore_location(storage, location)
    380 for _, _, fn in _package_registry:
--> 381     result = fn(storage, location)
    382     if result is not None:

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:274, in _cuda_deserialize(obj, location)
    273 if location.startswith('cuda'):
--> 274     device = validate_cuda_device(location)
    275     if getattr(obj, "_torch_load_uninitialized", False):

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:258, in validate_cuda_device(location)
    257 if not torch.cuda.is_available():
--> 258     raise RuntimeError('Attempting to deserialize object on a CUDA '
    259                        'device but torch.cuda.is_available() is False. '
    260                        'If you are running on a CPU-only machine, '
    261                        'please use torch.load with map_location=torch.device(\'cpu\') '
    262                        'to map your storages to the CPU.')
    263 device_count = torch.cuda.device_count()

RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
```

During handling of the above exception, another exception occurred:

```
SystemExit                                Traceback (most recent call last)
[... skipping hidden 1 frame]

Cell In[3], line 26
     15 #Example
     16 #lng="uk"
     17 #prompt= "Whisper, Ok. "
    (...)
     23 # +"Ok, Whisper. "
     24 #path="/path/to/your/uk/sound/file"
---> 26 loadModel("0")
     27 result = transcribePrompt(path=path, lng=lng, prompt=prompt)

File ~\Downloads\Jupyter\WhisperHallu\transcribeHallu.py:117, in loadModel(gpu, modelSize)
    116 print("Can't load Whisper model: "+whisperFound+"/"+modelSize)
--> 117 sys.exit(-1)

SystemExit: -1
```

During handling of the above exception, another exception occurred:

```
AttributeError                            Traceback (most recent call last)
[... skipping hidden 1 frame]

File ~\anaconda3\Lib\site-packages\IPython\core\interactiveshell.py:2097, in InteractiveShell.showtraceback(self, exc_tuple, filename, tb_offset, exception_only, running_compiled_code)
   2094 if exception_only:
   2095     stb = ['An exception has occurred, use %tb to see '
   2096            'the full traceback.\n']
-> 2097     stb.extend(self.InteractiveTB.get_exception_only(etype,
   2098                                                      value))
   2099 else:
   2101     def contains_exceptiongroup(val):

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:710, in ListTB.get_exception_only(self, etype, value)
    702 def get_exception_only(self, etype, value):
    703     """Only print the exception type and message, without a traceback.
    704
    705     Parameters
    (...)
    708     value : exception value
    709     """
--> 710     return ListTB.structured_traceback(self, etype, value)

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:568, in ListTB.structured_traceback(self, etype, evalue, etb, tb_offset, context)
    565 chained_exc_ids.add(id(exception[1]))
    566 chained_exceptions_tb_offset = 0
    567 out_list = (
--> 568     self.structured_traceback(
    569         etype,
    570         evalue,
    571         (etb, chained_exc_ids),  # type: ignore
    572         chained_exceptions_tb_offset,
    573         context,
    574     )
    575     + chained_exception_message
    576     + out_list)
    578 return out_list

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:1435, in AutoFormattedTB.structured_traceback(self, etype, evalue, etb, tb_offset, number_of_lines_of_context)
   1433 else:
   1434     self.tb = etb
-> 1435 return FormattedTB.structured_traceback(
   1436     self, etype, evalue, etb, tb_offset, number_of_lines_of_context
   1437 )

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:1326, in FormattedTB.structured_traceback(self, etype, value, tb, tb_offset, number_of_lines_of_context)
   1323 mode = self.mode
   1324 if mode in self.verbose_modes:
   1325     # Verbose modes need a full traceback
-> 1326     return VerboseTB.structured_traceback(
   1327         self, etype, value, tb, tb_offset, number_of_lines_of_context
   1328     )
   1329 elif mode == 'Minimal':
   1330     return ListTB.get_exception_only(self, etype, value)

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:1173, in VerboseTB.structured_traceback(self, etype, evalue, etb, tb_offset, number_of_lines_of_context)
   1164 def structured_traceback(
   1165     self,
   1166     etype: type,
    (...)
   1170     number_of_lines_of_context: int = 5,
   1171 ):
   1172     """Return a nice text document describing the traceback."""
-> 1173     formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context,
   1174                                                            tb_offset)
   1176 colors = self.Colors  # just a shorthand + quicker name lookup
   1177 colorsnormal = colors.Normal  # used a lot

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:1063, in VerboseTB.format_exception_as_a_whole(self, etype, evalue, etb, number_of_lines_of_context, tb_offset)
   1060 assert isinstance(tb_offset, int)
   1061 head = self.prepare_header(str(etype), self.long_header)
   1062 records = (
-> 1063     self.get_records(etb, number_of_lines_of_context, tb_offset) if etb else []
   1064 )
   1066 frames = []
   1067 skipped = 0

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:1131, in VerboseTB.get_records(self, etb, number_of_lines_of_context, tb_offset)
   1129 while cf is not None:
   1130     try:
-> 1131         mod = inspect.getmodule(cf.tb_frame)
   1132         if mod is not None:
   1133             mod_name = mod.__name__

AttributeError: 'tuple' object has no attribute 'tb_frame'
```

In my research, I discovered that the issue is related to the GPU. Here are my PC specifications:

- AMD Ryzen 7 6800H with Radeon Graphics, 3.20 GHz
- 16.0 GB RAM
- NVIDIA GeForce RTX 3070 Ti

Thank you in advance.

renatobrusarosco avatar Dec 09 '23 17:12 renatobrusarosco

@renatobrusarosco

Seems your problem is here:

```
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
```

You need a graphics card with CUDA installed.
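As an illustrative sketch (not part of WhisperHallu; the helper name `pick_device` is hypothetical), device selection can be made explicit so the model falls back to CPU when CUDA is unavailable instead of crashing during deserialization:

```python
def pick_device(cuda_available: bool, gpu_index: str = "0") -> str:
    """Return a torch device string: the requested GPU when CUDA is
    usable on this machine, otherwise fall back to the CPU."""
    return "cuda:" + gpu_index if cuda_available else "cpu"

# Typical usage (requires torch and openai-whisper installed):
#   import torch, whisper
#   device = pick_device(torch.cuda.is_available())
#   model = whisper.load_model("medium", device=device)
```

This mirrors the check `torch.cuda.is_available()` that the error message itself points at.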

Try to get Whisper working on its own first; then WhisperHallu should also work. https://github.com/openai/whisper

EtienneAb3d avatar Dec 11 '23 07:12 EtienneAb3d

I have an NVIDIA GeForce RTX 3070 Ti. After researching a bit, I found out that I needed to install CUDA in a specific way, and that problem was solved. However, another issue has arisen: when I run the final code, I always receive this message (I have the latest version of ffmpeg installed). Could you help me? Thank you.

```
RuntimeError                              Traceback (most recent call last)
Cell In[19], line 27
     15 # Example
     16 # lng = "uk"
     17 # prompt = "Whisper, Ok. "
    (...)
     23 # "Ok, Whisper. "
     24 # path = "/path/to/your/uk/sound/file"
     26 loadModel("0")
---> 27 result = transcribePrompt(path=path, lng=lng, prompt=prompt)

File ~\Downloads\Jupyter\WhisperHalluEnv\WhisperHallu\transcribeHallu.py:201, in transcribePrompt(path, lng, prompt, lngInput, isMusic, addSRT, truncDuration, maxDuration)
    199 print("PROMPT="+prompt,flush=True)
    200 opts = dict(language=lng,initial_prompt=prompt)
--> 201 return transcribeOpts(path, opts,lngInput,isMusic=isMusic,addSRT=addSRT,truncDuration=truncDuration,maxDuration=maxDuration)

File ~\Downloads\Jupyter\WhisperHalluEnv\WhisperHallu\transcribeHallu.py:265, in transcribeOpts(path, opts, lngInput, isMusic, onlySRT, addSRT, truncDuration, maxDuration)
    260 pathDemucs=pathIn+".vocals.wav" #demucsDir+"/htdemucs/"+os.path.splitext(os.path.basename(pathIn))[0]+"/vocals.wav"
    261 #Demucs seems complex, using CLI cmd for now
    262 #aCmd = "python -m demucs --two-stems=vocals -d "+device+":"+cudaIdx+" --out "+demucsDir+" "+pathIn
    263 #print("CMD: "+aCmd)
    264 #os.system(aCmd)
--> 265 demucs_audio(pathIn=pathIn,model=modelDemucs,device="cuda:"+cudaIdx,pathVocals=pathDemucs,pathOther=pathIn+".other.wav")
    266 print("T=",(time.time()-startTime))
    267 print("PATH="+pathDemucs,flush=True)

File ~\Downloads\Jupyter\WhisperHalluEnv\WhisperHallu\demucsWrapper.py:43, in demucs_audio(pathIn, model, device, pathVocals, pathOther)
     41 source_idx=model.sources.index(name)
     42 source=result[0, source_idx].mean(0)
---> 43 torchaudio.save(pathIn+"."+name+".wav", source[None], model.samplerate)

File ~\AppData\Roaming\Python\Python311\site-packages\torchaudio\_backend\utils.py:311, in get_save_func.<locals>.save(uri, src, sample_rate, channels_first, format, encoding, bits_per_sample, buffer_size, backend, compression)
    223 def save(
    224     uri: Union[BinaryIO, str, os.PathLike],
    225     src: torch.Tensor,
    (...)
    233     compression: Optional[Union[CodecConfig, float, int]] = None,
    234 ):
    235     """Save audio data to file.
    236
    237     Note:
    (...)
    309
    310     """
--> 311     backend = dispatcher(uri, format, backend)
    312     return backend.save(
    313         uri, src, sample_rate, channels_first, format, encoding, bits_per_sample, buffer_size, compression
    314     )

File ~\AppData\Roaming\Python\Python311\site-packages\torchaudio\_backend\utils.py:221, in get_save_func.<locals>.dispatcher(uri, format, backend_name)
    219 if backend.can_encode(uri, format):
    220     return backend
--> 221 raise RuntimeError(f"Couldn't find appropriate backend to handle uri {uri} and format {format}.")

RuntimeError: Couldn't find appropriate backend to handle uri data/KatyPerry-Firework.mp3.WAV.wav.drums.wav and format None.
```

renatobrusarosco avatar Dec 12 '23 15:12 renatobrusarosco

@renatobrusarosco

First, check that this file exists and is not empty at the moment the error occurs: `data/KatyPerry-Firework.mp3.WAV.wav.drums.wav`

Each processing step adds a suffix to the original file path and creates a specific LOG file. Check each of these LOG files to see whether the problem becomes clearer at one step or another.
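A minimal sketch of that inspection, assuming only the suffix convention described above (the helper `list_step_files` is illustrative, not part of WhisperHallu):

```python
import glob
import os

def list_step_files(path_in: str) -> dict:
    """Map every intermediate file derived from path_in (each processing
    step appends a suffix, e.g. .vocals.wav) to its size in bytes, so a
    missing or zero-byte output stands out at a glance."""
    return {p: os.path.getsize(p)
            for p in sorted(glob.glob(glob.escape(path_in) + "*"))}

# e.g. list_step_files("data/KatyPerry-Firework.mp3") would include
# data/KatyPerry-Firework.mp3.WAV.wav.drums.wav if that step produced it
```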

EtienneAb3d avatar Dec 12 '23 15:12 EtienneAb3d

For people facing this issue without GPU, here's how you can change it to CPU.

In https://github.com/EtienneAb3d/WhisperHallu/blob/main/transcribeHallu.py#L110, set device to cpu:

```python
model = whisper.load_model(modelSize, device=torch.device("cpu"))
```

Same thing in https://github.com/EtienneAb3d/WhisperHallu/blob/main/transcribeHallu.py#L265:

```python
demucs_audio(pathIn=pathIn, model=modelDemucs, device="cpu", pathVocals=pathDemucs, pathOther=pathIn+".other.wav")
```

You will need PyTorch compiled for CPU. I did:

```shell
pip uninstall torch
pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
```

It will take some time, but it works.
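One quick way to confirm which build ended up installed is to inspect `torch.__version__`: CPU wheels from the PyTorch index carry a `+cpu` local-version tag, CUDA wheels a `+cu118`-style tag, while untagged versions (common on macOS) are ambiguous. A small illustrative check (the helper `wheel_flavor` is hypothetical, and this tag heuristic is an assumption about PyTorch's wheel naming):

```python
def wheel_flavor(torch_version: str) -> str:
    """Classify a PyTorch version string by its local-version tag:
    '2.2.0+cpu' -> 'cpu', '2.2.0+cu118' -> 'cuda', '2.2.0' -> 'unknown'."""
    if "+" not in torch_version:
        return "unknown"
    tag = torch_version.split("+", 1)[1]
    if tag == "cpu":
        return "cpu"
    if tag.startswith("cu"):
        return "cuda"
    return "unknown"

# Usage (requires torch): wheel_flavor(torch.__version__)
```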

Tested on macOS Sonoma 14.1, M2.

gmmarc avatar Jan 05 '24 10:01 gmmarc