
Can't load Whisper Model

Open renatobrusarosco opened this issue 1 year ago • 4 comments

Hi there,

I don't have much programming knowledge and I'm struggling with your tool. I tried to use it in a Jupyter notebook (Anaconda). I installed all the required libraries, but when I attempted to run the final code, I encountered the same error message multiple times:

```
Python >= 3.10
Using cache found in C:\Users\renat/.cache\torch\hub\snakers4_silero-vad_master
Using Demucs
Using standard Whisper
LOADING: medium GPU:0 BS: 2
100%|█████████████████████████████████████| 1.42G/1.42G [00:25<00:00, 59.9MiB/s]
Can't load Whisper model: STD/medium
```

Below you can find more information:

```
RuntimeError                              Traceback (most recent call last)
File ~\Downloads\Jupyter\WhisperHallu\transcribeHallu.py:110, in loadModel(gpu, modelSize)
    109     print("LOADING: "+modelSize+" GPU:"+gpu+" BS: "+str(beam_size))
--> 110     model = whisper.load_model(modelSize,device=torch.device("cuda:"+gpu)) #May be "cpu"
    111 elif whisperFound == "SM4T":

File ~\AppData\Roaming\Python\Python311\site-packages\whisper\__init__.py:146, in load_model(name, device, download_root, in_memory)
    143 with (
    144     io.BytesIO(checkpoint_file) if in_memory else open(checkpoint_file, "rb")
    145 ) as fp:
--> 146     checkpoint = torch.load(fp, map_location=device)
    147 del checkpoint_file

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:1014, in load(f, map_location, pickle_module, weights_only, mmap, **pickle_load_args)
   1013     raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None
-> 1014 return _load(opened_zipfile,
   1015              map_location,
   1016              pickle_module,
   1017              overall_storage=overall_storage,
   1018              **pickle_load_args)
   1019 if mmap:

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:1422, in _load(zip_file, map_location, pickle_module, pickle_file, overall_storage, **pickle_load_args)
   1421 unpickler.persistent_load = persistent_load
-> 1422 result = unpickler.load()
   1424 torch._utils._validate_loaded_sparse_tensors()

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:1392, in _load.<locals>.persistent_load(saved_id)
   1391 nbytes = numel * torch._utils._element_size(dtype)
-> 1392 typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
   1394 return typed_storage

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:1366, in _load.<locals>.load_tensor(dtype, numel, key, location)
   1363 # TODO: Once we decide to break serialization FC, we can
   1364 # stop wrapping with TypedStorage
   1365 typed_storage = torch.storage.TypedStorage(
-> 1366     wrap_storage=restore_location(storage, location),
   1367     dtype=dtype,
   1368     _internal=True)
   1370 if typed_storage._data_ptr() != 0:

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:1299, in _get_restore_location.<locals>.restore_location(storage, location)
   1298 def restore_location(storage, location):
-> 1299     return default_restore_location(storage, str(map_location))

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:381, in default_restore_location(storage, location)
    380 for _, _, fn in _package_registry:
--> 381     result = fn(storage, location)
    382     if result is not None:

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:274, in _cuda_deserialize(obj, location)
    273 if location.startswith('cuda'):
--> 274     device = validate_cuda_device(location)
    275     if getattr(obj, "_torch_load_uninitialized", False):

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:258, in validate_cuda_device(location)
    257 if not torch.cuda.is_available():
--> 258     raise RuntimeError('Attempting to deserialize object on a CUDA '
    259                        'device but torch.cuda.is_available() is False. '
    260                        'If you are running on a CPU-only machine, '
    261                        'please use torch.load with map_location=torch.device(\'cpu\') '
    262                        'to map your storages to the CPU.')
    263 device_count = torch.cuda.device_count()

RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
```

During handling of the above exception, another exception occurred:

```
SystemExit                                Traceback (most recent call last)
[... skipping hidden 1 frame]

Cell In[3], line 26
     15 #Example
     16 #lng="uk"
     17 #prompt= "Whisper, Ok. "
    (...)
     23 # +"Ok, Whisper. "
     24 #path="/path/to/your/uk/sound/file"
---> 26 loadModel("0")
     27 result = transcribePrompt(path=path, lng=lng, prompt=prompt)

File ~\Downloads\Jupyter\WhisperHallu\transcribeHallu.py:117, in loadModel(gpu, modelSize)
    116 print("Can't load Whisper model: "+whisperFound+"/"+modelSize)
--> 117 sys.exit(-1)

SystemExit: -1
```

During handling of the above exception, another exception occurred:

```
AttributeError                            Traceback (most recent call last)
[... skipping hidden 1 frame]

File ~\anaconda3\Lib\site-packages\IPython\core\interactiveshell.py:2097, in InteractiveShell.showtraceback(self, exc_tuple, filename, tb_offset, exception_only, running_compiled_code)
   2094 if exception_only:
   2095     stb = ['An exception has occurred, use %tb to see '
   2096            'the full traceback.\n']
-> 2097     stb.extend(self.InteractiveTB.get_exception_only(etype,
   2098                                                      value))
   2099 else:
   2101     def contains_exceptiongroup(val):

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:710, in ListTB.get_exception_only(self, etype, value)
    702 def get_exception_only(self, etype, value):
    703     """Only print the exception type and message, without a traceback.
    704
    705     Parameters
    (...)
    708     value : exception value
    709     """
--> 710     return ListTB.structured_traceback(self, etype, value)

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:568, in ListTB.structured_traceback(self, etype, evalue, etb, tb_offset, context)
    565 chained_exc_ids.add(id(exception[1]))
    566 chained_exceptions_tb_offset = 0
    567 out_list = (
--> 568     self.structured_traceback(
    569         etype,
    570         evalue,
    571         (etb, chained_exc_ids),  # type: ignore
    572         chained_exceptions_tb_offset,
    573         context,
    574     )
    575     + chained_exception_message
    576     + out_list)
    578 return out_list

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:1435, in AutoFormattedTB.structured_traceback(self, etype, evalue, etb, tb_offset, number_of_lines_of_context)
   1433 else:
   1434     self.tb = etb
-> 1435 return FormattedTB.structured_traceback(
   1436     self, etype, evalue, etb, tb_offset, number_of_lines_of_context
   1437 )

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:1326, in FormattedTB.structured_traceback(self, etype, value, tb, tb_offset, number_of_lines_of_context)
   1323 mode = self.mode
   1324 if mode in self.verbose_modes:
   1325     # Verbose modes need a full traceback
-> 1326     return VerboseTB.structured_traceback(
   1327         self, etype, value, tb, tb_offset, number_of_lines_of_context
   1328     )
   1329 elif mode == 'Minimal':
   1330     return ListTB.get_exception_only(self, etype, value)

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:1173, in VerboseTB.structured_traceback(self, etype, evalue, etb, tb_offset, number_of_lines_of_context)
   1164 def structured_traceback(
   1165     self,
   1166     etype: type,
    (...)
   1170     number_of_lines_of_context: int = 5,
   1171 ):
   1172     """Return a nice text document describing the traceback."""
-> 1173     formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context,
   1174                                                            tb_offset)
   1176 colors = self.Colors  # just a shorthand + quicker name lookup
   1177 colorsnormal = colors.Normal  # used a lot

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:1063, in VerboseTB.format_exception_as_a_whole(self, etype, evalue, etb, number_of_lines_of_context, tb_offset)
   1060 assert isinstance(tb_offset, int)
   1061 head = self.prepare_header(str(etype), self.long_header)
   1062 records = (
-> 1063     self.get_records(etb, number_of_lines_of_context, tb_offset) if etb else []
   1064 )
   1066 frames = []
   1067 skipped = 0

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:1131, in VerboseTB.get_records(self, etb, number_of_lines_of_context, tb_offset)
   1129 while cf is not None:
   1130     try:
-> 1131         mod = inspect.getmodule(cf.tb_frame)
   1132         if mod is not None:
   1133             mod_name = mod.__name__

AttributeError: 'tuple' object has no attribute 'tb_frame'
```

In my research, I discovered that the issue is related to the GPU. Here are my PC specifications:

- AMD Ryzen 7 6800H with Radeon Graphics, 3.20 GHz
- 16.0 GB RAM
- NVIDIA GeForce RTX 3070 Ti

Thank you in advance.

renatobrusarosco avatar Dec 09 '23 17:12 renatobrusarosco

@renatobrusarosco

Seems your problem is here:

```
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
```

You need a graphics card with CUDA installed.
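As an illustrative sketch (not part of WhisperHallu; the helper name `pick_device` is hypothetical), device selection can be made explicit so the model falls back to CPU when CUDA is unavailable instead of crashing during deserialization:

```python
def pick_device(cuda_available: bool, gpu_index: str = "0") -> str:
    """Return a torch device string: the requested GPU when CUDA is
    usable on this machine, otherwise fall back to the CPU."""
    return "cuda:" + gpu_index if cuda_available else "cpu"

# Typical usage (requires torch and openai-whisper installed):
#   import torch, whisper
#   device = pick_device(torch.cuda.is_available())
#   model = whisper.load_model("medium", device=device)
```

This mirrors the check `torch.cuda.is_available()` that the error message itself points at.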

Try to get Whisper working on its own first; then WhisperHallu should also work. https://github.com/openai/whisper

EtienneAb3d avatar Dec 11 '23 07:12 EtienneAb3d

I have an NVIDIA GeForce RTX 3070 Ti. After researching a bit, I found out that I needed to install CUDA in a specific way, and that problem was solved. However, another issue has arisen: when I run the final code, I always receive this message (I have the latest version of ffmpeg installed). Could you help me? Thank you.

```
RuntimeError                              Traceback (most recent call last)
Cell In[19], line 27
     15 # Example
     16 # lng = "uk"
     17 # prompt = "Whisper, Ok. "
    (...)
     23 # "Ok, Whisper. "
     24 # path = "/path/to/your/uk/sound/file"
     26 loadModel("0")
---> 27 result = transcribePrompt(path=path, lng=lng, prompt=prompt)

File ~\Downloads\Jupyter\WhisperHalluEnv\WhisperHallu\transcribeHallu.py:201, in transcribePrompt(path, lng, prompt, lngInput, isMusic, addSRT, truncDuration, maxDuration)
    199 print("PROMPT="+prompt,flush=True)
    200 opts = dict(language=lng,initial_prompt=prompt)
--> 201 return transcribeOpts(path, opts,lngInput,isMusic=isMusic,addSRT=addSRT,truncDuration=truncDuration,maxDuration=maxDuration)

File ~\Downloads\Jupyter\WhisperHalluEnv\WhisperHallu\transcribeHallu.py:265, in transcribeOpts(path, opts, lngInput, isMusic, onlySRT, addSRT, truncDuration, maxDuration)
    260 pathDemucs=pathIn+".vocals.wav" #demucsDir+"/htdemucs/"+os.path.splitext(os.path.basename(pathIn))[0]+"/vocals.wav"
    261 #Demucs seems complex, using CLI cmd for now
    262 #aCmd = "python -m demucs --two-stems=vocals -d "+device+":"+cudaIdx+" --out "+demucsDir+" "+pathIn
    263 #print("CMD: "+aCmd)
    264 #os.system(aCmd)
--> 265 demucs_audio(pathIn=pathIn,model=modelDemucs,device="cuda:"+cudaIdx,pathVocals=pathDemucs,pathOther=pathIn+".other.wav")
    266 print("T=",(time.time()-startTime))
    267 print("PATH="+pathDemucs,flush=True)

File ~\Downloads\Jupyter\WhisperHalluEnv\WhisperHallu\demucsWrapper.py:43, in demucs_audio(pathIn, model, device, pathVocals, pathOther)
     41 source_idx=model.sources.index(name)
     42 source=result[0, source_idx].mean(0)
---> 43 torchaudio.save(pathIn+"."+name+".wav", source[None], model.samplerate)

File ~\AppData\Roaming\Python\Python311\site-packages\torchaudio\_backend\utils.py:311, in get_save_func.<locals>.save(uri, src, sample_rate, channels_first, format, encoding, bits_per_sample, buffer_size, backend, compression)
    223 def save(
    224     uri: Union[BinaryIO, str, os.PathLike],
    225     src: torch.Tensor,
    (...)
    233     compression: Optional[Union[CodecConfig, float, int]] = None,
    234 ):
    235     """Save audio data to file.
    236
    237     Note:
    (...)
    309
    310     """
--> 311     backend = dispatcher(uri, format, backend)
    312     return backend.save(
    313         uri, src, sample_rate, channels_first, format, encoding, bits_per_sample, buffer_size, compression
    314     )

File ~\AppData\Roaming\Python\Python311\site-packages\torchaudio\_backend\utils.py:221, in get_save_func.<locals>.dispatcher(uri, format, backend_name)
    219 if backend.can_encode(uri, format):
    220     return backend
--> 221 raise RuntimeError(f"Couldn't find appropriate backend to handle uri {uri} and format {format}.")

RuntimeError: Couldn't find appropriate backend to handle uri data/KatyPerry-Firework.mp3.WAV.wav.drums.wav and format None.
```

renatobrusarosco avatar Dec 12 '23 15:12 renatobrusarosco

@renatobrusarosco

First, check that this file exists and is not empty at the moment the error occurs: `data/KatyPerry-Firework.mp3.WAV.wav.drums.wav`

Each processing step adds a suffix to the original file path and creates a specific LOG file. Check each of these LOG files to see whether the problem becomes clearer at one step or another.
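A minimal sketch of that inspection, assuming only the suffix convention described above (the helper `list_step_files` is illustrative, not part of WhisperHallu):

```python
import glob
import os

def list_step_files(path_in: str) -> dict:
    """Map every intermediate file derived from path_in (each processing
    step appends a suffix, e.g. .vocals.wav) to its size in bytes, so a
    missing or zero-byte output stands out at a glance."""
    return {p: os.path.getsize(p)
            for p in sorted(glob.glob(glob.escape(path_in) + "*"))}

# e.g. list_step_files("data/KatyPerry-Firework.mp3") would include
# data/KatyPerry-Firework.mp3.WAV.wav.drums.wav if that step produced it
```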

EtienneAb3d avatar Dec 12 '23 15:12 EtienneAb3d

For people facing this issue without GPU, here's how you can change it to CPU.

In https://github.com/EtienneAb3d/WhisperHallu/blob/main/transcribeHallu.py#L110, set device to cpu:

```python
model = whisper.load_model(modelSize, device=torch.device("cpu"))
```

Same thing in https://github.com/EtienneAb3d/WhisperHallu/blob/main/transcribeHallu.py#L265:

```python
demucs_audio(pathIn=pathIn, model=modelDemucs, device="cpu", pathVocals=pathDemucs, pathOther=pathIn+".other.wav")
```

You will need PyTorch compiled for CPU. I did:

```shell
pip uninstall torch
pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
```

It will take some time, but it works.
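One quick way to confirm which build ended up installed is to inspect `torch.__version__`: CPU wheels from the PyTorch index carry a `+cpu` local-version tag, CUDA wheels a `+cu118`-style tag, while untagged versions (common on macOS) are ambiguous. A small illustrative check (the helper `wheel_flavor` is hypothetical, and this tag heuristic is an assumption about PyTorch's wheel naming):

```python
def wheel_flavor(torch_version: str) -> str:
    """Classify a PyTorch version string by its local-version tag:
    '2.2.0+cpu' -> 'cpu', '2.2.0+cu118' -> 'cuda', '2.2.0' -> 'unknown'."""
    if "+" not in torch_version:
        return "unknown"
    tag = torch_version.split("+", 1)[1]
    if tag == "cpu":
        return "cpu"
    if tag.startswith("cu"):
        return "cuda"
    return "unknown"

# Usage (requires torch): wheel_flavor(torch.__version__)
```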

Tested on macOS Sonoma 14.1, M2.

gmmarc avatar Jan 05 '24 10:01 gmmarc