Error when fine tuning whisper model
I followed this blog post to fine-tune the Whisper model on a custom dataset, but after training, when I run this command
trainer.push_to_hub(**kwargs)
it throws this error
HTTPError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py in hf_raise_for_status(response, endpoint_name)
260 try:
--> 261 response.raise_for_status()
262 except HTTPError as e:
9 frames
HTTPError: 400 Client Error: Bad Request for url: https://huggingface.co/api/models/valacodes/whisper-small-hausa/commit/main
The above exception was the direct cause of the following exception:
BadRequestError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py in hf_raise_for_status(response, endpoint_name)
297 f"\n\nBad request for {endpoint_name} endpoint:" if endpoint_name is not None else "\n\nBad request:"
298 )
--> 299 raise BadRequestError(message, response=response) from e
300
301 # Convert HTTPError into a HfHubHTTPError to display request information
BadRequestError: (Request ID: Root=1-65271724-79a6b33830e49217395944e2;736a08e9-3998-4e6f-b43e-86df049f04ed)
Bad request for commit endpoint:
"model-index[0].results[0].dataset.config" must be a string
Visiting the hf-speech-bench webpage also shows this error:
TypeError: string indices must be integers
Traceback:
File "/home/user/.local/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 552, in _run_script
exec(code, module.__dict__)
File "/home/user/app/app.py", line 143, in <module>
dataframe = get_data()
File "/home/user/.local/lib/python3.10/site-packages/streamlit/runtime/legacy_caching/caching.py", line 715, in wrapped_func
return get_or_create_cached_value()
File "/home/user/.local/lib/python3.10/site-packages/streamlit/runtime/legacy_caching/caching.py", line 696, in get_or_create_cached_value
return_value = non_optional_func(*args, **kwargs)
File "/home/user/app/app.py", line 107, in get_data
for row in parse_metrics_rows(meta):
File "/home/user/app/app.py", line 72, in parse_metrics_rows
lang = result["dataset"]["args"]["language"]
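For context, Python raises this `TypeError` when a `str` is indexed with a string key, so one of the values in the `result["dataset"]["args"]["language"]` chain is probably a plain string rather than a dict. Below is a minimal reproduction, assuming the `args` field was written as a string; the entry and all of its values are hypothetical, not taken from the actual model card, and `get_language` is an illustrative helper rather than part of app.py.

```python
# Hypothetical model-index entry; every value here is illustrative only.
result = {
    "dataset": {
        "name": "Example Dataset",
        "type": "example-org/example-dataset",
        "args": "config: ha, split: test",  # a string, not a dict
    }
}


def get_language(result):
    """Return the language from a result entry, or None if args is not a dict."""
    args = result.get("dataset", {}).get("args")
    if isinstance(args, dict):
        return args.get("language")
    return None


try:
    # Same lookup as in the traceback above; fails because args is a str.
    lang = result["dataset"]["args"]["language"]
except TypeError as err:
    lang = None
    error_message = str(err)

# The defensive lookup returns None instead of raising.
safe_lang = get_language(result)
```

This only shows how the lookup can fail; which field is actually a string in the pushed metadata would need to be checked in the model card itself.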
@valaofficial I have the same error, any news?
cc @sanchit-gandhi :)
I came here after having trouble figuring out the kwargs and what the push_to_hub method expects.
However, I did manage to publish the model and use it with Gradio by adding the following:
trainer.save_model()
trainer.push_to_hub()
tokenizer.push_to_hub("username/model-id")
Doc Links
Has anyone found a solution to the issue? I am experiencing the same problem.
I assume you're also following this notebook: https://colab.research.google.com/github/sanchit-gandhi/notebooks/blob/main/fine_tune_whisper.ipynb. Commenting out "dataset_tags" from **kwargs worked for me, although I'm not sure why.
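A minimal sketch of that workaround is below. The kwargs values are placeholders in the style of the notebook, not the exact values from any post here, and `drop_keys` is a hypothetical helper. `trainer` is assumed to be the already-trained trainer object, so the push call is left commented out.

```python
def drop_keys(kwargs, keys_to_drop):
    """Return a copy of kwargs without the given keys (hypothetical helper)."""
    return {k: v for k, v in kwargs.items() if k not in keys_to_drop}


# Placeholder model-card kwargs; replace these with your own values.
kwargs = {
    "dataset_tags": "your-org/your-dataset",
    "dataset": "Your Dataset Name",
    "language": "ha",
    "model_name": "Whisper Small Hausa (example)",
    "finetuned_from": "openai/whisper-small",
    "tasks": "automatic-speech-recognition",
}

# Same effect as commenting out "dataset_tags" in the notebook cell.
push_kwargs = drop_keys(kwargs, {"dataset_tags"})

# trainer.push_to_hub(**push_kwargs)  # assumes `trainer` exists from training
```

Note that this leaves the dataset tag off the generated model card, so it may need to be added to the card by hand afterwards.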
I get the same problem when following Unit 4 of the Audio Course.