Retrieval-based-Voice-Conversion-WebUI Error when using Crepe for inference

Whenever I use Crepe for inference, I get the error below:

Traceback (most recent call last):
  File "C:\Users\doop\Retrieval-based-Voice-Conversion-WebUI\infer-web.py", line 184, in vc_single
    audio_opt = vc.pipeline(
  File "C:\Users\doop\Retrieval-based-Voice-Conversion-WebUI\vc_infer_pipeline.py", line 328, in pipeline
    pitch, pitchf = self.get_f0(
  File "C:\Users\doop\Retrieval-based-Voice-Conversion-WebUI\vc_infer_pipeline.py", line 112, in get_f0
    f0, pd = torchcrepe.predict(
  File "C:\Users\doop\Retrieval-based-Voice-Conversion-WebUI\venv\lib\site-packages\torchcrepe\core.py", line 127, in predict
    result = postprocess(probabilities,
  File "C:\Users\doop\Retrieval-based-Voice-Conversion-WebUI\venv\lib\site-packages\torchcrepe\core.py", line 605, in postprocess
    bins, pitch = decoder(probabilities)
  File "C:\Users\doop\Retrieval-based-Voice-Conversion-WebUI\venv\lib\site-packages\torchcrepe\decode.py", line 76, in viterbi
    bins = torch.tensor(bins, device=probs.device)
TypeError: can't convert np.ndarray of type numpy.uint16. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.

Traceback (most recent call last):
  File "C:\Users\doop\Retrieval-based-Voice-Conversion-WebUI\venv\lib\site-packages\gradio\routes.py", line 393, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\Users\doop\Retrieval-based-Voice-Conversion-WebUI\venv\lib\site-packages\gradio\blocks.py", line 1111, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
  File "C:\Users\doop\Retrieval-based-Voice-Conversion-WebUI\venv\lib\site-packages\gradio\blocks.py", line 1045, in postprocess_data
    prediction_value = block.postprocess(prediction_value)
  File "C:\Users\doop\Retrieval-based-Voice-Conversion-WebUI\venv\lib\site-packages\gradio\components.py", line 2423, in postprocess
    processing_utils.audio_to_file(sample_rate, data, file.name)
  File "C:\Users\doop\Retrieval-based-Voice-Conversion-WebUI\venv\lib\site-packages\gradio\processing_utils.py", line 160, in audio_to_file
    data = convert_to_16_bit_wav(data)
  File "C:\Users\doop\Retrieval-based-Voice-Conversion-WebUI\venv\lib\site-packages\gradio\processing_utils.py", line 174, in convert_to_16_bit_wav
    if data.dtype in [np.float64, np.float32, np.float16]:
AttributeError: 'NoneType' object has no attribute 'dtype'

I've tried different models and reinstalling, and it still gives me the error. PM and Harvest both work fine.
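The second traceback looks like a knock-on effect of the first: the pitch step fails, the error is printed instead of re-raised, and Gradio then receives `None` where it expects audio. Below is a minimal sketch of that chain, assuming the WebUI wraps the pipeline call in a try/except. All function names here are hypothetical stand-ins, not the actual code from `infer-web.py` or Gradio.

```python
import traceback


def pitch_step():
    # Stand-in for the failing torchcrepe call from the first traceback.
    raise TypeError("can't convert np.ndarray of type numpy.uint16")


def vc_single_sketch():
    # Hypothetical wrapper: on failure it returns the traceback text
    # and None in place of the (sample_rate, audio) pair.
    try:
        return "Success", (40000, pitch_step())
    except Exception:
        return traceback.format_exc(), (None, None)


def to_wav_sketch(data):
    # Mirrors the kind of attribute access that fails in the second traceback.
    return data.dtype


info, (sample_rate, audio) = vc_single_sketch()
print(audio)  # None, because the pitch step raised

try:
    to_wav_sketch(audio)
except AttributeError as err:
    print(err)
```

If that reading is right, only the `TypeError` needs fixing and the `AttributeError` should disappear with it.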

Running with a GTX 1070 and an i7-8700k

May 29 '23 05:05 JonaldJohnston

Would you mind telling me the version of torchcrepe?

May 29 '23 07:05 RVC-Boss

And is it Windows or Linux?

May 29 '23 07:05 RVC-Boss

Similar problem here. Python 3.10.11, Arch Linux, GTX 2080 Ti.

May 29 19:02:22 rvc python3[32693]: Traceback (most recent call last):
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/infer-web.py", line 184, in vc_single
May 29 19:02:22 rvc python3[32693]:     audio_opt = vc.pipeline(
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/vc_infer_pipeline.py", line 328, in pipeline
May 29 19:02:22 rvc python3[32693]:     pitch, pitchf = self.get_f0(
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/vc_infer_pipeline.py", line 112, in get_f0
May 29 19:02:22 rvc python3[32693]:     f0, pd = torchcrepe.predict(
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/venv/lib/python3.10/site-packages/torchcrepe/core.py", line 127, in predict
May 29 19:02:22 rvc python3[32693]:     result = postprocess(probabilities,
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/venv/lib/python3.10/site-packages/torchcrepe/core.py", line 605, in postprocess
May 29 19:02:22 rvc python3[32693]:     bins, pitch = decoder(probabilities)
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/venv/lib/python3.10/site-packages/torchcrepe/decode.py", line 76, in viterbi
May 29 19:02:22 rvc python3[32693]:     bins = torch.tensor(bins, device=probs.device)
May 29 19:02:22 rvc python3[32693]: TypeError: can't convert np.ndarray of type numpy.uint16. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.
May 29 19:02:22 rvc python3[32693]: loading weights/moist.pth
May 29 19:02:22 rvc python3[32693]: gin_channels: 256 self.spk_embed_dim: 109
May 29 19:02:22 rvc python3[32693]: <All keys matched successfully>
May 29 19:02:22 rvc python3[32693]: Traceback (most recent call last):
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/infer-web.py", line 184, in vc_single
May 29 19:02:22 rvc python3[32693]:     audio_opt = vc.pipeline(
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/vc_infer_pipeline.py", line 328, in pipeline
May 29 19:02:22 rvc python3[32693]:     pitch, pitchf = self.get_f0(
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/vc_infer_pipeline.py", line 112, in get_f0
May 29 19:02:22 rvc python3[32693]:     f0, pd = torchcrepe.predict(
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/venv/lib/python3.10/site-packages/torchcrepe/core.py", line 127, in predict
May 29 19:02:22 rvc python3[32693]:     result = postprocess(probabilities,
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/venv/lib/python3.10/site-packages/torchcrepe/core.py", line 605, in postprocess
May 29 19:02:22 rvc python3[32693]:     bins, pitch = decoder(probabilities)
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/venv/lib/python3.10/site-packages/torchcrepe/decode.py", line 76, in viterbi
May 29 19:02:22 rvc python3[32693]:     bins = torch.tensor(bins, device=probs.device)
May 29 19:02:22 rvc python3[32693]: TypeError: can't convert np.ndarray of type numpy.uint16. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.
May 29 19:02:22 rvc python3[32693]: loading weights/ayaka-jp.pth
May 29 19:02:22 rvc python3[32693]: gin_channels: 256 self.spk_embed_dim: 109
May 29 19:02:22 rvc python3[32693]: <All keys matched successfully>
May 29 19:02:22 rvc python3[32693]: Traceback (most recent call last):
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/infer-web.py", line 184, in vc_single
May 29 19:02:22 rvc python3[32693]:     audio_opt = vc.pipeline(
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/vc_infer_pipeline.py", line 328, in pipeline
May 29 19:02:22 rvc python3[32693]:     pitch, pitchf = self.get_f0(
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/vc_infer_pipeline.py", line 112, in get_f0
May 29 19:02:22 rvc python3[32693]:     f0, pd = torchcrepe.predict(
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/venv/lib/python3.10/site-packages/torchcrepe/core.py", line 127, in predict
May 29 19:02:22 rvc python3[32693]:     result = postprocess(probabilities,
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/venv/lib/python3.10/site-packages/torchcrepe/core.py", line 605, in postprocess
May 29 19:02:22 rvc python3[32693]:     bins, pitch = decoder(probabilities)
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/venv/lib/python3.10/site-packages/torchcrepe/decode.py", line 76, in viterbi
May 29 19:02:22 rvc python3[32693]:     bins = torch.tensor(bins, device=probs.device)
May 29 19:02:22 rvc python3[32693]: TypeError: can't convert np.ndarray of type numpy.uint16. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.
May 29 19:02:22 rvc python3[32693]: Traceback (most recent call last):
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/venv/lib/python3.10/site-packages/gradio/routes.py", line 422, in run_predict
May 29 19:02:22 rvc python3[32693]:     output = await app.get_blocks().process_api(
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1326, in process_api
May 29 19:02:22 rvc python3[32693]:     data = self.postprocess_data(fn_index, result["prediction"], state)
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1260, in postprocess_data
May 29 19:02:22 rvc python3[32693]:     prediction_value = block.postprocess(prediction_value)
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/venv/lib/python3.10/site-packages/gradio/components.py", line 2586, in postprocess
May 29 19:02:22 rvc python3[32693]:     file_path = self.audio_to_temp_file(
May 29 19:02:22 rvc python3[32693]:   File "/home/username/Retrieval-based-Voice-Conversion-WebUI/venv/lib/python3.10/site-packages/gradio/components.py", line 360, in audio_to_temp_file
May 29 19:02:22 rvc python3[32693]:     temp_dir = Path(dir) / self.hash_bytes(data.tobytes())
May 29 19:02:22 rvc python3[32693]: AttributeError: 'NoneType' object has no attribute 'tobytes'

May 29 '23 16:05 reyafyi

Would you mind telling me the version of torchcrepe?

pip says torchcrepe is version 0.0.15.

Also, I am running Windows 10.

May 29 '23 18:05 JonaldJohnston

Would you mind telling me the version of torchcrepe?

pip says torchcrepe is version 0.0.15.

Also, I am running Windows 10.

I had the same problem. You need the latest torchcrepe version, 0.0.19, to run inference with Crepe.

Try this command:

pip install --upgrade --no-deps --force-reinstall torchcrepe

Solution source: https://stackoverflow.com/a/27254355
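To see which version is actually installed in the venv before and after the reinstall, a small check like the one below might help. It is only a sketch: the 0.0.18 threshold is an assumption based on the reports in this thread (0.0.15 fails, 0.0.18 reportedly works), and `parse_version`, `needs_upgrade`, and `installed_torchcrepe` are hypothetical helper names.

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.0.18' into (0, 0, 18) so versions compare numerically."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def needs_upgrade(installed, minimum="0.0.18"):
    # The minimum is an assumption taken from this thread, not an official requirement.
    return parse_version(installed) < parse_version(minimum)


def installed_torchcrepe():
    """Return the installed torchcrepe version string, or None if it is missing."""
    try:
        return metadata.version("torchcrepe")
    except metadata.PackageNotFoundError:
        return None


version = installed_torchcrepe()
if version is None:
    print("torchcrepe is not installed in this environment")
elif needs_upgrade(version):
    print(f"torchcrepe {version} is older than 0.0.18; an upgrade may help")
else:
    print(f"torchcrepe {version} should be recent enough")
```

Run it with the Python interpreter from the WebUI's venv, otherwise it reports on a different environment.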

May 30 '23 02:05 rikimtasu

Try torchcrepe==0.0.18

May 30 '23 02:05 RVC-Boss

Installing torchcrepe 0.0.18 seems to have worked, thanks!

May 30 '23 05:05 JonaldJohnston

This fix isn't working anymore :( Any other ideas?
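One thing that might be worth trying if changing the torchcrepe version does not help: the TypeError itself hints at a workaround, namely casting the array to one of the listed dtypes before it reaches `torch.tensor`. The sketch below uses NumPy only and does not reproduce torchcrepe's code. `to_supported_dtype` is a hypothetical helper, and patching the installed package this way is an untested assumption.

```python
import numpy as np

# dtypes listed as supported in the TypeError message quoted in this thread
SUPPORTED = {
    "float64", "float32", "float16", "complex64", "complex128",
    "int64", "int32", "int16", "int8", "uint8", "bool",
}


def to_supported_dtype(array):
    """Return the array unchanged if its dtype is supported, else widen it to int64."""
    if array.dtype.name in SUPPORTED:
        return array
    # uint16 and uint32 values always fit into int64, so the cast is lossless.
    if array.dtype.kind == "u" and array.dtype.itemsize < 8:
        return array.astype(np.int64)
    raise TypeError(f"no safe conversion for dtype {array.dtype}")


bins = np.array([0, 120, 359], dtype=np.uint16)  # made-up example values
safe_bins = to_supported_dtype(bins)
print(safe_bins.dtype)  # int64
```

An array converted like this has a dtype from the supported list, so the conversion that fails in `decode.py` should accept it.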

Jun 14 '23 22:06 joeprice20

Would you mind telling me the version of torchcrepe?

pip says torchcrepe is version 0.0.15. Also, I am running Windows 10.

I had the same problem. You need the latest torchcrepe version, 0.0.19, to run inference with Crepe.

Try this command:

pip install --upgrade --no-deps --force-reinstall torchcrepe

Solution source: https://stackoverflow.com/a/27254355

Thanks, this fixed the issue for me!

Jul 05 '23 00:07 Skizzie

Same problem here.

Aug 18 '23 20:08 yanakill
