audiocraft
IndexError: index 4 is out of range
I'd like to try your pre-trained stereo models. But when I generate the sample I get this error. I'm using MusicGen's demo jupyter notebook to create audio.
Please share your callstack, env, other details relevant to reproduce. It could be that you just need to re-install audiocraft, see: https://github.com/facebookresearch/audiocraft?tab=readme-ov-file#installation
Hello, I get the same error when generating music with a stereo model and MultiBand Diffusion (MBD) enabled. Is MBD not supported with stereo models? MBD works with the old mono models, and the stereo models work without MBD. I would just like to combine MBD with a stereo model to get the best output quality. I'm using audiocraft through the following front end: https://github.com/rsxdalv/tts-generation-webui
Python 3.10.9
Main dependency versions:
audiocraft 1.3.0a1
torch 2.1.2+cu121
torchaudio 2.1.2+cu121
xformers 0.0.23.post1
I also tested with torch 2.0.0 and xformers 0.0.20 and got the same error.
Parameters used in the test run:
text : 80s synth pop
melody : None
model : facebook/musicgen-stereo-large
duration : 1
topk : 250
topp : 0
temperature : 1
cfg_coef : 3
seed : 3792762101
use_multi_band_diffusion : True
Callstack:
Traceback (most recent call last):
  File "Audiocraft\torch_201Venv\lib\site-packages\gradio\queueing.py", line 407, in call_prediction
    output = await route_utils.call_process_api(
  File "Audiocraft\torch_201Venv\lib\site-packages\gradio\route_utils.py", line 226, in call_process_api
    output = await app.get_blocks().process_api(
  File "Audiocraft\torch_201Venv\lib\site-packages\gradio\blocks.py", line 1550, in process_api
    result = await self.call_function(
  File "Audiocraft\torch_201Venv\lib\site-packages\gradio\blocks.py", line 1185, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "Audiocraft\torch_201Venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "Audiocraft\torch_201Venv\lib\site-packages\anyio\_backends\_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "Audiocraft\torch_201Venv\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "Audiocraft\torch_201Venv\lib\site-packages\gradio\utils.py", line 661, in wrapper
    response = f(*args, **kwargs)
  File "Audiocraft\tts-generation-webui\src\musicgen\musicgen_tab.py", line 209, in generate
    wav_diffusion = mbd.tokens_to_wav(tokens, 32)
  File "Audiocraft\torch_201Venv\lib\site-packages\audiocraft\models\multibanddiffusion.py", line 188, in tokens_to_wav
    wav_encodec = self.codec_model.decode(tokens)
  File "Audiocraft\torch_201Venv\lib\site-packages\audiocraft\models\encodec.py", line 251, in decode
    emb = self.decode_latent(codes)
  File "Audiocraft\torch_201Venv\lib\site-packages\audiocraft\models\encodec.py", line 259, in decode_latent
    return self.quantizer.decode(codes)
  File "Audiocraft\torch_201Venv\lib\site-packages\audiocraft\quantization\vq.py", line 102, in decode
    quantized = self.vq.decode(codes)
  File "Audiocraft\torch_201Venv\lib\site-packages\audiocraft\quantization\core_vq.py", line 402, in decode
    layer = self.layers[i]
  File "Audiocraft\torch_201Venv\lib\site-packages\torch\nn\modules\container.py", line 293, in __getitem__
    return self._modules[self._get_abs_string_index(idx)]
  File "Audiocraft\torch_201Venv\lib\site-packages\torch\nn\modules\container.py", line 283, in _get_abs_string_index
    raise IndexError(f'index {idx} is out of range')
IndexError: index 4 is out of range
I am coming to the conclusion that multiband diffusion does not support stereo models since the dimensions in the tokens tensor are different between mono and stereo models.
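To make the shape mismatch concrete, here is a minimal sketch in plain Python (no audiocraft or torch needed). It assumes the codec that MBD decodes with has 4 quantizer layers, and that a stereo model emits 8 codebook streams (4 per channel); both numbers are my assumptions, inferred from the traceback, and `first_bad_index` is a hypothetical helper, not an audiocraft function.

```python
def first_bad_index(num_streams, num_layers=4):
    """Walk the codebook streams the way the decode loop in the traceback does.

    num_layers=4 is an assumption about the mono codec, not a value read
    from audiocraft. Returns the first stream index that has no matching
    quantizer layer, or None if every stream has one.
    """
    layers = list(range(num_layers))  # stand-in for self.layers
    for i in range(num_streams):
        try:
            layers[i]  # mirrors `layer = self.layers[i]`
        except IndexError:
            return i
    return None


mono_result = first_bad_index(4)    # mono tokens: 4 streams
stereo_result = first_bad_index(8)  # stereo tokens: 8 streams (assumed)
print(mono_result, stereo_result)
```

Under these assumptions the mono case finds a layer for every stream, while the stereo case fails at index 4, which matches the reported `IndexError: index 4 is out of range`.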
I also have the same situation with "facebook/musicgen-stereo-melody-large" model. If I switch to "facebook/musicgen-melody-large" no error is raised. It would be much better to use the stereo model though.
It seems to be a bit more complicated. There is code out there that runs MBD on stereo output, but reading it suggests that it converts down to mono. I can't try it at the moment, but I'd like to know whether we could simply run MBD on the left and right channels separately, or whether the results become unusable that way.
Ok, this is what enables MBD for stereo in the official code (concatenating the left and right codes, then recombining the outputs into stereo afterwards): https://github.com/facebookresearch/audiocraft/blame/69fea8b290ad1b4b40d28f92d1dfc0ab01dbab85/demos/musicgen_app.py#L144
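A rough sketch of that idea, using nested Python lists in place of tensors so it runs without torch. The names `split_left_right`, `fake_mono_decode` and `decode_stereo` are hypothetical, the decoder is a dummy, and the even/odd interleaving of left and right codebooks is my assumption about the stereo token layout rather than something confirmed in this thread.

```python
def split_left_right(codes):
    """codes[b][k][t] holds 2K interleaved codebooks per batch item.

    Assumption: even codebook indices belong to the left channel and odd
    ones to the right. Returns (left, right), each with K codebooks.
    """
    left = [item[0::2] for item in codes]
    right = [item[1::2] for item in codes]
    return left, right


def fake_mono_decode(codes):
    """Dummy mono decoder: sums the codebooks at each time step.

    Stands in for the real MBD decoding call; output is [batch][1][time].
    """
    return [[[sum(step) for step in zip(*item)]] for item in codes]


def decode_stereo(codes):
    """Decode left and right as one mono batch, then pair them up again."""
    batch = len(codes)
    left, right = split_left_right(codes)
    mono_out = fake_mono_decode(left + right)  # batch of size 2 * batch
    # Item b is the left channel, item batch + b the matching right channel.
    return [[mono_out[b][0], mono_out[batch + b][0]] for b in range(batch)]


# One batch item, 8 codebooks, 3 time steps: left rows are 1s, right rows 2s.
codes = [[[1, 1, 1] if k % 2 == 0 else [2, 2, 2] for k in range(8)]]
stereo = decode_stereo(codes)
print(stereo)
```

With real tensors the same steps would presumably be a batch-wise concatenation before `tokens_to_wav` and a reshape of the mono outputs afterwards; check the linked demo for the exact calls.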
@AK-uni-git I'm including this fix in my repo.
Any updates? I'm facing this issue too.
I'm so sorry, once I updated audiocraft to 1.2.0, this issue was resolved. Thank you.