Feature Request: Add ROCm Support for AMD GPUs or OpenCL Support for Integrated Graphics Acceleration

Description
First of all, thanks for creating such a useful tool. It's been really helpful for a lot of us!
I wanted to bring up something that would make the software even better for users like me who rely on AMD hardware. Currently, the software supports CUDA for GPU acceleration, which is great for NVIDIA users. However, it would be fantastic if we could also have support for ROCm or OpenCL to take advantage of AMD GPUs or integrated graphics.
What I'm suggesting:

- ROCm Support: Adding support for ROCm, AMD's open-source platform for GPU computing, would allow AMD GPU owners to benefit from GPU acceleration within the software.
- OpenCL for Integrated Graphics: Similar to how some other tools handle it (like UVR), supporting OpenCL would enable the use of integrated graphics for acceleration, which is particularly beneficial for users with AMD APUs.

Why this would be great:

- Improved Performance: Leveraging AMD GPUs or integrated graphics could lead to faster processing times.
- Broader Compatibility: This change would cater to a wider range of hardware setups, making the tool more accessible.

I understand that adding new features takes time and effort, but I believe these additions could significantly enhance the user experience for those using AMD hardware. I hope this feature can be considered in future updates.
This project uses the torch library, so I think you can use ROCm if you install torch with ROCm support. Check here: https://pytorch.org/
Do you think it would be feasible to use OpenCL for acceleration with integrated graphics? Intel GPUs currently do not support ROCm or CUDA. Adding OpenCL support would enable acceleration using both AMD and Intel integrated graphics, as well as Intel dedicated GPUs. This is how UVR handles it.
I think OpenCL is also possible with this pytorch fork: https://github.com/artyom-beilis/pytorch_dlprim
Some changes to this repository's code may be needed wherever the keyword 'cuda' is hard-coded. But I personally can't check it.
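One way to avoid hard-coding 'cuda' throughout the repository would be to centralize device selection in a single helper. This is only a sketch of the idea: `pick_device` is a hypothetical function, and the boolean arguments stand in for runtime probes such as `torch.cuda.is_available()` or `torch_directml.is_available()`, which are not called here.

```python
def pick_device(cuda_ok, directml_ok, ocl_ok, force_cpu=False):
    """Return a device string by priority: CUDA > DirectML > OpenCL > CPU.

    The boolean flags stand in for runtime probes such as
    torch.cuda.is_available() or torch_directml.is_available().
    """
    if force_cpu:
        return "cpu"
    if cuda_ok:
        return "cuda:0"
    if directml_ok:
        # torch-directml registers devices under the 'privateuseone' type
        return "privateuseone:0"
    if ocl_ok:
        # pytorch_dlprim's OpenCL backend uses the 'ocl' device prefix
        return "ocl:0"
    return "cpu"

print(pick_device(cuda_ok=False, directml_ok=True, ocl_ok=False))  # privateuseone:0
```

The rest of the code would then only ever see the returned device string, so adding a new backend means touching one function instead of every `.to('cuda')` call.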
Actually, UVR has always used DirectML rather than OpenCL. It was a naming mistake by Anjok, corrected in the newer Roformer beta patches.
I think it can be used with this repo with minimum changes: https://learn.microsoft.com/en-us/windows/ai/directml/pytorch-windows
Hi, I tried using torch-directml (following the pytorch-windows guide above) and changed a few lines in inference.py like this:
import torch_directml

...

def proc_folder(args):
    ...
    if torch_directml.is_available():
        print('DirectML is available, use --force_cpu to disable it.')
        device = torch_directml.device(args.device_ids[0]) if type(args.device_ids) == list else torch_directml.device(args.device_ids)
        device_name = torch_directml.device_name(args.device_ids[0]) if type(args.device_ids) == list else torch_directml.device_name(args.device_ids)
but I got this error:
DirectML is available, use --force_cpu to disable it.
Using device: AMD Radeon RX 6800
Start from checkpoint: E:\Music-Source-Separation-Training-main\checkpoints\MelBandRoformer.ckpt
Instruments: ['vocals', 'other']
Model load time: 1.70 sec
Total files found: 1. Using sample rate: 44100
Processing track: E:\input\test.flac
Processing audio chunks: 0%| | 0/8037372 [00:00<?, ?it/s]
[F1228 19:15:36.000000000 dml_util.cc:118] Invalid or unsupported data type ComplexFloat.
Process failed with return code 3221226505
I came across this page that says complex types aren't supported in DirectML. Any ideas on how to work around this?
Maybe it's possible, but for this we would need to change the STFT conversion of data inside the MelRoformer model to avoid complex numbers. I'm not sure if it's easy to do.
Maybe this repository would help: https://github.com/DakeQQ/STFT-ISTFT-ONNX, but I currently have no time to think about it.
Maybe you could look at how Anjok handles Roformers using DirectML in UVR's code: https://github.com/Anjok07/ultimatevocalremovergui/tree/v5.6.0_roformer_add
I am trying to modify inference.py to adapt to DirectML for inference, but I do not understand how DirectML works. There may be other areas that need modification and adaptation beyond just inference.py, so I might need some time to research and trial-and-error. I may not be able to produce a decent version. If anyone has ideas, they can also try modifying it; no need to wait for me.
I also noticed that the UVR developers are trying to add DirectML support, and Anjok's approach is worth discussing. As the saying goes, "one generation does the hard work, and the next generation benefits from it." I will try to research this together with my friends. (Apologies, I accidentally sent my previous two comments before finishing them, which may have caused duplicate messages.)
If needed, UVR's beta with Roformer and DirectML code is available here: https://github.com/Anjok07/ultimatevocalremovergui/tree/v5.6.0_roformer_add%2Bdirectml
Do you know whether this branch can work on Linux? Edit: from what people wrote, yes.
@aqst
In the Roformer code you can try to change these lines:
stft_repr = torch.stft(raw_audio, **self.stft_kwargs, window=stft_window, return_complex=True)
stft_repr = torch.view_as_real(stft_repr)
to this:
stft_repr = torch.stft(raw_audio, **self.stft_kwargs, window=stft_window, return_complex=False)
It's equivalent, and you will avoid the complex64 tensor type.
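The equivalence can be illustrated outside torch with plain Python complex numbers: viewing a complex value as real data keeps exactly the same numbers, just laid out as trailing [real, imag] pairs. This is only a sketch; `view_as_real_list` is a made-up helper mimicking `torch.view_as_real`, not a torch API.

```python
def view_as_real_list(spectrum):
    """Mimic torch.view_as_real: turn each complex bin into an [re, im] pair."""
    return [[z.real, z.imag] for z in spectrum]

# A toy "STFT column" of complex bins.
bins = [1 + 2j, -0.5 + 0j, 3 - 4j]

pairs = view_as_real_list(bins)
print(pairs)  # [[1.0, 2.0], [-0.5, 0.0], [3.0, -4.0]]

# Round-trip back to complex (what torch.view_as_complex would do).
restored = [complex(re, im) for re, im in pairs]
assert restored == bins
```

Since no information is lost either way, asking `torch.stft` for the real-pair layout directly lets the backend skip complex tensors entirely.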
I tried changing that in mel_band_roformer.py but unfortunately I got the same error.
I also saw line 487 of bs_roformer.py and I tried something similar to run that part on the CPU:
stft_repr = torch.stft(raw_audio.cpu(), **self.stft_kwargs, window=stft_window.cpu(), return_complex=True)
stft_repr = torch.view_as_real(stft_repr).to(device)
but then I got this error:
File "E:\Music-Source-Separation-Training-main\models\bs_roformer\mel_band_roformer.py", line 531, in forward
x = stft_repr[batch_arange, self.freq_indices]
~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [4, 1], [3958]
I'm not sure what that means, but maybe there's some kind of issue in DirectML.
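For context, shapes [4, 1] and [3958] are normally broadcast-compatible under standard NumPy/PyTorch rules (they broadcast to [4, 3958]), which suggests the failure is in the DirectML backend's indexing path rather than in the shapes themselves. A minimal sketch of the rule, with `broadcast_shape` as a hypothetical pure-Python helper:

```python
def broadcast_shape(a, b):
    """Compute the broadcast shape of two shapes per NumPy/PyTorch rules."""
    # Right-align the shorter shape by padding with 1s on the left.
    n = max(len(a), len(b))
    a = (1,) * (n - len(a)) + tuple(a)
    b = (1,) * (n - len(b)) + tuple(b)
    out = []
    for da, db in zip(a, b):
        # Dimensions are compatible if equal, or if either is 1.
        if da != db and 1 not in (da, db):
            raise ValueError(f"incompatible dims {da} and {db}")
        out.append(max(da, db))
    return tuple(out)

# The two index-tensor shapes from the traceback broadcast fine on CPU/CUDA:
print(broadcast_shape((4, 1), (3958,)))  # (4, 3958)
```

So the same `stft_repr[batch_arange, self.freq_indices]` expression works on other backends, and the error looks like a DirectML operator limitation.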
Can confirm this issue still exists now. I am using torch-directml with torch 2.4.1 and the latest Mel-Band Roformer. I used this to initialize DirectML:
dml_device_name = "privateuseone:0"

# Initialize determined_device, defaulting to CPU
determined_device = torch.device("cpu")
print("Attempting to set up processing device...")

try:
    print(f"Testing DirectML device: {dml_device_name}...")
    # Test if DirectML is available and functional with a small tensor
    _ = torch.tensor([1.0]).to(dml_device_name)
    # If the above line didn't raise an error, DirectML is working
    determined_device = torch.device(dml_device_name)
    print(f"DirectML device {determined_device} is available. Attempting to move model to this device.")
    model = model.to(determined_device)  # Move the model

    # Correctly check the device of the model's parameters
    if list(model.parameters()):  # Check if model has parameters
        actual_model_device = next(model.parameters()).device
        print(f"Model parameters are now on device: {actual_model_device}")
        if actual_model_device.type != 'privateuseone':
            # This case should ideally not happen if .to(determined_device) was successful
            print(f"Warning: Model parameters are on {actual_model_device} despite targeting {determined_device}. Check model's .to() implementation.")
            # Fallback if critical
            raise RuntimeError(f"Model moved to {actual_model_device} instead of {determined_device}")
    else:
        print("Model has no parameters. The .to(device) call was made.")

    print(f"Successfully set target device to DirectML: {determined_device}")
except Exception as e:
    print(f"An error occurred during DirectML setup or model transfer. Error: {e}")
    # If the error was the AttributeError, DirectML init was likely fine, but the check was wrong.
    # If it was another error (like from the tensor test), DirectML itself might have an issue.
    import traceback
    traceback.print_exc()  # Print the full traceback to see the exact error
    print("Falling back to CPU.")
    determined_device = torch.device("cpu")  # Ensure determined_device is CPU
    if 'model' in locals() and model is not None:  # Check if model is defined
        model = model.to(determined_device)  # Move model to CPU
    else:
        print("Model was not defined before attempting to move to CPU (should not happen here).")

print(f"--- Proceeding with inference on device: {determined_device} ---")
# Ensure you use 'determined_device' in run_folder and subsequent tensor operations
run_folder(model, args, config, determined_device, verbose=False)
Did anybody manage to make Melband Roformer work with DirectML?
Just to let you know, someone added partial MPS support to BS/Mel-Roformer. It's already 2x faster than CPU. Maybe it will be useful: https://github.com/axeldelafosse/BS-RoFormer/
