iv2985

5 comments by iv2985

I had to explicitly tell the model to use the GPU with `device_map` and `.to(model.device)` for the processor

```
model_name = 'suno/bark'
wav_processor = AutoProcessor.from_pretrained(model_name)
wav_model = BarkModel.from_pretrained(model_name, device_map='cuda', torch_dtype=torch.float32)
...
```
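For anyone hitting the same thing, here is a rough sketch of what the full flow might look like. The `from_pretrained` calls are the ones quoted above. Everything else is my assumption and not the original poster's code: the helper names `move_to_device` and `synthesize`, the use of `generate`, and reading the sample rate from `generation_config`. It also assumes `torch`, `transformers`, and `accelerate` (needed for `device_map`) are installed and a CUDA GPU is available.

```python
def move_to_device(inputs, device):
    """Return a new dict with every value that has a .to() method moved to `device`.

    Hypothetical helper: values without .to() (plain strings, ints) are passed through.
    """
    return {k: (v.to(device) if hasattr(v, "to") else v) for k, v in inputs.items()}


def synthesize(text, model_name="suno/bark"):
    """Generate speech for `text` with Bark on the GPU (sketch, downloads the model)."""
    # Heavy imports live here so move_to_device can be used without them.
    import torch
    from transformers import AutoProcessor, BarkModel

    wav_processor = AutoProcessor.from_pretrained(model_name)
    # device_map='cuda' places the weights on the GPU at load time.
    wav_model = BarkModel.from_pretrained(
        model_name, device_map="cuda", torch_dtype=torch.float32
    )

    # The processor returns CPU tensors, so move them to wherever the model lives.
    inputs = move_to_device(wav_processor(text, return_tensors="pt"), wav_model.device)

    with torch.no_grad():
        audio = wav_model.generate(**inputs)

    # Assumption: the sample rate is exposed on the generation config.
    sample_rate = wav_model.generation_config.sample_rate
    return audio.cpu().numpy().squeeze(), sample_rate
```

Calling `synthesize("Hello there")` should return a NumPy waveform and its sample rate, which can then be written out with something like `scipy.io.wavfile.write`.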

To anyone curious, they are trying to sell a service API for $0.50/min...

If one takes the G_0.pth (the first checkpoint) during training and uses it for inference, it speaks English with a young female voice that doesn't match the audio clips being...

If you're trying to split up and transcribe audio, use Whisper. If you want to split up text, use NLTK.
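A minimal sketch of the text side, assuming `nltk` is installed and can download its sentence tokenizer data. `nltk.sent_tokenize` is the real NLTK call. The helpers `pack_sentences` and `split_text` and the 200-character limit are made up for illustration, and the audio side with Whisper isn't shown here.

```python
def pack_sentences(sentences, max_chars=200):
    """Greedily group sentences into chunks of at most `max_chars` characters.

    Hypothetical helper: a single sentence longer than `max_chars` is kept whole.
    """
    chunks, current = [], ""
    for s in sentences:
        s = s.strip()
        if not s:
            continue
        candidate = f"{current} {s}" if current else s
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = s
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def split_text(text, max_chars=200):
    """Split `text` into sentence-aligned chunks using NLTK's sentence tokenizer."""
    import nltk

    # Tokenizer data: older NLTK releases use "punkt", newer ones "punkt_tab".
    nltk.download("punkt", quiet=True)
    nltk.download("punkt_tab", quiet=True)
    return pack_sentences(nltk.sent_tokenize(text), max_chars)
```

Splitting on sentence boundaries instead of a fixed character count keeps each chunk readable on its own, which tends to matter if the chunks are fed to a TTS model one at a time.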