bark
Missing Audio
I am trying to set up Bark locally. I am at the point where audio files are being generated, but they are not playable, and my media player says the files are corrupted. I also seem to be having a problem where the system isn't recognizing my GPU. With this code, 'audio_array = audio_array.cpu().numpy().squeeze()', it will create an empty audio file, but if I do this, 'audio_array = audio_array.gpu().numpy().squeeze()', it will give me this error:
'File "C:\Users\fmsal\OneDrive\Documents\Bark\Audio_generation.py", line 13, in
attention_mask to obtain reliable results. Setting pad_token_id to eos_token_id:10000 for open-end generation.
Traceback (most recent call last):
File "C:\Users\fmsal\OneDrive\Documents\Bark\Audio_generation.py", line 18, in
I deleted everything and started over. I am now able to get audio to play, but I still seem to be having the issue where my GPU isn't being recognized. I am currently trying to resolve this.
Try audio_array.to("cuda")....
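For context on that suggestion: PyTorch tensors have .to(device) and .cpu(), but there is no .gpu() method, which is why the second snippet errors out. A minimal sketch of the device round-trip, using a random tensor as a stand-in for the model.generate() output (the shape here is just illustrative):

```python
import torch

# Pick the device once, then reuse it everywhere.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in for model.generate() output; 24000 samples is arbitrary here.
audio_array = torch.rand(1, 24000)

# Move to the GPU when one is present; .to() is the supported API,
# there is no .gpu() method on tensors.
audio_array = audio_array.to(device)

# Always bring the tensor back with .cpu() before calling .numpy().
audio_array = audio_array.cpu().numpy().squeeze()
```

The same pattern applies to the model itself (model.to(device)) and to every input tensor, so the model and its inputs always live on the same device.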
try my code:
from transformers import AutoProcessor, BarkModel
from datetime import datetime
import scipy.io.wavfile
import os
import torch
import time

# Start the clock
start_time = time.time()

# Check if a GPU is available
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("Using GPU:", torch.cuda.get_device_name())
else:
    device = torch.device("cpu")
    print("Using CPU")

# Settings (if you need them)
os.environ["SUNO_OFFLOAD_CPU"] = "True"
os.environ["SUNO_USE_SMALL_MODELS"] = "True"

# Load processor and model
processor = AutoProcessor.from_pretrained("suno/bark")
model = BarkModel.from_pretrained("suno/bark")

# Move model to the device (CPU or GPU)
model.to(device)

voice_preset = "v2/en_speaker_6"

# Process text input
inputs = processor("The James Webb Space Telescope has captured stunning images of the Whirlpool spiral galaxy, located 27 million light-years away from Earth.", voice_preset=voice_preset)

# Move inputs to the same device as the model
for key in inputs.keys():
    inputs[key] = inputs[key].to(device)

# Generate audio
audio_array = model.generate(**inputs)
audio_array = audio_array.cpu().numpy().squeeze()

# Get the current date and time
now = datetime.now()

# Format the date and time as a string: YYYYMMDD_HHMMSS
timestamp_str = now.strftime("%Y%m%d_%H%M%S")

# Use the timestamp as part of the filename
filename = f"bark_out_{timestamp_str}.wav"

# Save as a WAV file (note: import scipy.io.wavfile, not just scipy)
sample_rate = model.generation_config.sample_rate
scipy.io.wavfile.write(filename, rate=sample_rate, data=audio_array)

# Stop the clock and print the elapsed time
end_time = time.time()
elapsed_time = end_time - start_time
print(f"The process took {elapsed_time:.2f} seconds.")
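Since the original symptom was "corrupted" WAV files, a quick sanity check is to read the file back with scipy and confirm the sample rate and length survive the round trip. A minimal standalone sketch, using a generated sine wave in place of Bark output and a hypothetical filename "check.wav":

```python
import numpy as np
import scipy.io.wavfile

# Stand-in for Bark output: one second of a 440 Hz sine at a 24 kHz rate
sample_rate = 24000
t = np.linspace(0, 1, sample_rate, endpoint=False)
audio_array = np.sin(2 * np.pi * 440 * t).astype(np.float32)

# Write the WAV file, then read it straight back
scipy.io.wavfile.write("check.wav", rate=sample_rate, data=audio_array)
rate, data = scipy.io.wavfile.read("check.wav")

# A playable file round-trips with the same rate and sample count
print(rate, data.shape)
```

If the generation step crashed partway through (as in the traceback above), the file on disk can exist but be empty, which most media players report as corruption; this check makes that failure visible immediately.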