audiolm-pytorch
Trying to overfit SoundStream
I am trying to overfit SoundStream on 10 samples from the Common Voice dataset as a sanity check before training on the whole dataset. But the output is just noise, and I do not know what the problem is. I also tried overfitting on a single sample, with the same result.
This is the code I use:
from audiolm_pytorch import SoundStream, SoundStreamTrainer

soundstream = SoundStream(
    codebook_size = 1024,
    rq_num_quantizers = 8,
    rq_groups = 2,
    use_lookup_free_quantizer = True,
    use_finite_scalar_quantizer = False,
    attn_window_size = 128,
    attn_depth = 2
)

trainer = SoundStreamTrainer(
    soundstream,
    folder = "/data/data/",          # folder holding the 10 audio samples
    batch_size = 1,
    grad_accum_every = 2,
    data_max_length_seconds = 2,
    save_results_every = 1,
    save_model_every = 4,
    num_train_steps = 1_000
).cuda()

trainer.train()
The losses for the first 500+ training steps are in this text file:
audiolm_pytorch_demo.txt
I am using the latest version, 2.0.7, with Python 3.10.4.
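To put a number on "just noise", one option is the signal-to-noise ratio between an input clip and its reconstruction. The sketch below is a hypothetical helper and not part of audiolm-pytorch; the name `snr_db` is an assumption, and it expects two equal-length sequences of samples as plain Python floats.

```python
import math

def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB of `estimate` relative to `reference`.

    Hypothetical helper (not an audiolm-pytorch API). Both arguments are
    equal-length sequences of waveform samples.
    """
    if len(reference) != len(estimate):
        raise ValueError("waveforms must have the same length")
    # Energy of the clean signal and of the reconstruction error.
    signal_power = sum(x * x for x in reference)
    noise_power = sum((x - y) ** 2 for x, y in zip(reference, estimate))
    if noise_power == 0:
        return math.inf   # perfect reconstruction
    if signal_power == 0:
        return -math.inf  # silent reference, non-silent estimate
    return 10 * math.log10(signal_power / noise_power)
```

With tensors, the samples could be passed as `wave.flatten().tolist()` after trimming both clips to the same length. A model that has really overfit a clip should give a clearly positive value; a value near 0 dB or below would suggest the reconstruction carries little of the original signal.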