Thibault Castells comments

Results: 9 comments of Thibault Castells

LCM scheduler

Oh that's great, I did not find any implementation when I looked for it. Thank you!

VAE training sample script

Hello, this project is really cool, thank you! I noticed a potential mistake in the code: the KL loss is applied to the output, but I think it should be...

VAE training sample script

@Pie31415 thank you very much! I will let you know if I have other improvement suggestions.

VAE training sample script

By the way: > However, using it gives me bad results. I think this is because it changes the latent space organization too much (in the end I use it...

No, I meant the coefficient that multiplies the loss term (`kl_scale`): `loss = mse_loss + args.lpips_scale * lpips_loss + args.kl_scale * kl_loss` Note that by default `kl_scale` and `lpips_scale` are...
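
To make the role of the coefficients concrete, here is a minimal sketch of the weighted sum quoted above. The `LossScales` container, the `total_loss` helper, and the numeric values are hypothetical and chosen only for illustration; they are not the script's actual defaults, and the three loss terms are plain floats standing in for tensors.

```python
from dataclasses import dataclass


@dataclass
class LossScales:
    # Hypothetical stand-in for the script's parsed arguments.
    # These values are illustrative, NOT the real defaults.
    lpips_scale: float = 0.5
    kl_scale: float = 0.25


def total_loss(mse_loss, lpips_loss, kl_loss, args):
    # Same expression as in the comment: each auxiliary term is
    # multiplied by its coefficient before being added to the MSE term.
    return mse_loss + args.lpips_scale * lpips_loss + args.kl_scale * kl_loss


# With both coefficients non-zero, all three terms contribute.
full = total_loss(1.0, 2.0, 4.0, LossScales())

# Setting kl_scale to 0 removes the KL term from the objective entirely,
# no matter how large kl_loss itself is.
no_kl = total_loss(1.0, 2.0, 4.0, LossScales(kl_scale=0.0))

print(full, no_kl)
```

The point of the sketch is that a coefficient of zero silently disables the corresponding term, so the scales are worth checking before comparing training runs.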

VAE training sample script

@Pie31415 I am not too surprised that this issue happens when using only the MSE loss, because this is a very different training configuration from the one in the paper, so we...

CLAP checkpoint

@haoheliu could you let me know if releasing this checkpoint would be possible? Thank you!

CLAP checkpoint

Thank you for the answer! Just to check: do you have a checkpoint that is compatible with the HuggingFace AudioLDM pipeline, which uses the `transformers.ClapAudioModelWithProjection` class? I tried to load...

Improvement suggestions

Thank you for this quick answer! > How long it takes to generate the FAD depends on, e.g. CPU / GPU, length & amount of audio. IMO there isn't much...