AudioLDM-training-finetuning
Inference code
Thanks for your great work. How do I run inference on a new text caption after training?
The project is under active development. I'll add that later.
The inference code is ready now. Please check out the main branch.
@haoheliu I got this error. How can I fix it?
python3 audioldm_train/infer.py --config_yaml audioldm_train/config/2023_08_23_reproduce_audioldm/audioldm_original_medium.yaml --list_inference tests/captionlist/inference_test.lst
error:
/home/datnt114/Videos/AudioLDM-training-finetuning/audioldm_train/infer.py:125: SyntaxWarning: "is not" with a literal. Did you mean "!="?
if "reload_from_ckpt" is not None:
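A note on the SyntaxWarning above: the flagged condition compares the string literal "reload_from_ckpt" itself against None, so it is always true no matter what the config contains. The sketch below shows the difference between that check and a lookup of the value under that key. It assumes the config is a plain dict parsed from the YAML file; the dict contents and the function names are hypothetical and not taken from infer.py, and whether this warning is related to the traceback further down is not clear from the log.

```python
# Hypothetical configs standing in for the parsed YAML; the real layout may differ.
config_without_ckpt = {"log_directory": "./log"}
config_with_ckpt = {"reload_from_ckpt": "path/to/checkpoint.ckpt"}


def literal_check(config):
    # Mirrors the flagged pattern. The key name is held in a variable here only
    # to avoid re-triggering the SyntaxWarning; a string is never None, so this
    # returns True regardless of what the config holds.
    key = "reload_from_ckpt"
    return key is not None


def value_check(config):
    # Looks up the value stored under the key; dict.get returns None when the
    # key is missing, so the result depends on the config.
    return config.get("reload_from_ckpt") is not None


print(literal_check(config_without_ckpt))  # True, even though no checkpoint is set
print(value_check(config_without_ckpt))    # False
print(value_check(config_with_ckpt))       # True
```

If the intent at that line was to test whether a checkpoint path is configured, a value lookup of this kind is probably what was meant, but that is an assumption about the author's intent.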
SEED EVERYTHING TO 0
Global seed set to 0
Add-ons: []
Dataset initialize finished
Reload ckpt specified in the config file audioldm_train/config/2023_08_23_reproduce_audioldm/audioldm_original_medium.yaml
LatentDiffusion: Running in eps-prediction mode
/home/datnt114/anaconda3/lib/python3.11/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Some weights of RobertaModel were not initialized from the model checkpoint at roberta-base and are newly initialized: ['roberta.pooler.dense.weight', 'roberta.pooler.dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
File "/home/datnt114/Videos/AudioLDM-training-finetuning/audioldm_train/infer.py", line 128, in