ltu

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

Results 43 ltu issues

Is there a length limit for input audio in ltu-as? And are there any plans to implement streaming audio input?

I am trying to use a similar framework to align LLaMA with other time-series data, but my fine-tuned model rarely outputs formatted answers for multiple-choice questions, so it's very difficult...

question

```python
# LLaVA
if model_args.freeze_backbone:
    model.model.requires_grad_(False)
```
In the LTU code, there is only a comment noting that the LLM is already frozen.
```python
# for audio params, lora always trainable, llama always frozen
for name,...
```

question