
Missing Tokenize Audio Info during Fine-tuning/Training

Open · dingdongwang opened this issue 1 year ago · 1 comment

It seems the audio tokenization step (the one that produces the 'input_ids') is missing from both finetune.py and finetune_low_resource.py in the LTU repo. Where is the detailed code for audio tokenization? I only saw the 'load_audio()' function in inference_batch.py.

dingdongwang · Jan 24 '24

The audio loading/tokenization steps are in https://github.com/YuanGongND/ltu/blob/0fa0923f9c9d04346486a28477ba69b7d957130c/src/ltu/hf-dev/transformers-main/src/transformers/data/data_collator.py#L615-L616 (a similar path applies for LTU-AS).

They cannot be in finetune.py/finetune_low_resource.py because the audios have to be loaded on-the-fly inside the data collator; otherwise there would be an OOM (we cannot keep all the audios in memory at once). A rough sketch of that pattern is below.
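For illustration only, here is a minimal sketch of an on-the-fly collator, not the actual code in the repo: the `load_audio` helper, the 128-bin fbank settings, and the batch field names are assumptions made for this example.

```python
# Sketch of a collator that loads and featurizes audio lazily, so only the
# current batch's audio is ever held in memory. Not the LTU implementation.
import torch
import torchaudio


def load_audio(path, target_len=1024):
    # Load a waveform and convert it to a log-mel filterbank on the fly.
    wav, sr = torchaudio.load(path)
    fbank = torchaudio.compliance.kaldi.fbank(
        wav, sample_frequency=sr, num_mel_bins=128)
    # Pad or truncate along the time axis to a fixed length.
    n = fbank.shape[0]
    if n < target_len:
        fbank = torch.nn.functional.pad(fbank, (0, 0, 0, target_len - n))
    else:
        fbank = fbank[:target_len]
    return fbank


class AudioTextCollator:
    def __init__(self, tokenizer, max_text_len=512):
        self.tokenizer = tokenizer
        self.max_text_len = max_text_len

    def __call__(self, batch):
        # Text is tokenized into input_ids here; audio paths stored in the
        # dataset are only read from disk at collate time, so the full
        # audio corpus never sits in RAM.
        texts = [ex["text"] for ex in batch]
        enc = self.tokenizer(texts, padding=True, truncation=True,
                             max_length=self.max_text_len,
                             return_tensors="pt")
        audio = torch.stack([load_audio(ex["audio_path"]) for ex in batch])
        return {"input_ids": enc["input_ids"],
                "attention_mask": enc["attention_mask"],
                "audio_input": audio}
```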

-Yuan

YuanGongND · Jan 24 '24