FunASR
Any support for fine-tuning on audio data longer than 1 minute?
What is your question?
When fine-tuning my model, should I prepare audio clips shorter than 15 s? I have many recordings longer than 1 minute. Should I split them manually, or is there a more convenient way? Can I use the VAD model during the fine-tuning process?
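To make the "split them manually" option concrete, here is a minimal sketch of how VAD output could be packed into clips under the 15 s limit. It assumes the speech segments are already available as `(start_ms, end_ms)` pairs from some VAD step. The helper name `pack_segments` and the `max_ms` parameter are hypothetical and not part of FunASR.

```python
from typing import List, Optional, Tuple

Segment = Tuple[int, int]  # (start_ms, end_ms); assumed VAD output format


def pack_segments(segments: List[Segment], max_ms: int = 15_000) -> List[Segment]:
    """Greedily merge consecutive speech segments into chunks of at most max_ms.

    Hypothetical helper, not a FunASR API. Segments must be sorted by start time.
    """
    chunks: List[Segment] = []
    cur_start: Optional[int] = None
    cur_end: Optional[int] = None
    for start, end in segments:
        # A single segment longer than max_ms is cut into fixed-size pieces.
        pieces = [(s, min(s + max_ms, end)) for s in range(start, end, max_ms)]
        for p_start, p_end in pieces:
            if cur_start is not None and p_end - cur_start <= max_ms:
                # The piece still fits: extend the current chunk.
                cur_end = p_end
            else:
                # The piece does not fit: close the current chunk, start a new one.
                if cur_start is not None:
                    chunks.append((cur_start, cur_end))
                cur_start, cur_end = p_start, p_end
    if cur_start is not None:
        chunks.append((cur_start, cur_end))
    return chunks


# Example with made-up segment boundaries (milliseconds).
example = [(0, 4000), (4500, 9000), (9500, 16000), (20000, 52000)]
chunks = pack_segments(example)
```

Each resulting `(start_ms, end_ms)` pair could then be used to cut the waveform and its transcript. Merged chunks keep the short silences between the original segments, and a segment that exceeds `max_ms` is cut at a fixed offset, which may fall mid-word.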
What's your environment?
- OS (Linux):
- FunASR Version (1.0.0):
- ModelScope Version (1.11.0):
- PyTorch Version (2.0.0):
- How you installed funasr (pip):
- Python version:
- GPU (4090)
- CUDA/cuDNN version (cuda11.7):
- Docker version (funasr-runtime-sdk-cpu-0.4.1)