Dinghao Zhou comments

Results: 152 comments of Dinghao Zhou

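On IO (https://github.com/wenet-e2e/wenet/issues/2107): tf.data-based IO performs very well, but it is heavy (and hard) for users, and in the large-model space tf.data is used mostly by TensorFlow/JAX projects such as Google's. Hugging Face datasets is worth a try: it supports tf.data-style features such as map, filter, and shuffle, and many fine-tuning libraries already use it. I will try writing a version with it later to see how it performs.

ref:
- https://huggingface.co/docs/datasets/stream
- https://github.com/huggingface/datasets/issues/5317#issuecomment-1333752503
- https://huggingface.co/blog/audio-datasets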

LoRA support

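This can be done through inheritance, e.g. lora_conformer_encoder and lora_attention, overriding the encoder and the attention. The directory layout would be:
- wenet/finetune/lora/encoder.py
- wenet/finetune/lora/attention.py

These are then initialized in init_model.py. This way the original code is left almost untouched, and since fine-tuning also has LoRA variants, adapters, and other methods, it is easy to extend later.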
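A minimal sketch of the inheritance idea, assuming PyTorch. `Attention` here is a simplified stand-in rather than wenet's actual attention class, and `LoRALinear`, `LoRAAttention`, `r`, and `alpha` are illustrative names, not an existing API.

```python
import math

import torch
from torch import nn


class LoRALinear(nn.Linear):
    """nn.Linear with a frozen base weight plus a trainable low-rank update."""

    def __init__(self, in_features, out_features, r=8, alpha=16, bias=True):
        super().__init__(in_features, out_features, bias=bias)
        # Freeze the base projection; only the LoRA factors are trained.
        self.weight.requires_grad = False
        if self.bias is not None:
            self.bias.requires_grad = False
        self.lora_a = nn.Parameter(torch.empty(r, in_features))
        self.lora_b = nn.Parameter(torch.zeros(out_features, r))  # zero init
        nn.init.kaiming_uniform_(self.lora_a, a=math.sqrt(5))
        self.scaling = alpha / r

    def forward(self, x):
        delta = (x @ self.lora_a.t() @ self.lora_b.t()) * self.scaling
        return super().forward(x) + delta


class Attention(nn.Module):
    """Simplified single-head attention standing in for the original module."""

    def __init__(self, dim):
        super().__init__()
        self.linear_q = nn.Linear(dim, dim)
        self.linear_k = nn.Linear(dim, dim)
        self.linear_v = nn.Linear(dim, dim)
        self.linear_out = nn.Linear(dim, dim)

    def forward(self, x):
        q, k, v = self.linear_q(x), self.linear_k(x), self.linear_v(x)
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
        return self.linear_out(torch.softmax(scores, dim=-1) @ v)


class LoRAAttention(Attention):
    """Subclass that swaps two projections; the original class is untouched."""

    def __init__(self, dim, r=8, alpha=16):
        super().__init__(dim)
        # Pretrained weights would be loaded afterwards with
        # load_state_dict(..., strict=False), since the checkpoint
        # has no lora_a / lora_b entries.
        self.linear_q = LoRALinear(dim, dim, r=r, alpha=alpha)
        self.linear_v = LoRALinear(dim, dim, r=r, alpha=alpha)
```

Because `lora_b` starts at zero, a `LoRALinear` initially computes exactly the same output as its base projection, so fine-tuning starts from the pretrained behaviour.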

LoRA support

Nice, I'll take a look in a couple of days.

� characters appear in CTC decoding results after training Whisper-large-v3

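> > > Thank you! Will multilingual training for Whisper be supported soon? > > > > You are invited to contribute a librispeech+aishell training recipe to wenet and resolve the TODO I left here: https://github.com/wenet-e2e/wenet/blob/main/wenet/whisper/whisper.py#L69-L77 . Chinese is currently hard-coded; once it is made configurable, multilingual training is solved. > > Thanks! Then I'll try it myself, having txt carry the language ID into add_whisper_tokens, or some other approach. > > In the dataset, just add a task ID and a language ID.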
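To make the suggestion concrete, here is a small sketch of deriving the Whisper prompt prefix from per-sample fields instead of hard-coding Chinese. The special-token strings follow Whisper's prompt format; the sample fields `lang` and `task` and the helper `whisper_prefix` are hypothetical and not part of wenet.

```python
from typing import Dict, List

# Special-token strings from Whisper's prompt format.
SOT = "<|startoftranscript|>"
NO_TIMESTAMPS = "<|notimestamps|>"
TASKS = {"transcribe", "translate"}


def whisper_prefix(
    sample: Dict[str, str],
    default_lang: str = "zh",
    default_task: str = "transcribe",
) -> List[str]:
    """Build the prompt prefix tokens from a sample's language and task."""
    # "lang" and "task" are hypothetical per-sample fields; samples without
    # them fall back to the defaults, matching a single-language setup.
    lang = sample.get("lang", default_lang)
    task = sample.get("task", default_task)
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return [SOT, f"<|{lang}|>", f"<|{task}|>", NO_TIMESTAMPS]


if __name__ == "__main__":
    print(whisper_prefix({"txt": "hello world", "lang": "en"}))
    print(whisper_prefix({"txt": "你好"}))
```

With defaults in place, existing single-language data lists would keep working, while mixed librispeech+aishell lists could tag each utterance with its own language.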

[feats] model Parallelism + pipeline Parallelism

TODO
- [ ] whisper extra large 8B TP demo
- [ ] FSDP
- [ ] TP
- [ ] shard and reshard https://pytorch.org/docs/stable/distributed.checkpoint.html
- [ ] async checkpoint

sdpa about

PyTorch natively supports flash attention through SDPA, so we do not need the line 'with torch.backends.cuda.sdp_kernel(enable_math=False):'. PyTorch will automatically select flash attention or memory-efficient attention. You can check https://pytorch.org/docs/stable/generated/torch.nn.functional.scaled_dot_product_attention.html and...
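A minimal sketch of calling SDPA directly, without the sdp_kernel context manager, assuming PyTorch 2.0 or later. The helper names `attention` and `reference_attention` are illustrative; the reference version is the plain softmax(QK^T / sqrt(d))V formula, included only to check the result.

```python
import math

import torch
import torch.nn.functional as F


def attention(q, k, v, causal=False):
    """Call SDPA directly; PyTorch picks a suitable backend for the inputs."""
    return F.scaled_dot_product_attention(q, k, v, is_causal=causal)


def reference_attention(q, k, v):
    """Plain (non-causal) attention, used only as a correctness check."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    return torch.softmax(scores, dim=-1) @ v


if __name__ == "__main__":
    torch.manual_seed(0)
    # Shape: (batch, heads, time, head_dim)
    q, k, v = (torch.randn(2, 4, 8, 16) for _ in range(3))
    out = attention(q, k, v)
    print(torch.allclose(out, reference_attention(q, k, v), atol=1e-5))
```

Which backend is actually used depends on the device, dtype, and mask; on CPU, or with unsupported inputs, PyTorch falls back to the math implementation.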

Chinese open-source large speech model plan

LoRA support

� characters appear in CTC decoding results after training Whisper-large-v3

[feats] model Parallelism + pipeline Parallelism

sdpa about

[feats/llm] LLM integration in the context of large speech models