JaonLiu
How can this be converted into a model that TF2 can use directly? Any help would be appreciated~
@fxmarty the log:

```
Validating ONNX model /share_model_zoo/LLM/openai/onnx_whisper-large-v3/encoder_model.onnx...
	-[✓] ONNX model output names match reference model (last_hidden_state)
	- Validating ONNX Model output "last_hidden_state":
		-[✓] (2, 1500, 1280) matches (2, 1500,...
```
> @mmingo848 You can use:
>
> ```shell
> optimum-cli export onnx --help
> optimum-cli export onnx --model openai/whisper-large-v3 whisper_onnx
> ```
>
> and then use [ORTModelForSpeechSeq2Seq](https://huggingface.co/docs/optimum/main/en/onnxruntime/package_reference/modeling_ort#optimum.onnxruntime.ORTModelForSpeechSeq2Seq).
>
> ...
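For anyone landing here, a minimal sketch of loading the exported directory with `ORTModelForSpeechSeq2Seq` and transcribing with it. The output directory name and the dummy audio are assumptions, not taken from the export above:

```python
# Minimal sketch: load the ONNX export produced by optimum-cli and transcribe.
# `whisper_onnx` is the export directory from the command above (an assumption);
# if the processor files were not saved there, load them from openai/whisper-large-v3.
import numpy as np
from transformers import AutoProcessor
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq

model_dir = "whisper_onnx"
processor = AutoProcessor.from_pretrained(model_dir)
model = ORTModelForSpeechSeq2Seq.from_pretrained(model_dir)

# One second of silence at 16 kHz as a stand-in for real audio.
audio = np.zeros(16000, dtype=np.float32)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

generated_ids = model.generate(inputs["input_features"])
print(processor.batch_decode(generated_ids, skip_special_tokens=True))
```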
> @MrRace You need `--task automatic-speech-recognition-with-past`. There should be a log during the export about it (that specifying `--task automatic-speech-recognition` disables the KV cache).

@fxmarty Thank you very much for your...
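For completeness, a sketch of re-running the export with that task, invoking the same CLI from Python via subprocess (the model name comes from the command quoted earlier; the output directory name is my own choice):

```python
# Sketch: re-export whisper-large-v3 with the KV cache kept, by passing the
# automatic-speech-recognition-with-past task to optimum-cli.
import subprocess

subprocess.run(
    [
        "optimum-cli", "export", "onnx",
        "--model", "openai/whisper-large-v3",
        "--task", "automatic-speech-recognition-with-past",
        "whisper_onnx",  # output directory (an assumption)
    ],
    check=True,
)
```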
> Yes, this was fixed in #1780, which is not yet in a release.
>
> Please downgrade to onnx 1.15 or use optimum from source.

@fxmarty Thanks a lot,...
> Hi @MrRace, if you don't want to reimplement the inference code from scratch, I advise you to use https://huggingface.co/docs/optimum/main/en/onnxruntime/package_reference/modeling_ort#optimum.onnxruntime.ORTModelForSpeechSeq2Seq. An example is available there. By default, only `encoder_model.onnx` and...
> mlc_config.json

@Mawriyo Thanks for your reply. Here is the content of mlc_config.json:

```json
{
  "model_type": "llama",
  "quantization": "q4f16_1",
  "model_config": {
    "hidden_size": 4096,
    "intermediate_size": 11008,
    "num_attention_heads": 32,
    "num_hidden_layers": 32,
    "rms_norm_eps":...
```
@Hzfengsy Which version of Qwen1.5 are you specifically using? Qwen1.5-0.5B-Chat? Or Qwen1.5-1.8B-Chat? Or Qwen1.5-4B-Chat?
same request here!