Erwin comments

15 comments by Erwin

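Training on a custom dataset gives loss_bbox: 0.00 and loss_dfl: 0.00. I tried the fixes suggested online (adding the classes, modifying the base config), but none of them solved it.

See #337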

how to convert onnx

ONNX & TensorRT: https://github.com/wingdzero/GroundingDINO-TensorRT-and-ONNX-Inference?login=from_csdn

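Support for the whisper-asr large-size model and for languages other than Chinese/English

> > * examples/whisper/export_onnx.md says large-size is not supported. Is it that the chip itself does not support networks of certain dimensions, or that the conversion script does not support large?
> > * examples/whisper/python/whisper.py requires task to be set to en or zh, but Whisper can identify the language itself, so it should be possible to feed in audio directly, detect its language automatically, and output a transcript in that language. Is this unsupported by the chip itself, or by the script, so that developers have to adapt the script themselves? Has anyone tried this?
>
> 1. large is supported in theory. Because the model is so big, whether it actually runs depends on the board's hardware. If you are interested, you can export the large model yourself and try it, but remember to change the ENCODER_OUTPUT_SIZE parameter in the C demo.
> 2. Detecting the language first and then transcribing can be supported, but you need to modify the inference logic yourself. Our goal is to provide a speech recognition demo, so we did not implement other tasks.

Hello, do you know how to extract the Japanese tokenizer? Every tokenizer I try to save comes out in English. ``` tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-small", language="japanese", task="transcribe") tokenizer.set_prefix_tokens(language="japanese", task="transcribe", predict_timestamps=False)...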
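
The "detect the language first, then transcribe" flow mentioned in the quoted reply could look roughly like the sketch below. It assumes you can read the decoder logits for the first step after the start-of-transcript token, and that you know the ids of the language tokens (such as `<|en|>`, `<|zh|>`, `<|ja|>`) from your tokenizer. The function name `detect_language` and the toy values are hypothetical and are not taken from the rknn demo.

```python
from typing import Dict, Sequence


def detect_language(first_step_logits: Sequence[float],
                    language_token_ids: Dict[str, int]) -> str:
    """Pick the language whose token scores highest at the first decoder step.

    Only the language tokens are compared; every other vocabulary entry is ignored.
    """
    if not language_token_ids:
        raise ValueError("language_token_ids must not be empty")
    return max(language_token_ids,
               key=lambda code: first_step_logits[language_token_ids[code]])


# Hypothetical toy example: a 6-entry "vocabulary" with three language tokens.
# Real ids must come from the tokenizer that matches your exported model.
toy_language_ids = {"en": 1, "zh": 2, "ja": 4}
toy_logits = [9.0, 0.3, 1.1, 7.5, 2.4, 0.0]
detected = detect_language(toy_logits, toy_language_ids)
print(detected)
```

The detected code would then replace the fixed en/zh task setting when building the decoder prompt for the actual transcription pass.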

YOLO11-pose detection, low point confidence at the image edge

Distortion at the image edge is severe and the target becomes smaller, so detection is harder there. A large amount of real data that appears on distorted edge targets needs...

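About support for the zipformer speech model

> > I tried the kws-zipformer-wenetspeech-3.3M-2024-01-01 model
>
> You need rknntoolkit 2.2 to convert it

[@chris1992212](https://github.com/chris1992212) Kuang, do any of your open-source lightweight kws-zipformer models support Japanese? All the ones I found are English + Chinese.

zipformer ONNX to RKNN conversion fails

Is there really nobody at Rockchip who can answer this? RK's after-sales support is terrible: problems are reported and never resolved. Who is going to buy your chips in the future?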

make it compatible with torch 2.6.0

torch 2.6.0 + CUDA 12.4. Thanks a lot.

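Do the pretrained models provided in the Whisper example not support Chinese?

> The Whisper Chinese recognition task is now supported

Hello, I downloaded the official tiny model and want to use it for Japanese ASR. How do I generate the vocab.txt file? If I replace this file, will Japanese recognition work?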
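
One possible way to produce such a file is sketched below. It assumes you already have a token-to-id mapping (for example from `tokenizer.get_vocab()` in Hugging Face transformers) and that the demo expects one `<id> <token>` line per entry. That file layout is an assumption, so compare it with the vocab files shipped with the example first. The multilingual Whisper checkpoints share a single vocabulary across languages, and the language is selected with a prefix token such as `<|ja|>`, so a saved tokenizer that does not look Japanese is not necessarily wrong. The helper name `write_vocab_txt` and the toy mapping are hypothetical.

```python
import os
import tempfile
from typing import Dict


def write_vocab_txt(token_to_id: Dict[str, int], path: str) -> int:
    """Write one "<id> <token>" line per entry, ordered by id.

    Returns the number of lines written. The line format is an assumption.
    """
    rows = sorted(token_to_id.items(), key=lambda item: item[1])
    with open(path, "w", encoding="utf-8") as handle:
        for token, token_id in rows:
            handle.write(f"{token_id} {token}\n")
    return len(rows)


# Hypothetical toy mapping standing in for a real tokenizer vocabulary.
toy_vocab = {"<|ja|>": 2, "hello": 0, "こんにちは": 1}
out_path = os.path.join(tempfile.mkdtemp(), "vocab_ja.txt")
written = write_vocab_txt(toy_vocab, out_path)
with open(out_path, encoding="utf-8") as handle:
    lines = handle.read().splitlines()
print(lines)
```

Real Whisper tokens are byte-level pieces rather than whole words, so the demo may expect them in an encoded form. Check how the existing vocab files store tokens before relying on this layout.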
