Daniel
But q4_1 works well.
I am trying to get a quantized model from this model: multi-qa-MiniLM-L6-cos-v1 on Hugging Face.
I modified the code to adapt it to BertCode with the latest ggml, and it works fine. Maybe it can be solved by upgrading GGML?
> nan results are typically a sign of some float accuracy weirdness. Do you have a very small model? I think the quantization is less accurate the smaller your model...
I opened a pull request and the repo owner has merged it. `git pull` to get the new version; it works on Windows.
Are you working on CodeBERT? My email: [email protected]
For the BERT model, the overhead is calculated as: `model_mem_req += (5 + 16 * n_layer) * 256; // object overhead`. Can anyone explain the meaning? Is 5 the number of extra tensors, ...?
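A minimal sketch of how that count likely breaks down for a BERT encoder (this assumes the usual bert.cpp-style weight layout; the per-layer breakdown below is my reading, not confirmed by the source):

```c
// Hypothetical breakdown of the (5 + 16 * n_layer) tensor count for a
// BERT encoder; the groupings are assumptions based on the standard
// BERT architecture.
#include <stdio.h>

int main(void) {
    const int n_layer = 6;        // e.g. MiniLM-L6 has 6 encoder layers

    // 5 tensors outside the layers: word, position, and token-type
    // embeddings, plus the embedding LayerNorm weight and bias.
    const int n_global = 5;

    // 16 tensors per layer: attention q/k/v/o weights and biases (8),
    // two LayerNorms with weight and bias each (4), and the two
    // feed-forward weights and biases (4).
    const int n_per_layer = 16;

    // Each tensor allocated in a ggml context carries fixed bookkeeping
    // (struct ggml_tensor plus the ggml object header); 256 bytes was a
    // safe upper bound for that in older ggml versions.
    const size_t overhead_per_tensor = 256;

    size_t model_mem_req =
        (size_t)(n_global + n_per_layer * n_layer) * overhead_per_tensor;
    printf("object overhead: %zu bytes\n", model_mem_req);
    return 0;
}
```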
Thanks for your answer :)
I have tested the latest ggml; the 256 should be changed to 512. I don't understand why :(
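Most likely because `struct ggml_tensor` grew in newer ggml (it gained more source slots and op params over time), so the real per-tensor overhead now exceeds 256 bytes. Rather than hard-coding the constant, you could derive it from ggml itself; a sketch, assuming a recent ggml that provides `ggml_tensor_overhead()` (older versions may not have it):

```c
// Sketch: compute the per-tensor overhead from ggml instead of
// hard-coding 256/512. ggml_tensor_overhead() in recent ggml returns
// the object header size plus sizeof(struct ggml_tensor).
#include <stdio.h>
#include "ggml.h"

int main(void) {
    const int n_layer = 6;

    // The tensor struct gained fields over time, which is presumably
    // why 256 bytes stopped being enough and 512 is now needed.
    size_t per_tensor = ggml_tensor_overhead();
    printf("per-tensor overhead: %zu bytes\n", per_tensor);

    size_t model_mem_req = (5 + 16 * (size_t) n_layer) * per_tensor;
    printf("model object overhead: %zu bytes\n", model_mem_req);
    return 0;
}
```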
The output of the converted mobilenetV3 is also wrong.