MiniCPM
MiniCPM4: Ultra-Efficient LLMs on End Devices, achieving a 5×+ speedup on typical end-side chips
### Feature request Hi all, could you provide more details about the QAT process for the MiniCPM-4 Eagle3 series models, along with information about the quantization framework/code repo? It would be...
### Description As shown in the figure, the loss at the first step is over 500, whereas other models typically start at around 3–4. ### Case Explanation _No response_
### Description INFO 08-06 08:34:26 [__init__.py:244] Automatically detected platform cuda. INFO 08-06 08:34:31 [api_server.py:1287] vLLM API server version 0.9.1 INFO 08-06 08:34:32 [cli_args.py:309] non-default args: {'model': '/llm/models/MiniCPM4-8B', 'dtype':...
### Description Hello, after downloading the MiniCPM4-0.5B and MiniCPM4-8B models, I ran inference with the sample code: `from transformers import AutoModelForCausalLM, AutoTokenizer` `import torch` `torch.manual_seed(0)` `path = "/mnt/2/haochen/LLM/MiniCPM/pretrained_models/MiniCPM4-0.5B"` `device = "cuda"` `tokenizer = AutoTokenizer.from_pretrained(path)` `model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map=device, trust_remote_code=True)` `responds, history...
I'm encountering training instability when fine-tuning MiniCPM4-8B using InfLLM v2 with the provided sparse configuration. Training collapses immediately, with a NaN gradient norm at the first optimization step...
### Description ### Case Explanation _No response_
### Feature request BitCPM currently ships only the quantized ternary weights; could the pre-quantization weights be open-sourced as well?
### Is there an existing issue? - [x] I have searched, and there is no existing issue. ### Describe the bug /...