
MiniCPM4: Ultra-Efficient LLMs on End Devices, achieving 5+ speedup on typical end-side chips

152 MiniCPM issues, sorted by most recently updated

### Is there an existing issue? - [X] I have searched, and there is no existing issue. ### Describe the bug /...

bug
triage

### Description error loading model architecture: unknown model architecture: 'minicpm3' time=2024-10-12T20:12:01.292+08:00 level=ERROR source=sched.go:456 msg="error loading llama server" error="llama runner process has terminated: this model is not supported by...

badcase

### Feature request Most people fine-tune the 2B model on a single GPU, so the default argument should be --include localhost:0.

feature

### Feature request When running the 4B model on CPU only, roughly how many CPU cores and how much memory are required?

feature
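
For the CPU question above, a rough weights-only estimate can be computed from the parameter count and precision; this is a back-of-the-envelope sketch (the 4e9 parameter figure is an assumption, and real usage adds KV cache, activations, and runtime overhead):

```python
# Approximate RAM needed just to hold the weights of a ~4B-parameter model.
# Actual memory usage during inference is higher.
PARAMS = 4e9  # assumed parameter count for a "4B" model
BYTES_PER_PARAM = {"fp32": 4.0, "fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}

for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype:>9}: ~{PARAMS * nbytes / 2**30:.1f} GiB")
    # e.g. fp16/bf16 -> ~7.5 GiB, int4 -> ~1.9 GiB (weights only)
```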

### Feature request Are there plans to open-source the code for the quantized MiniCPM models?

feature

### Is there an existing issue? - [x] I have searched, and there is no existing issue. ### Describe the bug /...

bug
triage

Thank you for releasing the excellent model and work. The paper appears to state that the 8B model was pre-trained natively with a 32K sequence length. I would like to...

In the open-sourced code, where can I find the PyTorch implementation of sparse attention?

### Description Hi! I noticed the MiniCPM4-0.5B documentation says sparse attention is not supported, but the modeling code contains it, so I enabled it and got this error: ``` topk_idx[topk_idx >= q_idx[None, :, None]] = -1 RuntimeError: The size of tensor a (355) must match the size of tensor b (711)...

badcase
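
The error in the entry above is a plain PyTorch broadcasting failure. A minimal sketch reproducing it is below; the shapes are hypothetical, chosen only to match the numbers in the error message (355 vs. 711), not MiniCPM4's actual tensors:

```python
import torch

# Hypothetical shapes: 355 query blocks vs. 711 key/index positions.
topk_idx = torch.randint(0, 711, (1, 355, 32))  # (batch, num_query_blocks, top_k)
q_idx = torch.arange(711)                       # (num_positions,)

try:
    # Broadcasting (1, 355, 32) against (1, 711, 1) fails at dimension 1.
    topk_idx[topk_idx >= q_idx[None, :, None]] = -1
except RuntimeError as e:
    print(e)  # The size of tensor a (355) must match the size of tensor b (711) ...
```

The mismatch suggests the query-position index and the top-k index tensor were built for different sequence lengths, which is consistent with the report that sparse attention is not expected to work on this model.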