
Efficient, Low-Resource, Distributed transformer implementation based on BMTrain

11 ModelCenter issues

Hi developers, there is a misspelling on line 138 of the file cpm1.py at the following link: [cpm1.py](https://github.com/OpenBMB/ModelCenter/blob/bad1193d1871770b29044ab691b0d99c1cea07cf/model_center/model/cpm1.py#L138). 'Ture' should be 'True'. Best

When doing structured pruning, sometimes we need to apply the same mask before or after different modules if they have the same input or output space. Say, if we are...
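One way to read the request above is sketched here in plain Python: a single mask object is shared by two layers that meet in the same hidden space, so pruning a hidden unit once affects both the producer's output and the consumer's input. The names `SharedMask`, `apply_mask`, and `matvec` are hypothetical helpers for illustration only and are not part of ModelCenter or BMTrain.

```python
def matvec(weight, x):
    """Multiply a row-major weight matrix by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def apply_mask(vec, mask):
    """Zero out the entries of vec whose mask value is 0."""
    return [v * m for v, m in zip(vec, mask)]


class SharedMask:
    """Hypothetical mask over one hidden space, shared by several modules."""

    def __init__(self, size):
        self.values = [1.0] * size

    def prune(self, index):
        self.values[index] = 0.0


# Two toy layers that share a 3-dimensional hidden space.
w_up = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # maps 2 -> 3
w_down = [[1.0, 1.0, 1.0]]                   # maps 3 -> 1

mask = SharedMask(3)
mask.prune(1)  # prune hidden unit 1 once; both layers see the change

x = [1.0, 1.0]
hidden = apply_mask(matvec(w_up, x), mask.values)       # mask after w_up
out = matvec(w_down, apply_mask(hidden, mask.values))   # same mask before w_down
```

Because both call sites read `mask.values`, the two layers cannot drift out of sync, which is the property the shared-space case seems to need.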

```python
from model_center.layer import CPM1
CPM1.from_pretrained("cpm1-large")
```

This currently does not work because the function `check_web_and_convert_path` calls `bmt.rank()` or `bmt.print_rank()` to prevent every process from downloading the checkpoint in a multi-GPU scenario....
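A possible workaround pattern is sketched below, assuming the failure comes from querying the rank before the distributed backend has been set up (the truncated report does not confirm this). The helpers `safe_rank` and `fetch_checkpoint` and the callables passed to them are hypothetical; they stand in for the real rank query and download routine and are not ModelCenter or BMTrain APIs.

```python
def safe_rank(get_rank):
    """Return get_rank(), or 0 if the rank query raises.

    Assumption: an uninitialized distributed backend raises an exception,
    in which case we treat the process as a single-process run.
    """
    try:
        return get_rank()
    except Exception:
        return 0


def fetch_checkpoint(name, get_rank, download):
    """Download the checkpoint on rank 0 only; other ranks return None."""
    if safe_rank(get_rank) == 0:
        return download(name)
    return None


# Usage with stand-in callables (all hypothetical):
def uninitialized_rank():
    raise RuntimeError("distributed backend not initialized")


downloaded = []


def fake_download(name):
    downloaded.append(name)
    return "/cache/" + name


single_process_path = fetch_checkpoint("cpm1-large", uninitialized_rank, fake_download)
worker_path = fetch_checkpoint("cpm1-large", lambda: 3, fake_download)
```

In a real multi-GPU run the non-zero ranks would also need to wait until rank 0 finishes downloading; that synchronization step is left out of this sketch.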

For CPM1, CPM2, Bert, GPT2, GPTJ, T5, and the corresponding return datatype

**Describe the bug** When I run the startup code in README.md, I can't get step 4, "Train the model", to run properly. Google Colab reported "TypeError: linear(): argument...

I followed the Quick Start, and in step 3, when I copied the code to Google Colab and tried to run it, I encountered "KeyError: 'label'". I found that there...

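https://github.com/OpenBMB/ModelCenter/blob/main/examples/cpm2/pretrain_cpm2.py#L24 Is the model initialization here executed on every GPU? If the model is very large, it may run out of memory (OOM). Thank you for your answer.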

**Describe the bug** I used a verified LLaMA 7B hg checkpoint and used a single thread bmb to do inference, but the output is just random gibberish. Not sure why?...

**Describe the bug** Building prefix dict from the default dictionary ... Loading model from cache /tmp/jieba.cache Building prefix dict from the default dictionary ... Loading model from cache /tmp/jieba.cache Loading...