ppppppppig comments

5 comments by ppppppppig

What exactly are the differences between the ChatGLM-6B, GLM-10B and GLM-130B models?

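> Most people can't run 130B; it takes at least two A100s.

Right. What I mainly want to find out is whether the FasterTransformer code can be adapted, starting from [THUDM/FasterTransformer](https://github.com/THUDM/FasterTransformer), so that FasterTransformer can run models such as GLM-10B. So the first step is to look at where GLM-10B and GLM-130B differ, and how large those differences are.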
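As a rough check on the "at least two A100s" remark quoted above, here is a minimal sketch that estimates weight storage from parameter count alone. It assumes the 80 GB A100 variant and counts only the weights (no activations, KV cache or framework overhead), so real requirements will be higher; the function names are made up for this example.

```python
import math

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def min_gpus(num_params: float, dtype: str, gpu_memory_gb: float) -> int:
    """Smallest GPU count whose combined memory holds the weights alone."""
    return math.ceil(weight_memory_gb(num_params, dtype) / gpu_memory_gb)


A100_GB = 80.0  # assumption: 80 GB A100; a 40 GB variant also exists

for name, params in [("6B", 6e9), ("10B", 10e9), ("130B", 130e9)]:
    for dtype in ("fp16", "int8"):
        gb = weight_memory_gb(params, dtype)
        gpus = min_gpus(params, dtype, A100_GB)
        print(f"{name:>4} {dtype}: {gb:6.1f} GB of weights, >= {gpus} GPU(s)")
```

Under these assumptions, two 80 GB cards only cover the 130B weights once they are quantized to INT8; in FP16 the weights alone need four.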

What exactly are the differences between the ChatGLM-6B, GLM-10B and GLM-130B models?

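> 6B, 10B and 130B refer to parameter counts of 6 billion, 10 billion and 130 billion; generally speaking, the more parameters, the larger the "brain capacity". As I understand it, the "chat" prefix indicates whether the model has been trained on Chinese QA and dialogue datasets.

Yes, but the model architectures were also adjusted. I need to work out exactly what was changed before I can make the corresponding changes in FasterTransformer.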
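One way to start pinning down such structural changes is to diff the hyperparameter configs of the two models. The sketch below is illustrative only: `diff_configs` is a hypothetical helper, and the keys and values in the two dictionaries are invented placeholders, not real GLM settings.

```python
from typing import Any, Dict, Tuple


def diff_configs(a: Dict[str, Any], b: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ.

    A key missing from one side is reported with None on that side.
    """
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Placeholder configs: keys and values are invented for illustration only.
config_a = {"num_layers": 2, "hidden_size": 8, "position_encoding": "scheme_a"}
config_b = {
    "num_layers": 4,
    "hidden_size": 8,
    "position_encoding": "scheme_b",
    "extra_norm": True,
}

for key, (value_a, value_b) in diff_configs(config_a, config_b).items():
    print(f"{key}: {value_a!r} -> {value_b!r}")
```

A config diff only surfaces differences that are expressed as hyperparameters; changes made in the modeling code itself still have to be found by reading the source.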

Support for chatglm-6B/GLM models?

Same request

When hot-loading a large model, a segmentation fault will occur.

![image](https://github.com/triton-inference-server/fastertransformer_backend/assets/22321889/30da6bcd-7ef9-4de2-8a21-36fa5051fbc4)

When will FasterTransformer support continuous batching and PagedAttention?

> Based on FasterTransformer, we have implemented an efficient inference engine - [TurboMind](https://github.com/InternLM/lmdeploy#introduction)
>
> * It supports llama and llama-2
> * It modeled the inference of a conversational...