charger issues

Results: 4 issues by charger

[Discussion] Can we align GLM-130B to humans like ChatGPT?

🤔[question] Can I set up a service in determined and expose the port to other clients to call?

### Describe your question
I want to create some public API services in Determined, but I don't know how. 1. How do I map the port to the host...

feature

question

Help needed: when the llama model generates the first token, 3 code blocks are severely slow [2 already solved, only 1 left 🙏🙏🙏]

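**Preface** Thanks to the developers for building an acceleration library that is so easy to understand, easy to deploy, and well equipped 🎉🎉🎉. It is really great and has helped me a lot 😊😊😊!

**Problem description** Many business scenarios only need the model to generate a single token, for example news classification, logical inference, sentiment analysis, relation extraction, language detection... In these scenarios, the llama model implementation in the fastllm library (other models may be affected too) has a serious problem: latency grows linearly as the batch size increases 😰. Other users have reproduced this as well; see [ISSUE 337](https://github.com/ztxz16/fastllm/issues/337).

**Reproduction details**
- Hardware: 4090 GPU, with plenty of RAM and CPU.
- Model: LlamaModel
- Interface: batch_response

```
import pyfastllm
from transformers import AutoTokenizer  # assumed source of AutoTokenizer; the import is not shown in the excerpt

model_path = "tokenizer path"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code = True)
flm_model =...
```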
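
The linear-growth claim in this issue can be checked with a small timing harness. The following is only a sketch under stated assumptions: the excerpt does not show how `batch_response` is called, so `fake_generate` is a hypothetical stand-in that simulates a per-item cost, and `time_batches` and `slowdown_vs_smallest` are illustrative names, not part of fastllm. To measure the real model, one would pass a wrapper around the actual single-token call in place of `fake_generate`.

```python
import statistics
import time
from typing import Callable, Dict, List, Sequence


def time_batches(
    generate: Callable[[List[str]], object],
    prompt: str,
    batch_sizes: Sequence[int],
    repeats: int = 3,
) -> Dict[int, float]:
    """Return the median wall-clock seconds of generate() for each batch size."""
    timings: Dict[int, float] = {}
    for size in batch_sizes:
        batch = [prompt] * size
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            generate(batch)
            samples.append(time.perf_counter() - start)
        timings[size] = statistics.median(samples)
    return timings


def slowdown_vs_smallest(timings: Dict[int, float]) -> Dict[int, float]:
    """Express each latency as a multiple of the smallest batch's latency."""
    base = timings[min(timings)]
    if base <= 0:
        raise ValueError("baseline latency must be positive")
    return {size: seconds / base for size, seconds in timings.items()}


def fake_generate(batch: List[str]) -> List[str]:
    # Hypothetical stand-in for the real single-token call.
    # It sleeps 1 ms per prompt to simulate a cost that grows with batch size.
    time.sleep(0.001 * len(batch))
    return [text[:1] for text in batch]


if __name__ == "__main__":
    timings = time_batches(fake_generate, "hello", [1, 2, 4, 8])
    for size, ratio in slowdown_vs_smallest(timings).items():
        print(f"batch={size:>2}  median={timings[size]:.4f}s  x{ratio:.2f}")
```

If the ratios track the batch sizes closely (roughly 2x, 4x, 8x), the batch is effectively being processed one item at a time; ratios that stay near 1x would indicate that batching is working as intended.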

Add official GPTs

An exceptional project! 🎉 Official GPTs from ChatGPT can be incorporated. Is it possible to categorize them and split them into multiple Markdown files? As a large number of GPTs will be...