zzlTim

Results: 4 issues by zzlTim

### Describe your problem I use Xinference (or Ollama) to deploy local LLM models. I can download glm4-chat-1m from Xinference, or use the local custom LLM custom-glm4-chat, and I can enter the UI...

question
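The issue above is about serving a local LLM such as glm4-chat-1m through Xinference (or Ollama). As a minimal sketch, a model launched this way is typically queried through an OpenAI-compatible endpoint; the base URL, port, and model name below are assumptions for illustration, not values taken from the issue.

```python
# Minimal sketch: query a locally deployed chat model via an OpenAI-compatible API.
# The endpoint URL/port and model name are assumptions, not from the issue.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9997/v1",  # assumed local Xinference endpoint
    api_key="not-needed-for-local",       # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="glm4-chat-1m",  # assumed model name as launched in Xinference
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```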

### System Info >>> print(torch.__version__) 2.3.1+cu121 >>> print(torch.version.cuda) 12.1 >>> print(torch.backends.cudnn.version()) 8902 >>> print(transformers.__version__) 4.43.2 ### Who can help? @abmfy python api_server.py Setting eos_token is...
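The system info above was collected line by line in a Python REPL. A minimal self-contained script that gathers the same version details (assuming torch and transformers are installed) might look like this:

```python
# Environment check sketch: prints the same version info reported in the issue.
# The example values in the comments are the ones from the report; yours will differ.
import torch
import transformers

print("torch:", torch.__version__)                 # e.g. 2.3.1+cu121
print("CUDA:", torch.version.cuda)                 # e.g. 12.1
print("cuDNN:", torch.backends.cudnn.version())    # e.g. 8902
print("transformers:", transformers.__version__)   # e.g. 4.43.2
print("CUDA available:", torch.cuda.is_available())
```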

### Is there an existing issue for the same bug? - [x] I have checked the existing issues. ### Branch name main ### Commit ID I have pulled the newest...

bug

### System Info CUDA 12.1, Python 3.10.1 ### Running Xinference with Docker? - [ ] docker - [X] pip install /...

gpu