Jiang Long issues

Results: 2 issues by Jiang Long

How to limit token usage

![image](https://github.com/langchain-ai/opengpts/assets/22496486/3229ad64-da66-421f-b07b-1ce4eaa6c2be) How can I limit gpt3-turbo token usage?

Does LLM Cache support V100 hardware?

I am using a V100 GPU to test deploying the Distributed KV Cache example. Unfortunately, it failed because it requires the flash attention backend. ![Image](https://github.com/user-attachments/assets/997a8957-dd17-46fc-95b8-f4bc5e32356f)

kind/support

area/kv-cache