FastDeploy
[LLM] refine llm infer
PR types
Description
- decouple Docker and source files
- add basic tests and benchmarks
- refine Triton and engine modules