
epic: cortex.cpp benchmark + Backend Infra

Open nguyenhoangthuan99 opened this issue 1 year ago • 0 comments

Currently, the example servers for cortex.llamacpp and cortex.tensorrt-llm achieve the following results with an average context length of 400:

  • cortex.llamacpp: 850 tokens/s
  • cortex.tensorrt-llm: 1450 tokens/s

We need to benchmark the cortex-cpp server and make sure its performance matches that of the example servers.
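A minimal sketch of how this comparison could be measured, not an existing benchmark script. The names `measure_throughput`, `relative_gap`, and `fake_generate` are hypothetical. `generate` stands in for whatever client call sends one prompt to the server under test and returns a token count. Whether tokens/s should count generated tokens only or prompt tokens as well is an assumption left to the caller.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def measure_throughput(
    generate: Callable[[str], int],
    prompts: Sequence[str],
    concurrency: int = 4,
) -> float:
    """Return aggregate tokens/s over all prompts.

    `generate` is a hypothetical client function: it sends one prompt to the
    server under test and returns the number of tokens to count.
    """
    start = time.perf_counter()
    # Issue requests from a small thread pool to simulate concurrent clients.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        token_counts = list(pool.map(generate, prompts))
    elapsed = time.perf_counter() - start
    return sum(token_counts) / elapsed


def relative_gap(candidate: float, baseline: float) -> float:
    """Fractional shortfall of candidate vs. baseline (0.0 means equal)."""
    return (baseline - candidate) / baseline


def fake_generate(prompt: str) -> int:
    # Stand-in for a real request to the server; replace with an actual client call.
    time.sleep(0.01)
    return 100


if __name__ == "__main__":
    tps = measure_throughput(fake_generate, ["hello"] * 8, concurrency=4)
    print(f"measured: {tps:.1f} tokens/s")
    # 850.0 is the cortex.llamacpp example-server figure quoted above.
    print(f"gap vs. baseline: {relative_gap(tps, 850.0):.1%}")
```

Running the same prompt set and concurrency against both the example server and cortex-cpp, then checking that `relative_gap` stays within an agreed tolerance, would give a like-for-like comparison.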

nguyenhoangthuan99, Aug 07 '24 02:08