PQCache
Question about latency test
I want to measure the latency, but the repository doesn't seem to include nah_input.jsonl. How can I run test_latency.py without it? Thanks!
I wanted to do the same, so I removed the dependency on nah_input.jsonl and generated random inputs instead. However, I still encounter errors. It seems that llama2 is not working properly.
Sorry for the late response. The latency evaluation script was outdated. The final latency results presented in the paper are based on Mistral-7B-Instruct-v0.2.
We have updated the latency evaluation script and now provide a demo input sample in test_input.txt. Feel free to contact us if you encounter any further issues.
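For anyone who still wants to try the random-input approach mentioned earlier in this thread, here is a minimal sketch of the general idea. It is not the repository's test_latency.py: the helper names `make_random_token_ids` and `measure_latency`, the `vocab_size` default, and the `dummy_model` stand-in are all hypothetical and only illustrate the warm-up-then-time pattern.

```python
import random
import statistics
import time


def make_random_token_ids(seq_len, vocab_size=32000, seed=0):
    """Return seq_len random token ids in [0, vocab_size).

    vocab_size is a placeholder; use your tokenizer's real vocabulary size.
    """
    rng = random.Random(seed)  # seeded so runs are reproducible
    return [rng.randrange(vocab_size) for _ in range(seq_len)]


def measure_latency(fn, inputs, warmup=2, repeats=5):
    """Time fn(inputs) after a few untimed warm-up calls."""
    for _ in range(warmup):
        fn(inputs)  # warm-up: caches, lazy initialization, etc.
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(inputs)
        timings.append(time.perf_counter() - start)
    return {
        "mean_s": statistics.mean(timings),
        "min_s": min(timings),
        "n": len(timings),
    }


def dummy_model(token_ids):
    """Hypothetical stand-in for a real forward pass."""
    return sum(token_ids) % 7


if __name__ == "__main__":
    ids = make_random_token_ids(seq_len=1024)
    print(measure_latency(dummy_model, ids))
```

To time a real model, replace `dummy_model` with a function that runs your forward or generate call. When the work runs on a GPU, the device usually needs to be synchronized (for example with `torch.cuda.synchronize()`) before each clock read, otherwise the measurement mostly reflects kernel launch time.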