ggml
GPT Benchmarks
GPT models without a KV cache must recompute the attention keys and values for the entire context at every decoding step, so the time per generated token grows quadratically with context length rather than linearly.
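To illustrate the recomputation being asked about (this is not ggml's actual implementation, just a minimal single-head numpy sketch where keys and values are taken directly from the inputs for brevity), compare decoding with and without a cache:

```python
import numpy as np

def attention(q, K, V):
    # q: (d,), K, V: (t, d) — scaled dot-product attention over t cached steps
    scores = K @ q / np.sqrt(q.shape[0])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

rng = np.random.default_rng(0)
d, n = 8, 6
xs = rng.normal(size=(n, d))  # toy token states; k = v = x here (hypothetical)

# Without a KV cache: rebuild K and V from all previous tokens every step,
# so the work per token grows with the context length.
out_nocache = []
for t in range(1, n + 1):
    K = xs[:t]
    V = xs[:t]
    out_nocache.append(attention(xs[t - 1], K, V))

# With a KV cache: append one new row per step and reuse everything else.
K_cache = np.empty((0, d))
V_cache = np.empty((0, d))
out_cache = []
for t in range(n):
    K_cache = np.vstack([K_cache, xs[t:t + 1]])
    V_cache = np.vstack([V_cache, xs[t:t + 1]])
    out_cache.append(attention(xs[t], K_cache, V_cache))

# Both strategies produce identical outputs; only the cost differs.
assert np.allclose(out_nocache, out_cache)
```

The outputs match exactly; caching only removes redundant work, which is why per-token latency in benchmarks depends heavily on whether a cache is used.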
So, for your benchmarks: how many tokens were generated, and out of how many total tokens (prompt plus generated)? And does ggml support KV caching?