
[V1][Metrics] add support for kv event publishing

Open alec-flowers opened this issue 7 months ago • 25 comments

RFC: KVBlocks and Metrics Publishing In Inference Frameworks

  • Added KVCacheEvent, BlockStored, BlockRemoved, and AllBlocksCleared msgspec classes
  • Created a queue in the BlockPool and wrote these events to it in the appropriate functions
  • Bubbled the events up to the scheduler, where they are appended to EngineCoreOutputs
  • Added kv_cache_events to EngineCoreOutputs
  • Wrote unit tests at the BlockManager level for basic functionality and at the EngineCore level for correct propagation and serialization over ZMQ
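To make the flow above concrete, here is a minimal sketch of the event types and the BlockPool queue. It is not the code in this PR: it uses stdlib dataclasses in place of msgspec so it runs without dependencies, and the field names (`block_hashes`, `parent_block_hash`), the `ToyBlockPool` class, and its method names are assumptions made for illustration. Only the four event class names and the `enable_kv_cache_events` flag come from the description.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class KVCacheEvent:
    """Base type for KV cache events (class name from the PR description)."""


@dataclass(frozen=True)
class BlockStored(KVCacheEvent):
    block_hashes: Tuple[int, ...]            # assumed field
    parent_block_hash: Optional[int] = None  # assumed field


@dataclass(frozen=True)
class BlockRemoved(KVCacheEvent):
    block_hashes: Tuple[int, ...]            # assumed field


@dataclass(frozen=True)
class AllBlocksCleared(KVCacheEvent):
    pass


class ToyBlockPool:
    """Hypothetical stand-in for the BlockPool: it only tracks hashes and events."""

    def __init__(self, enable_kv_cache_events: bool = False):
        self.enable_kv_cache_events = enable_kv_cache_events
        self._cached = set()
        self._events = deque()

    def _emit(self, event: KVCacheEvent) -> None:
        # Events are queued only when the feature flag is on.
        if self.enable_kv_cache_events:
            self._events.append(event)

    def cache_blocks(self, block_hashes, parent_block_hash=None) -> None:
        self._cached.update(block_hashes)
        self._emit(BlockStored(tuple(block_hashes), parent_block_hash))

    def evict_blocks(self, block_hashes) -> None:
        # Report only the blocks that were actually cached.
        removed = tuple(h for h in block_hashes if h in self._cached)
        self._cached.difference_update(removed)
        if removed:
            self._emit(BlockRemoved(removed))

    def reset(self) -> None:
        self._cached.clear()
        self._emit(AllBlocksCleared())

    def take_events(self) -> List[KVCacheEvent]:
        # Drain the queue; a scheduler would attach the result to its outputs.
        events = list(self._events)
        self._events.clear()
        return events
```

In this sketch the scheduler's role reduces to calling `take_events()` once per step and attaching the result to its output object; draining the queue on each call keeps an event from being reported twice.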

API

  • Add enable_kv_cache_events to EngineArgs
  • ~Add external_stat_loggers field to AsyncLLM API~ Covered by https://github.com/vllm-project/vllm/pull/14661

With https://github.com/vllm-project/vllm/pull/14661 and this PR, a third party can write a custom stat logger that consumes both engine Stats and Events and publishes them elsewhere.
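As a rough illustration of what such a third-party logger could look like, the sketch below forwards stats and KV events to an arbitrary sink. Everything here is hypothetical: the `PublishingStatLogger` class, its `record` method, the dict-based stats and events, and the JSON payload layout are assumptions, not the stat logger interface defined by vLLM or by the linked PR.

```python
import json
from typing import Any, Callable, Dict, List, Optional


class PublishingStatLogger:
    """Hypothetical logger that forwards stats and KV events to a sink."""

    def __init__(self, publish: Callable[[bytes], None]):
        # `publish` could wrap a socket send, a message-queue producer, etc.
        self._publish = publish
        self._seq = 0

    def record(self,
               stats: Optional[Dict[str, Any]],
               kv_events: List[Dict[str, Any]]) -> None:
        # Skip steps that produced nothing worth publishing.
        if stats is None and not kv_events:
            return
        payload = {"seq": self._seq, "stats": stats, "kv_events": kv_events}
        self._seq += 1
        self._publish(json.dumps(payload, sort_keys=True).encode("utf-8"))


# Usage: collect published messages in a list instead of a real transport.
sent: List[bytes] = []
stat_logger = PublishingStatLogger(sent.append)
stat_logger.record({"num_running_reqs": 2},
                   [{"type": "BlockStored", "block_hashes": [1, 2]}])
stat_logger.record(None, [])  # nothing to publish, so nothing is sent
```

The sequence number is one possible way for a downstream consumer to detect dropped messages; whether that is needed depends on the transport.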

alec-flowers • Apr 17 '25 02:04