Results 4 issues of Jingyuan

### 🐛 Describe the bug requestTrace currently reports data only when requests are not using streaming. It should work for both streaming and non-streaming vLLM requests. ### Steps to...

### Summary Given access to the GPU profile used by the GPU optimizer, we propose adding a new routing policy that utilizes performance profiles per input/output token pattern to...

kind/enhancement
priority/important-soon
area/heterogeneous

## Pull Request Description Refactoring for cache: 1. Merge the multiple pod, model, and metric mappings by adding Pod metadata and Model metadata and using two main thread-safe registries for metadata....

### Summary Refactoring for cache: 1. Merge the multiple pod, model, and metric mappings by adding Pod metadata and Model metadata and using two main thread-safe registries for metadata. 2. Eliminate...

area/gateway
priority/critical-urgent
area/heterogeneous
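
The cache refactoring summarized above describes two thread-safe registries, one for Pod metadata and one for Model metadata. The snippet is truncated, so the following is only a rough sketch of that idea and not the project's implementation: the names `Registry`, `PodMeta`, `ModelMeta`, and `link_pod_model` are hypothetical, and Python is used purely for illustration.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, Dict, Generic, Optional, Set, TypeVar

K = TypeVar("K")
V = TypeVar("V")


class Registry(Generic[K, V]):
    """Hypothetical lock-guarded map; every read and write holds the lock."""

    def __init__(self) -> None:
        self._lock = threading.RLock()
        self._items: Dict[K, V] = {}

    def get(self, key: K) -> Optional[V]:
        with self._lock:
            return self._items.get(key)

    def update(self, key: K, factory: Callable[[], V],
               mutate: Callable[[V], None]) -> V:
        # Create the entry if it is missing, then mutate it under the same lock.
        with self._lock:
            item = self._items.get(key)
            if item is None:
                item = factory()
                self._items[key] = item
            mutate(item)
            return item

    def delete(self, key: K) -> None:
        with self._lock:
            self._items.pop(key, None)


@dataclass
class PodMeta:
    # Assumed fields: one record replaces separate pod->model and pod->metric maps.
    name: str
    models: Set[str] = field(default_factory=set)
    metrics: Dict[str, float] = field(default_factory=dict)


@dataclass
class ModelMeta:
    # Assumed field: the pods currently serving this model.
    name: str
    pods: Set[str] = field(default_factory=set)


def link_pod_model(pods: Registry, models: Registry,
                   pod_name: str, model_name: str) -> None:
    """Record that a pod serves a model in both registries."""
    pods.update(pod_name, lambda: PodMeta(pod_name),
                lambda p: p.models.add(model_name))
    models.update(model_name, lambda: ModelMeta(model_name),
                  lambda m: m.pods.add(pod_name))
```

Keeping each mutation inside `update` means callers never touch a metadata record without holding its registry lock. The two registries are updated one after the other rather than atomically, which is a simplification in this sketch.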