daniel-salib

Results: 3 issues by daniel-salib

Accurate real-time request concurrency tracking. Added the /load API to retrieve the real-time concurrency count. Benchmark latency_results.json with load tracking: ` { "avg_latency": 0.21240155203461958, "latencies": [ 0.21297942200908437, 0.2120011480001267, 0.2135092600074131,...

frontend

The background task that decrements server_load can only trigger if there is a response. If the connection is terminated (i.e., canceled or timed out), then we need to ensure server load is...

frontend
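
The cancellation problem described in this issue can be illustrated with a minimal sketch. It is not the project's actual code: `LoadTracker`, `handle`, and `demo` are hypothetical names, and plain `asyncio` stands in for the real serving stack. The idea shown is that a decrement placed in a `finally` block runs whether the request completes, fails, is canceled, or times out, whereas a decrement tied only to a response can be skipped.

```python
import asyncio


class LoadTracker:
    """Hypothetical in-flight request counter, for illustration only."""

    def __init__(self) -> None:
        self.server_load = 0

    async def handle(self, work):
        # Count the request as in flight before awaiting the work.
        self.server_load += 1
        try:
            return await work
        finally:
            # Runs on success, on exceptions, and on cancellation or timeout,
            # so the counter cannot leak when no response is produced.
            self.server_load -= 1


async def demo():
    tracker = LoadTracker()

    # A request that completes normally.
    await tracker.handle(asyncio.sleep(0, result="ok"))
    load_after_success = tracker.server_load

    # A request that is cut off by a timeout before it can respond.
    try:
        await asyncio.wait_for(tracker.handle(asyncio.sleep(10)), timeout=0.01)
    except asyncio.TimeoutError:
        pass
    load_after_timeout = tracker.server_load

    return load_after_success, load_after_timeout


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the counter returns to zero in both cases. A real server would also need to cover the path where the client disconnects mid-stream, which depends on the framework in use.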

## Purpose This change enables streaming support for MCP tools when using GPT OSS. It extends the harmony utilities and response serving infrastructure to handle tool streaming, allowing tool calls...

frontend
gpt-oss