Add session-level metrics for NexusSessionResult telemetry tracking
Problem
We needed session-level metrics in LumberEventName.NexusSessionResult telemetry to track the total number of operations and signals emitted by all clients during a collaboration session. The existing implementation only provided individual client metrics without session-level aggregation.
Solution
This PR adds session-level counters that aggregate operations and signals across all clients in a session and includes them in NexusSessionResult telemetry when handleClientSessionTimeout is called.
Key Changes
- Extended session telemetry properties: Added sessionOpCount and sessionSignalCount to ICollaborationSessionTelemetryProperties
- Session-level aggregation: Modified CollaborationSessionTracker to accumulate client metrics when clients disconnect
- NexusSessionResult integration: Enhanced handleClientSessionTimeout to include session counts using CommonProperties enum values
- Efficient Redis usage: Session metrics are updated only on client disconnect, not per operation/signal
- Memory management: Automatic cleanup with session lifecycle
Implementation Flow
- Individual clients track ops/signals in the nexus layer (sessionOpCountMap/sessionSignalCountMap)
- On disconnect, client counts are passed to endClientSession()
- Session tracker accumulates counts in session telemetry properties
- When the session times out, handleClientSessionTimeout includes aggregated counts in NexusSessionResult
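The flow above can be sketched roughly as follows. This is a simplified, in-memory illustration and not the code in this PR: the class SessionCounterSketch, its method names other than endClientSession, and the reduced SessionTelemetryProperties shape are all hypothetical, and the real tracker persists session state rather than holding it in a field.

```typescript
// Hypothetical, reduced shape; the real ICollaborationSessionTelemetryProperties has more fields.
interface SessionTelemetryProperties {
    sessionOpCount: number;
    sessionSignalCount: number;
}

// Hypothetical helper class used only to illustrate the flow.
class SessionCounterSketch {
    // Per-client running tallies (stand-ins for sessionOpCountMap / sessionSignalCountMap).
    private readonly opCountMap = new Map<string, number>();
    private readonly signalCountMap = new Map<string, number>();

    // Session-level totals, updated only when a client disconnects.
    private readonly session: SessionTelemetryProperties = {
        sessionOpCount: 0,
        sessionSignalCount: 0,
    };

    // Called for each batch of ops a client submits; touches memory only.
    recordOps(clientId: string, count: number): void {
        this.opCountMap.set(clientId, (this.opCountMap.get(clientId) ?? 0) + count);
    }

    // Called for each batch of signals a client submits; touches memory only.
    recordSignals(clientId: string, count: number): void {
        this.signalCountMap.set(clientId, (this.signalCountMap.get(clientId) ?? 0) + count);
    }

    // On disconnect: fold the client's tallies into the session totals and drop the per-client entries.
    endClientSession(clientId: string): void {
        this.session.sessionOpCount += this.opCountMap.get(clientId) ?? 0;
        this.session.sessionSignalCount += this.signalCountMap.get(clientId) ?? 0;
        this.opCountMap.delete(clientId);
        this.signalCountMap.delete(clientId);
    }

    // On session timeout: return a copy of the totals to attach to the telemetry event.
    sessionResult(): SessionTelemetryProperties {
        return { ...this.session };
    }
}
```

In this sketch, a client that is still connected contributes nothing to the session totals until endClientSession runs for it, which mirrors the "update only on disconnect" behavior described above.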
Example Output
When a collaboration session ends, NexusSessionResult telemetry will include:
{
  "documentId": "doc-456",
  "tenantId": "tenant-789",
  "sessionOpCount": 42,
  "sessionSignalCount": 15,
  "durationInMs": 125000,
  "maxConcurrentClients": 3
}
This provides precise visibility into total collaboration activity per session, enabling better monitoring and analytics while maintaining performance by avoiding frequent Redis updates.
@copilot
The session I mean is LumberEventName.NexusSessionResult. The change should involve the following files, and possibly other related ones:
- packages/services/src/sessionTracker.ts
- packages/lambdas/src/nexus/index.ts
When handleClientSessionTimeout gets called, I want the op and signal counters to be recorded along with the other metrics properties of LumberEventName.NexusSessionResult. I am pretty sure you don't need to change the server/routerlicious/packages/lambdas/src/utils/telemetryHelper.ts file.
We want to keep a running tally for each socket connection, incremented in the "submitOp" and "submitSignal" event handlers, then add that to the session tally on disconnect or at regular intervals. Importantly, we don't want to update the Redis session data for every signal. Especially if the Presence package becomes more heavily utilized, that would increase Redis updates too much. There would also be a high chance of collisions, so we'd want to use Redis INCR in that case.
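As a rough sketch of that idea (not code from this PR): each connection keeps pending counts in memory and pushes only the delta to the shared store on disconnect or on a timer. The CounterStore interface, InMemoryCounterStore, ConnectionTally, and the key format are all hypothetical; the store's incrBy stands in for an atomic increment such as Redis INCRBY.

```typescript
// Hypothetical stand-in for a Redis client; only an atomic increment is needed here.
interface CounterStore {
    incrBy(key: string, delta: number): Promise<number>;
}

// In-memory implementation so the sketch runs without Redis.
class InMemoryCounterStore implements CounterStore {
    private readonly values = new Map<string, number>();
    public writes = 0; // counts store round trips, to show how few there are

    async incrBy(key: string, delta: number): Promise<number> {
        this.writes += 1;
        const next = (this.values.get(key) ?? 0) + delta;
        this.values.set(key, next);
        return next;
    }

    read(key: string): number {
        return this.values.get(key) ?? 0;
    }
}

// One instance per socket connection (hypothetical class).
class ConnectionTally {
    private readonly store: CounterStore;
    private readonly sessionKey: string;
    private pendingOps = 0;
    private pendingSignals = 0;

    constructor(store: CounterStore, sessionKey: string) {
        this.store = store;
        this.sessionKey = sessionKey;
    }

    // Called from the "submitOp" handler; no store access.
    onSubmitOp(count: number = 1): void {
        this.pendingOps += count;
    }

    // Called from the "submitSignal" handler; no store access.
    onSubmitSignal(count: number = 1): void {
        this.pendingSignals += count;
    }

    // Called on disconnect or from an interval timer.
    async flush(): Promise<void> {
        // Snapshot and reset before awaiting, so events arriving during the write are kept for the next flush.
        const ops = this.pendingOps;
        const signals = this.pendingSignals;
        this.pendingOps = 0;
        this.pendingSignals = 0;
        if (ops > 0) {
            await this.store.incrBy(`${this.sessionKey}:ops`, ops);
        }
        if (signals > 0) {
            await this.store.incrBy(`${this.sessionKey}:signals`, signals);
        }
    }
}
```

Because each flush sends an increment rather than a read-modify-write of the whole session record, two connections flushing at the same time would not overwrite each other's counts, assuming the real store applies the increment atomically.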
@copilot
The session I mean is LumberEventName.NexusSessionResult. The change should involve the following files, and possibly other related ones: ...
I've implemented the session-level metrics integration as requested. The changes ensure that when handleClientSessionTimeout is called, the LumberEventName.NexusSessionResult telemetry will include the accumulated sessionOpCount and sessionSignalCount from all clients in the session.
Key changes made:
- Extended ICollaborationSessionTelemetryProperties to include session-level counts
- Modified session tracker to accumulate client metrics when endClientSession is called
- Updated disconnect logic to pass individual client counts to the session tracker
- Session metrics are stored in Redis only on disconnect, not per operation/signal (as requested)
- Reverted unneeded telemetryHelper.ts changes per your feedback
The implementation tracks ops/signals per client at the nexus layer, then aggregates them at the session level when clients disconnect. When the session times out, these counts are automatically included in the NexusSessionResult telemetry via the session's telemetry properties.
Commit: c848bd4