Implement error node caching for improved TreeView user experience
Update
https://github.com/microsoft/vscode-cosmosdb/pull/2706#issuecomment-2958830169
Overview
This PR implements error node caching functionality to significantly improve user experience when dealing with failed tree nodes in the Azure Databases extension. Previously, when nodes failed due to authentication issues or connectivity problems, every tree refresh would retry the failed operation, causing delays and poor UX when multiple nodes were affected simultaneously.
Implementation
Core Changes
Enhanced BaseCachedBranchDataProvider:
- Added errorNodeCacheMap to store failed node states by parent ID
- Modified getChildren() to check error cache first, preventing repeated failed operations
- Added resetNodeErrorState() method to clear error states for retry functionality
- Enhanced cache cleanup in refresh() and pruneCache() methods
- Creates user-friendly error messages with retry buttons when operations fail
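A minimal sketch of the caching behaviour listed above, not the code from this PR. The `TreeElement` shape, the `ErrorCachingProvider` class, the `errorCache` field name, the `/error` ID suffix, and the injected `loadChildren` callback are all assumptions made for illustration; only `getChildren()`, `resetNodeErrorState()`, and the `/reconnect` suffix come from the description.

```typescript
// Hypothetical, simplified element shape; the extension's real tree elements carry more.
interface TreeElement {
    id: string;
    label: string;
    contextValue?: string;
}

type ChildLoader = (parent: TreeElement) => Promise<TreeElement[]>;

class ErrorCachingProvider {
    // Stand-in for the error cache described above: failed results keyed by parent ID.
    private readonly errorCache = new Map<string, TreeElement[]>();
    private readonly loadChildren: ChildLoader;

    constructor(loadChildren: ChildLoader) {
        this.loadChildren = loadChildren;
    }

    async getChildren(parent: TreeElement): Promise<TreeElement[]> {
        // Check the error cache first so a failed operation is not repeated on refresh.
        const cached = this.errorCache.get(parent.id);
        if (cached) {
            return cached;
        }
        try {
            return await this.loadChildren(parent);
        } catch (error) {
            const message = error instanceof Error ? error.message : String(error);
            const errorNodes: TreeElement[] = [
                { id: `${parent.id}/error`, label: `Error: ${message}` },
                { id: `${parent.id}/reconnect`, label: 'Click here to retry' },
            ];
            this.errorCache.set(parent.id, errorNodes);
            return errorNodes;
        }
    }

    // Clears the cached failure so the next getChildren() call attempts a fresh load.
    resetNodeErrorState(parentId: string): void {
        this.errorCache.delete(parentId);
    }
}
```

With this shape, a second `getChildren()` call for a failed parent returns the cached error nodes without invoking the loader again, and the loader only runs again after `resetNodeErrorState()`.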
New Retry Command:
- Created retryAuthentication command that clears error state and refreshes nodes
- Smart provider detection automatically determines which branch data provider to use
- Handles both direct element retry and retry button clicks
- Registered in command system as azureDatabases.retryAuthentication
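The retry flow could look roughly like the sketch below. It is an illustration under stated assumptions rather than the handler from this PR: the `RetryableProvider` and `RetryTarget` shapes, the `providersByMarker` lookup, and the rule of finding the provider through a `;`-separated `contextValue` are hypothetical. Registration under `azureDatabases.retryAuthentication` is left out so the snippet runs without the VS Code API.

```typescript
// Hypothetical shapes for illustration; not the extension's actual types.
interface RetryableProvider {
    resetNodeErrorState(nodeId: string): void;
    refresh(nodeId: string): void;
}

interface RetryTarget {
    id: string;
    contextValue?: string;
}

const RETRY_SUFFIX = '/reconnect';

// Returns true when a matching provider was found and asked to retry.
function retryAuthentication(
    element: RetryTarget,
    providersByMarker: ReadonlyMap<string, RetryableProvider>,
): boolean {
    // A click on the retry button passes the button node; map it back to the failed parent.
    const nodeId = element.id.endsWith(RETRY_SUFFIX)
        ? element.id.slice(0, -RETRY_SUFFIX.length)
        : element.id;

    // Assumed detection rule: one of the element's context markers names its provider.
    const markers = (element.contextValue ?? '').split(';');
    for (const marker of markers) {
        const provider = providersByMarker.get(marker);
        if (provider) {
            provider.resetNodeErrorState(nodeId);
            provider.refresh(nodeId);
            return true;
        }
    }
    return false;
}
```

Stripping the suffix lets the same handler serve both entry points: a retry on the failed element itself and a click on its retry button.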
Utility Functions:
- hasRetryNode(): Detects error nodes by checking for IDs ending with '/reconnect'
- createGenericElementWithContext(): Creates tree elements with custom context values
- Comprehensive test coverage with 14 test cases
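As a rough illustration of the detection rule, `hasRetryNode()` might look like the following. The parameter type, the `ElementWithId` interface, and the handling of `null`/`undefined` are assumptions; only the function name and the `/reconnect` suffix check come from the description above.

```typescript
// Hypothetical minimal element shape; only the id matters for this check.
interface ElementWithId {
    id: string;
}

// True when any child is a retry node, i.e. its ID ends with '/reconnect'.
function hasRetryNode(children: readonly ElementWithId[] | null | undefined): boolean {
    return children?.some((child) => child.id.endsWith('/reconnect')) ?? false;
}
```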
Automatic Benefits for Target Providers
Both target providers automatically inherit the error caching functionality:
- src/tree/azure-resources-view/cosmosdb/CosmosDBBranchDataProvider.ts
- src/tree/workspace-view/cosmosdb/CosmosDBWorkspaceBranchDataProvider.ts
No changes were needed to these files since they extend the enhanced BaseCachedBranchDataProvider.
User Experience Improvements
Before:
- Failed nodes would retry connection attempts on every tree refresh
- Multiple failing nodes caused cascading delays
- No clear way to retry specific failed operations
After:
- Failed nodes show cached error messages instantly on refresh
- Clear "Click here to retry" button with refresh icon for explicit retry
- Each node can be retried independently without affecting others
- Telemetry tracking for cache usage monitoring
Example Usage
When a connection fails due to invalid credentials:
- Error node displays: "Error: Authentication failed" + "Click here to retry" button
- Subsequent tree refreshes return cached error instantly (no retry attempts)
- User clicks retry button → clears error cache → attempts fresh connection
- If successful, normal tree structure returns; if failed, error is cached again
Testing
- 14 comprehensive test cases covering all error caching scenarios
- Tests for cache management, retry functionality, and edge cases
- Full TypeScript compilation verification
- No breaking changes to existing functionality
This implementation follows the exact pattern successfully used in the DocumentDB extension, providing the same user experience improvements while maintaining minimal code changes.
Fixes #2700.
[!WARNING]
Firewall rules blocked me from connecting to one or more addresses
I tried to connect to the following addresses, but was blocked by firewall rules:
- update.code.visualstudio.com
  - Triggering command: node /home/REDACTED/work/vscode-cosmosdb/vscode-cosmosdb/node_modules/.bin/vscode-test (dns block)
If you need me to access, download, or install something from one of these locations, you can either:
- Configure Actions setup steps to set up my environment, which run before the firewall is enabled
- Add the appropriate URLs or hosts to my firewall allow list
I updated the original implementation created by Copilot and simplified it.
While working on it, I had to make contextValue writable (non-read-only) so that a branch data provider can attach the context of the current branch to each element. That way, after a retryOperation call, we can tell from the element alone which branch needs to be refreshed.
Error nodes are now cached. In the example below, the time-to-error is brief because I only modified my authentication details, but the cache kicks in for every error, which improves the overall responsiveness of the tree view.
https://github.com/user-attachments/assets/21eba54b-0787-4f60-bf93-018ccf3e7010
@bk201 - Would you have time to review this one for the upcoming release? It helps with the responsiveness of the tree view around error nodes, but to make it work, I made contextValue a non-read-only value everywhere.
@tnaum-ms
What is the reason for making contextValue non-read-only? For many reasons, every public property should be read-only, or have a public getter and a private setter (public only if it is really required). At this point I don't see any reason to make it fully public. Nothing outside the class should change this property. If you really want to change it, the constructor and the getTreeItem() method are available for modifying this value.
@bk201 - I just saw your comment now. Sorry for the late reply!
I'm integrating our error handling where getChildren in the base data provider already does the job. I'm receiving the items as they are, without calling the constructor myself. Also, at that point, it's not yet a TreeItem... It's here:
https://github.com/microsoft/vscode-cosmosdb/blob/e06cf3badc576c149c587058bd9b672bcd273c6c/src/tree/BaseCachedBranchDataProvider.ts#L172-L178
I considered adding extra information to the data provider's context, but that would require changes to every data provider we have.
That’s why I ended up removing the readonly from contextValue... it was the simplest solution with the least amount of work.
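To make the trade-off concrete, here is a small sketch of what a writable contextValue allows the base provider to do with elements it receives from getChildren. The `BranchElement` shape, the `tagWithBranch` helper, the marker string, and the `;` separator are hypothetical and only illustrate the idea; this is not the code at the link above.

```typescript
// Hypothetical element shape. contextValue is writable here, which is the change under discussion.
interface BranchElement {
    id: string;
    contextValue?: string;
}

// Appends a branch marker to each child that does not already carry it.
function tagWithBranch(children: BranchElement[], branchMarker: string): BranchElement[] {
    for (const child of children) {
        const parts = child.contextValue ? child.contextValue.split(';') : [];
        if (parts.indexOf(branchMarker) === -1) {
            parts.push(branchMarker);
            // This assignment would not compile if contextValue were declared readonly.
            child.contextValue = parts.join(';');
        }
    }
    return children;
}
```

The elements are mutated after construction and before they become a TreeItem, which is why neither the constructor nor getTreeItem() was a convenient hook in this case.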
This PR is now outdated. I'll close it and we'll return to it when the time is right.