atmos
Add file locking support for cache operations on Windows
Description
The current XDG cache implementation (introduced in the feature/xdg-cache-implementation branch) does not implement file locking on Windows. This was a deliberate decision to avoid timeout issues that were observed during testing, but it means that concurrent cache operations on Windows may experience race conditions.
Current Behavior
- On Unix-like systems: File locking is implemented using github.com/gofrs/flock to ensure atomic cache operations
- On Windows: No file locking is implemented (cache_lock_windows.go simply executes operations without locking)
- Concurrent cache tests are skipped on Windows since locking is disabled
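To make the Windows behavior concrete, here is a minimal sketch of what a no-op lock wrapper looks like. The function name `withCacheFileLock` and its signature are assumptions for illustration; the actual code in cache_lock_windows.go may be shaped differently.

```go
package main

import "fmt"

// withCacheFileLock is a hypothetical name, not necessarily the project's API.
// This no-op variant ignores the lock path and runs fn directly, so nothing
// serializes concurrent callers across processes.
func withCacheFileLock(lockPath string, fn func() error) error {
	_ = lockPath // intentionally unused: no lock is taken
	return fn()
}

func main() {
	err := withCacheFileLock("cache.yaml.lock", func() error {
		fmt.Println("updating cache without holding a lock")
		return nil
	})
	if err != nil {
		fmt.Println("cache update failed:", err)
	}
}
```

Because the wrapper never blocks, it cannot time out, but two processes can run their read-modify-write sequences at the same time.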
Impact
- The cache is used for non-critical functionality (update checks, telemetry)
- Race conditions on Windows could potentially result in:
  - Corrupted cache files (though atomic writes help mitigate this)
  - Lost updates when multiple processes update the cache simultaneously
  - Empty or partially written cache files being read
Potential Solutions
- Investigate Windows-specific file locking mechanisms that don't cause timeout issues
- Implement a mutex-based approach using a lock file
- Use Windows-specific APIs for file locking (e.g., LockFileEx)
- Consider using a different locking library that has better Windows support
Related Code
- pkg/config/cache_lock_windows.go - Current no-op implementation
- pkg/config/cache_lock_unix.go - Unix implementation using flock
- pkg/config/cache_test.go - Tests that skip on Windows due to lack of locking
- pkg/config/cache_atomic_test.go - Concurrent tests that skip on Windows
Acceptance Criteria
- [ ] Implement file locking on Windows that doesn't cause timeout issues
- [ ] Re-enable concurrent cache tests on Windows
- [ ] Ensure no deadlocks or excessive delays in cache operations
- [ ] Maintain backward compatibility with existing cache files
Additional Context
This issue was identified during the implementation of XDG cache directory support. The decision was made to ship without Windows locking support since:
- The cache is for non-critical functionality
- Windows timeout issues were blocking the PR
- The atomic write implementation provides some protection against corruption
Reference PR: [XDG Cache Implementation PR]