ocean
[Integration][AWS] | Improved Concurrency Control and Eliminated Likelihood of Thundering Herd
Description
What:
- Refactored semaphore implementation to effectively limit concurrency across tasks.
- Added a util, `semaphore_async_iterator`, to enable seamless control over concurrent executions per kind (can be re-used in other integrations).
- Removed iterative calls to the cache for tracking token expiry, reducing the likelihood of a thundering herd problem.
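For readers who have not opened the diff, here is a minimal sketch of what a util like `semaphore_async_iterator` could look like. The signature and body are assumptions made for illustration, not a copy of the merged code, and `fetch_pages` and `collect` are hypothetical helpers.

```python
import asyncio
from typing import Any, AsyncIterator, Callable, List


async def semaphore_async_iterator(
    semaphore: asyncio.Semaphore,
    function: Callable[[], AsyncIterator[Any]],
) -> AsyncIterator[Any]:
    """Hold one semaphore slot for the whole lifetime of the wrapped iterator.

    Assumed shape: the caller passes a zero-argument callable that returns
    an async iterator, so the iterator is only created once a slot is free.
    """
    async with semaphore:
        async for item in function():
            yield item


async def fetch_pages(account: str) -> AsyncIterator[List[str]]:
    # Hypothetical paginated fetch for one account (two pages each).
    for page in range(2):
        await asyncio.sleep(0)
        yield [f"{account}-item-{page}"]


async def collect(semaphore: asyncio.Semaphore, account: str) -> List[List[str]]:
    # Drain the limited iterator into a list of pages.
    wrapped = semaphore_async_iterator(semaphore, lambda: fetch_pages(account))
    return [page async for page in wrapped]


async def main() -> List[List[List[str]]]:
    # One semaphore shared by every account, so at most 2 iterators run at once.
    semaphore = asyncio.Semaphore(2)
    return await asyncio.gather(*(collect(semaphore, a) for a in ["a", "b", "c"]))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Because the slot is held until the iterator is exhausted, the limit applies to whole streams rather than to individual pages.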
Why:
- The previous implementation limited concurrency only within each task (account). This use case requires the concurrency limit to be global, that is, shared across accounts.
- Iterative cache calls could cause a thundering herd problem when the cache expires and the token needs to be refreshed.
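To illustrate the thundering herd concern, the sketch below shows one common guard: many concurrent callers hit an expired token, but only one refresh runs. This is not the change made in this PR (which removes the iterative cache calls); `TokenCache` and its methods are hypothetical names used only for the example.

```python
import asyncio
import time
from typing import Optional, Set, Tuple


class TokenCache:
    """Hypothetical token holder, not the integration's actual class."""

    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        self._token: Optional[str] = None
        self._expires_at = 0.0
        self._lock = asyncio.Lock()
        self.refresh_count = 0

    async def _refresh(self) -> str:
        # Stand-in for the real (slow) credential request.
        await asyncio.sleep(0.01)
        self.refresh_count += 1
        return f"token-{self.refresh_count}"

    async def get(self) -> str:
        # Fast path: the token is still valid, no lock needed.
        if self._token is not None and time.monotonic() < self._expires_at:
            return self._token
        async with self._lock:
            # Re-check after acquiring the lock: another caller may have
            # refreshed the token while this one was waiting.
            if self._token is None or time.monotonic() >= self._expires_at:
                self._token = await self._refresh()
                self._expires_at = time.monotonic() + self._ttl
            return self._token


async def main() -> Tuple[int, Set[str]]:
    cache = TokenCache(ttl_seconds=60)
    # 50 callers arrive while no valid token exists.
    tokens = await asyncio.gather(*(cache.get() for _ in range(50)))
    return cache.refresh_count, set(tokens)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Without the lock and the second check, each of the 50 callers would trigger its own refresh at the moment of expiry.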
How:
- Applied the semaphore correctly by wrapping task creation with the semaphore, ensuring proper concurrency control.
- Implemented unit tests using `pytest` to verify semaphore functionality and concurrency limits.
- Optimized cache usage by eliminating unnecessary iterative calls for tracking expiry.
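The kind of check described above can be sketched as a small `pytest`-style test. It is an illustration under assumptions, not the test added in this PR: `ConcurrencyProbe`, `resync_account`, and `run_all` are hypothetical names, and the test runs the event loop with `asyncio.run` so that no async pytest plugin is required.

```python
import asyncio


class ConcurrencyProbe:
    """Records the highest number of workers that were active at once."""

    def __init__(self) -> None:
        self.active = 0
        self.peak = 0

    async def work(self) -> None:
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0.01)  # simulated API call
        self.active -= 1


async def resync_account(
    semaphore: asyncio.Semaphore, probe: ConcurrencyProbe, regions: int
) -> None:
    async def limited() -> None:
        # Each unit of work takes a slot from the shared semaphore.
        async with semaphore:
            await probe.work()

    await asyncio.gather(*(limited() for _ in range(regions)))


async def run_all(limit: int, accounts: int, regions: int) -> int:
    # A single semaphore shared by all accounts makes the limit global.
    semaphore = asyncio.Semaphore(limit)
    probe = ConcurrencyProbe()
    await asyncio.gather(
        *(resync_account(semaphore, probe, regions) for _ in range(accounts))
    )
    return probe.peak


def test_global_limit_is_respected() -> None:
    # 4 accounts x 5 regions = 20 units of work, at most 3 in flight.
    assert asyncio.run(run_all(limit=3, accounts=4, regions=5)) == 3
```

If the semaphore were instead created inside `resync_account`, each account would get its own limit and the observed peak could reach `limit * accounts`, which is the behavior this PR moves away from.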
Type of change
Please leave one option from the following and delete the rest:
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] New Integration (non-breaking change which adds a new integration)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [x] Non-breaking change (fix of existing functionality that will not change current behavior)
- [ ] Documentation (added/updated documentation)
All tests should be run against the port production environment (using a testing org).
Core testing checklist
- [x] Integration able to create all default resources from scratch
- [x] Resync finishes successfully
- [x] Resync able to create entities
- [x] Resync able to update entities
- [x] Resync able to detect and delete entities
- [x] Scheduled resync able to abort existing resync and start a new one
- [x] Tested with at least 2 integrations from scratch
- [ ] Tested with Kafka and Polling event listeners
- [x] Tested deletion of entities that don't pass the selector
Integration testing checklist
- [x] Integration able to create all default resources from scratch
- [x] Resync able to create entities
- [x] Resync able to update entities
- [x] Resync able to detect and delete entities
- [x] Resync finishes successfully
- [x] If new resource kind is added or updated in the integration, add example raw data, mapping, and expected result to the `examples` folder in the integration directory.
- [x] If resource kind is updated, run the integration with the example data and check if the expected result is achieved
- [x] If new resource kind is added or updated, validate that live-events for that resource are working as expected
- [ ] Docs PR link here
Preflight checklist
- [x] Handled rate limiting
- [x] Handled pagination
- [x] Implemented the code in async
- [x] Support Multi account
Screenshots
NB: Warning log in the screenshot below is just for demonstration purposes.
API Documentation
Provide links to the API documentation used for this integration.