localstack bug: The EventSourceListener class is inefficient and can cause the API to report innacurate results

The event source mapping API does not communcate directly with EventSourceListener classes. Instead, it works by maintaining and internal state of event source mapping configurations, and event source listeners must continuously check this state to respond to any changes. This is achieved within the EventSourceListener.start() method, which all subclasses must implement. This method is described as

Start listener in the background (for polling mode)

In practice, this method must do the following work continuously (once every second):

get all the current event source mapping objects using lambda_api.get_event_sources()
compare the state of these objects against the state of the currently running event source executor threads (an "executor thread" in this case meaning a thread that actually does the work of listening to the source and forwarding the data to a lambda function, e.g. "read data records from this kinesis shard and send them to this destination lambda")
make the state of the running threads match the state of the mapping configurations - e.g. if a there is a mapping that does not have a corresponding thread running, instantiate and run a new thread

This strategy is problematic because every single EventSourceListener type must continuously and repeatedly fetch the current state of the configured event source mappings in order to respond to any changes that may have occurred. This creates a lot of repeated work, and means that the get_event_source_mapping API may return incorrect results. For example, a user could delete a mapping then immediately use the API to get that same mapping and the state will show as ACTIVE as long as these operations occur within less than a second. This is problematic when running our integration test suite: https://github.com/localstack/localstack/pull/5807#discussion_r860104978

It also makes it challenging to respect changes to event source mappings for some source types: https://github.com/localstack/localstack/issues/5982

In order to solve this issues, I propose we introduce some kind of EventSourceListenerManager singleton class. This class should know and manage the state of all event source listeners. When a user creates/reads/updates/deletes an event source mapping using the API, it should delegate the manager class to make the appropriate adjustment to the underlying event source listener. This operation should happen synchronously, before the API returns a response to the client. Under this implementation, event source listeners should not have to check or even have access to the state of event source mappings. They should only have to control their executor threads.

May 02 '22 19:05 spwats

@HarshCasper I want to work on this

Oct 17 '22 20:10 soma2000-lang

Hi @soma2000-lang , 👋 thanks for expressing your interest in contributing to this issue!

Not sure if this would be in the scope of a Hacktoberfest contribution @HarshCasper , as this will require some pretty fundamental design decisions in LocalStack. We are currently in the process of revamping our Lambda provider, which also involves significant changes to the event source mappings logic (e.g., PR #7032 which touches parts of it). Will let @dominikschubert @dfangl chime in with more details and guidance.. Can we spec this out such that it would be suitable for Hacktoberfest and realistically shepherd a PR for it?

Oct 17 '22 21:10 whummer