Refactor internal components
Proposal
I've been thinking about our code and I think that it's a bit coupled, making difficult to manage the life cycle of the items internally sometimes. (During the last months we have had several issues related with the scaler lifecycle, not properly closed connections, etc) and I don't like so much some parts of the core code. IMHO there are mixed responsibilities about what each thing does, and we have some duplications (almost the same code) because the querying way when internal scale loop is slightly different from the HPA querying way. Although the goal and the main idea is quite similar, the code is duplicated. This idea has been in my head for some time and recent issues related with the scaler lifecycle and unclosed connections have pushed it out.
My proposal is to split the responsibilities in different layers (I'm not speaking about tons of interfaces, DTOs and clean code patterns at all, just from resposibities pov), where we manage each layer isolated from others, not sharing the internal information to avoid coupling them. Disclaimer: Please, ignore the term "service", I used it to make a difference between our controllers (strongly related with k8s controllers) and the controllers/layers/services.
The main idea is that instead of querying metrics in different ways, we have a single layer (scaler service) where all the scalers of an scalableobject (SO/SJ) live, and other layers on top that consume this scaler layer but totally as blackbox. Let's explain the SO flow:
- A ScaledObject is created
- ScaledObject controller creates the HPA and calls the scaledobject service to register the ScaledObject
- The ScaledObject service pass the ScaledObject to the scalableobject service (a common service for SO and SJ) and starts a loop that queries every polling interval the status of the ScaledObject (Active/Inactive).
- If the scaler changes the status, it scales from/to 0
- The scalableobject service parses the scaledobject received and call to the scaler service to create the required scalers.
- If the scaler configurations/types/etc changes, this service call to the scaler service to update/remove the scalers
- This service can be queried about the scalableobject state or just about a single metric
- This service calls to fallback layer to calculate the fallback if the SO defines it and there is a fail on a scaler
- This service calls to the formula calculator if required
- The scaler service is the responsible for creating/refreshing/deleting the scalers when it's required, but also is the responsible for querying the specific scaler.
- Scalers are treated as single items, not knowing about other scalers in the SO/SJ
- This service stores (or not) the metric based on an arg, e.g: we could use an enum to defice if we have to READ,READ&STORE or READFROMCACHE
- The service handles scaler recreation on demand when it's required (e,g: failures)
- The service handles the scaler creation, including the arrange of envs from auth sources
This is a good improvement that makes the structure and responsibilities cleaner. I just have a few detailed questions:
- Do the ScaledJob controller and ScaledObject controller each have a looping structure? Will every ScaledJob/ScaledObject start a loop?
- Is it correct that the cached metric value of the scaler will only be refreshed when a ScaledJob/ScaledObject calls after the poll interval or when the metric server is queried?
- The formula calculator and fallback layer are called by the scalableobject service, but each ScaledJob/ScaledObject's definition is different. Will the scalableobject service query back to the ScaledJob/ScaledObject controller to retrieve the definition, or will the ScaledJob/ScaledObject controller pass each ScaledJob/ScaledObject's formula and fallback definition to the formula calculator and fallback layer during creation?
Do the ScaledJob controller and ScaledObject controller each have a looping structure? Will every ScaledJob/ScaledObject start a loop?
Yeah, that's the current approach. Atm, ScaledObjectController and ScaledJobController start a loop per object. My suggestion follows that approach too
Is it correct that the cached metric value of the scaler will only be refreshed when a ScaledJob/ScaledObject calls after the poll interval or when the metric server is queried?
Yes, that's the current approach to reduce the amount of request to upstream
The formula calculator and fallback layer are called by the scalableobject service, but each ScaledJob/ScaledObject's definition is different. Will the scalableobject service query back to the ScaledJob/ScaledObject controller to retrieve the definition, or will the ScaledJob/ScaledObject controller pass each ScaledJob/ScaledObject's formula and fallback definition to the formula calculator and fallback layer during creation?
I think that the service should have the ScaledObject/Job and calculate all the things from there, personally, I think that we should try to make the calls in one direction, giving all the information that the next layer needs from the current one.
@kedacore/keda-core-maintainers @kedacore/keda-core-contributors WDYT about this?