camel-k icon indicating copy to clipboard operation
camel-k copied to clipboard

Make healt trait as default

Open squakez opened this issue 1 year ago • 8 comments

Considering the issue reported in #4977 and the improvements done in the monitoring, I wonder if we should enable by default the health trait. Right now, it's disabled by default, but I think we should change the behavior in order to have a more reliable experience when starting up an Integration which should be able to check at least the readiness probe before turning the Integration as running.

squakez avatar Jan 05 '24 10:01 squakez

@lburgazzoli wdyt?

squakez avatar Jan 05 '24 10:01 squakez

That looks like a good idea but I wonder how would we work the default probe path on non-quarkus runtime ? As it is activating by default the health trait mean having readiness and liveness active by default. For now the default probe path are hard-coded here : https://github.com/apache/camel-k/blob/cae9e899510656f71fecc0131cc8a0aaf6e8fcfe/pkg/trait/health.go#L34-L36

gansheer avatar Jan 05 '24 12:01 gansheer

In theory we should be safe as those endpoints should be exposed by Camel regardless the runtime used. I'd expect their availability also on non Quarkus runtimes.

squakez avatar Jan 05 '24 12:01 squakez

@squakez I think it makes a lot of sense @gansheer you are right, as today camel-k assumes the runtime is quarkus hence the health endpoint path is hard-coded to the quarkus default one so once we add support for an additional runtime, then we have to make those path runtime dependant.

As a side note, we should start supporeting/sugin the quarkus/spring-boot management port for non "business" traffic

lburgazzoli avatar Jan 08 '24 08:01 lburgazzoli

Enabling this trait by default (with readiness probe only) would have the following side effects [1]:

  • Debug feature is not working, so, we should disable the trait explicitly when running in debug mode
  • An errored route would be never in ready state (that should be fine, but we have tests that are checking the Integration is running and is logging certain errors)
  • It seems that the health is affecting the execution of cron (or the test which result in error). I don't have yet a clear understanding of what's happening.
  • It is affecting the master trait (here maybe for good as it seems that the switch from a Pod to another is immediate, so, only the test may require some attention)

[1] https://github.com/apache/camel-k/actions/runs/8849672619?pr=5096

squakez avatar Apr 29 '24 14:04 squakez

* It seems that the health is affecting the execution of cron (or the test which result in error). I don't have yet a clear understanding of what's happening.

The problem with the cron is the fact that our test is very quick to execute, therefore, the Pod is finished before reaching the Ready state. Maybe it makes sense that our tests have some artificial delay in order to simulate a workload that is above the few millis required to print a text to output.

cron-yaml-28574444-trd9t            0/1     Pending     0          0s
cron-yaml-28574444-trd9t            0/1     ContainerCreating   0          0s
cron-yaml-28574444-trd9t            0/1     Running             0          1s
cron-yaml-28574444-trd9t            0/1     Completed           0          3s

squakez avatar Apr 30 '24 08:04 squakez

We must reopen this because it turned out the change would be a breaking compatibility one. More details in #5516.

squakez avatar May 16 '24 14:05 squakez