serving Don't scrape pods via HTTP if the activator is in path.

When the activator is in the routing path, scraping the pods is kinda useless as their information is effectively zeroed out, because all requests are considered "proxied", thus the activator metrics take precedence. We can use that to make our system more efficient in these cases!

I propose two steps to go about this:

1. Never scrape if TBC=-1 (TBC being the target burst capacity). The activator will always be in path in this case, so we never need to scrape.
2. Start/stop scraping depending on whether the activator is in path. This is a little more tricky, as we need to make sure that we start scraping in time once we notice the activator will be taken off the path.
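
To make the two steps concrete, here is a minimal sketch of the decision they describe. The function name `shouldScrapePods` and its parameters are hypothetical and only for illustration; this is not how the PodAutoscaler is actually wired up.

```go
package main

import "fmt"

// shouldScrapePods is a hypothetical helper, not part of Knative Serving.
// tbc is the revision's target burst capacity; activatorInPath reports
// whether the activator currently sits in the routing path.
func shouldScrapePods(tbc float64, activatorInPath bool) bool {
	// Step 1: with TBC=-1 the activator is always in path, so pod
	// metrics would never be used.
	if tbc == -1 {
		return false
	}
	// Step 2: otherwise scrape only while the activator is out of the path.
	return !activatorInPath
}

func main() {
	fmt.Println(shouldScrapePods(-1, true))   // false
	fmt.Println(shouldScrapePods(200, true))  // false
	fmt.Println(shouldScrapePods(200, false)) // true
}
```

The sketch deliberately leaves out the tricky part of step 2: scraping has to be running in time when the activator is taken off the path, which a plain boolean check does not capture.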

As the PodAutoscaler owns all of these decisions, this should be doable for us. As it's a good-first-issue, I'd love to see it land in two separate pieces as laid out though.

Note: I'm happy to mentor first-time contributors here. The changes needed should be fairly small in terms of number of files to touch.

Mar 19 '20 14:03 markusthoemmes

@markusthoemmes: This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-good-first-issue command.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Mar 19 '20 14:03 knative-prow-robot

I'd like to give it a try to look into Serving internals a bit more. /assign dsimansk

Mar 19 '20 15:03 dsimansk

/cc @taragu BTW, this is also one of the ways you can improve GSD, if we extend the activator reporting capabilities (not necessarily a good idea, but something to consider).

Mar 19 '20 15:03 vagababov

The rest of this is not a good first issue anymore and requires a fair bit of caution. I'm happy to help people through it, but it's much more involved. Also kinda blocked on #8377 to refactor the scraper in a way that makes controlling its lifetime more predictable.

Jun 29 '20 14:06 markusthoemmes

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

Sep 28 '20 01:09 github-actions[bot]

/lifecycle frozen

Probably still want this

Sep 28 '20 01:09 vagababov

/triage accepted

Mar 22 '21 05:03 evankanderson

/assign

Apr 29 '22 19:04 nader-ziada

Related PR: https://github.com/knative/serving/pull/13027

If there are no ready activators but there are ready pod endpoints, requests will bypass the activator even if TBC=-1.
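
A rough sketch of what that caveat means for the "never scrape if TBC=-1" rule, assuming the counts of ready activators and ready pods are available at decision time. `canSkipScraping` and its parameters are made up for illustration and are not taken from the related PR.

```go
package main

import "fmt"

// canSkipScraping is a hypothetical helper, not Knative Serving code.
// It only covers the TBC=-1 case discussed in this issue.
func canSkipScraping(tbc float64, readyActivators, readyPods int) bool {
	if tbc != -1 {
		// Outside the TBC=-1 case this sketch makes no claim; keep scraping.
		return false
	}
	// With no ready activators but ready pods, requests bypass the
	// activator, so its metrics cannot replace scraping.
	bypassed := readyActivators == 0 && readyPods > 0
	return !bypassed
}

func main() {
	fmt.Println(canSkipScraping(-1, 2, 3)) // true
	fmt.Println(canSkipScraping(-1, 0, 3)) // false
	fmt.Println(canSkipScraping(0, 2, 3))  // false
}
```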

Jun 29 '22 18:06 dprotaso

/unassign @nader-ziada

Jan 11 '24 16:01 dprotaso