seldon-core Change service key to allow container services to always match correctly

What this PR does / why we need it:

Services created for each node in the inference graph all used the same label key, which means only one service per pod would be active. This is fine when the service orchestrator is in the same pod, since it does not use the services and calls localhost directly.

The change is to create a label per node name so services are always correctly finding their pods.
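The label change described above can be illustrated with a minimal sketch of equality-based selector matching. This is not the operator's code: the label key `app-svc`, the `svc-<node>` values, and the node names are hypothetical, chosen only to show why a shared key leaves at most one service matching a multi-container pod while a per-node key lets every service match.

```python
from typing import Dict, List


def selector_matches(selector: Dict[str, str], pod_labels: Dict[str, str]) -> bool:
    """Equality-based selector: every key/value pair must be present on the pod."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


def shared_key_labels(node_names: List[str]) -> Dict[str, str]:
    """Old behaviour (sketch): one label key for all nodes, so later nodes overwrite earlier ones."""
    labels: Dict[str, str] = {}
    for name in node_names:
        labels["app-svc"] = f"svc-{name}"  # hypothetical key, same for every node
    return labels


def per_node_labels(node_names: List[str]) -> Dict[str, str]:
    """New behaviour (sketch): one label key per node name, so nothing is overwritten."""
    return {f"app-svc-{name}": f"svc-{name}" for name in node_names}


# Two graph nodes running as containers in the same pod (hypothetical names).
nodes = ["transformer", "model"]

shared = shared_key_labels(nodes)
unique = per_node_labels(nodes)

# Which node services would select the pod under each labelling scheme?
matched_shared = [n for n in nodes if selector_matches({"app-svc": f"svc-{n}"}, shared)]
matched_unique = [n for n in nodes if selector_matches({f"app-svc-{n}": f"svc-{n}"}, unique)]

print("shared key:", matched_shared)
print("per-node key:", matched_unique)
```

With the shared key, only the service for the last node written still matches the pod; with a key per node name, each service's selector finds the pod.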

Adds unique service label
Adds an example notebook with tests for various transform-model-transform flows.

Which issue(s) this PR fixes:

Fixes #4036

Special notes for your reviewer:

Apr 08 '22 16:04 ukclivecox

/test integration

Apr 08 '22 16:04 ukclivecox

/test notebooks

Apr 08 '22 16:04 ukclivecox

/test integration

Apr 08 '22 17:04 ukclivecox

/test notebooks

Apr 11 '22 08:04 ukclivecox

/test integration

Apr 11 '22 09:04 ukclivecox

/test notebooks

Apr 11 '22 09:04 ukclivecox

It seems all integration tests passed (not sure why the run is marked as failed, as here it says pass) and only 1 notebook test failed (which is often flaky).

Edit: ok it seems the parallel tests are failing for the integration tests

Apr 19 '22 07:04 axsaucedo

/test integration

Apr 19 '22 07:04 axsaucedo

/test notebooks

Apr 19 '22 07:04 axsaucedo

OK, it seems the operator upgrade tests and the tracing test are still failing, so it may be a real issue. Rerunning to confirm.

Apr 19 '22 13:04 axsaucedo

/test integration

Apr 19 '22 13:04 axsaucedo

/test notebooks

Apr 19 '22 13:04 axsaucedo

/test integration

Apr 21 '22 07:04 axsaucedo

/test notebooks

Apr 21 '22 07:04 axsaucedo

/test notebooks

Apr 21 '22 15:04 axsaucedo

@cliveseldon I've been testing this locally, and it all seems to work well: the svcorch model does work in 1.14.0-dev. I am seeing some strange behaviour, though it's not clear whether it comes from this PR or whether it only affects me. I'm currently testing on one of Clive's branches, and when I run the helm upgrade it doesn't trigger a model container bounce for the upgrade from 1.13.1 -> 1.14.0-dev (but it does for 1.12.0 -> 1.13.1 as well as for 1.12.0 -> 1.14.0-dev). Is this behaviour consistent for you as well?

Apr 22 '22 07:04 axsaucedo

/test notebooks

Aug 27 '22 12:08 ukclivecox

Screenshot_2022-08-27_15-37-02

Aug 27 '22 14:08 ukclivecox

/test integration

Aug 27 '22 14:08 ukclivecox

/test integration

Sep 05 '22 10:09 axsaucedo

It seems the only flaky test is the rolling upgrade from 1.14.0. From discussion, rolling updates are expected to potentially fail, so we should be good to merge. Re-running to validate flakiness on this specific test.

/test integration

Sep 05 '22 15:09 axsaucedo

@cliveseldon: The following test failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| integration | 3c489380348757a6625e100c63cfc89469b7e0e6 | link | `/test integration` |

Sep 05 '22 18:09 seldondev

It seems it was indeed flaky, as the test that failed this time was test_label_update[1.13.1]. Looks good to merge. /approve

Sep 06 '22 09:09 axsaucedo

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: axsaucedo

Needs approval from an approver in each of these files:

~~OWNERS~~ [axsaucedo]

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

Sep 06 '22 09:09 seldondev
