Livepeer.Cloud SPE - Proposal #2 - Enable Single Orchestrator AI Job Testing Support for Gateway Nodes
What does this pull request do? Explain your changes. (required)
Provide features that enable AI job testing through gateway nodes. The gateway node has several hard-coded timeout and cache values that need to be configurable so that a gateway can send an AI job to a specific orchestrator for testing.
Specific updates (required)
- Introduce several new startup flags and one environment variable to enable a gateway node to support AI job testing.
  - aiTesterGateway - a boolean that enables the gateway node to bypass AI session caching. Defaults to false so that the behavior of the default gateway node does not change.
  - aiSessionTimeout - a duration value that sets the AI session timeout. The default is 600s to match the existing hard-coded value.
  - webhookRefreshInterval - a duration value that sets the refresh interval for cached orchWebhookUrl responses. The default is 60s to match the existing hard-coded value.
  - LIVEPEER_OS_HTTP_TIMEOUT - an environment variable rather than a flag, because the code that reads it is standalone and cannot use the common livepeer flags. It is a duration value that sets the download timeout for AI assets (.mp4 files, etc.). The default is 4s to match the existing hard-coded value.
- A new HTTP endpoint, /getOrchestratorAICapabilities, was added to fetch the AI capabilities of each orchestrator. This endpoint provides the AI Job Tester with information on the AI models available on all orchestrators.
How did you test each of these updates (required)
Each of the new flags and timeout values was manually tested in our development environments. They are also deployed to the testing and production Livepeer.Cloud SPE AI Gateway nodes.
Does this pull request close any open issues? No
Checklist:
- [ X] Read the contribution guide
- [ X] make runs successfully
- [ X] All tests in ./test.sh pass
- [ ] README and other documentation updated
- [ ] Pending changelog updated
@thomshutt this pull request has overlap with https://github.com/livepeer/go-livepeer/pull/3246 which followed https://github.com/livepeer/go-livepeer/pull/3052. It can be merged after that one is merged and the pull request is rebased.
@mikezupper @thomshutt what's the plan for this PR? Do we plan to review/merge/productionize it?
I can review and help with that, but I'd like to know what's the plan in the context of the AI Video work.
@mikezupper we did merge https://github.com/livepeer/go-livepeer/commit/cc5663f2a38016daa0bb5c05a5ce7ebf47afc69b which included some of this functionality. Maybe you could rebase this pull request and remove the duplication. I think once that is done it should not be hard to review and merge.