
Livepeer.Cloud SPE - Proposal #2 - Enable Single Orchestrator AI Job Testing Support for Gateway Nodes

Open mikezupper opened this issue 1 year ago • 3 comments

What does this pull request do? Explain your changes. (required)

Provides features that enable AI job testing through gateway nodes. The gateway node has several hard-coded timeout and cache values that need to be configurable so that a gateway can send an AI job to a specific orchestrator for testing.

Specific updates (required)

  • Introduce several new startup flags to enable a gateway node to support AI Job testing.
    • aiTesterGateway - a boolean that enables the gateway node to bypass AI session caching. Defaults to false so that the behavior of a default gateway node does not change.
    • aiSessionTimeout - a duration value that sets the AI session timeout. The default is 600s to match the existing hard-coded value.
    • webhookRefreshInterval - a duration value that sets how long responses from orchWebhookUrl are cached. The default is 60s to match the existing hard-coded value.
    • LIVEPEER_OS_HTTP_TIMEOUT - an environment variable rather than a flag, because the code that reads it is standalone and cannot use the common livepeer flags. It is a duration value that sets the download timeout for AI assets (.mp4 files, etc.). The default is 4s to match the existing hard-coded value.
  • A new HTTP endpoint was added to fetch the AI capabilities of each orchestrator (/getOrchestratorAICapabilities). This endpoint provides the AI Job Tester with information on all AI models available on all orchestrators.

How did you test each of these updates (required)

Each of the new flags and timeout values was manually tested in our development environments. They are also deployed to the testing and production Livepeer.Cloud SPE AI Gateway nodes.

Does this pull request close any open issues? No

Checklist:

  • [x] Read the contribution guide
  • [x] make runs successfully
  • [x] All tests in ./test.sh pass
  • [ ] README and other documentation updated
  • [ ] Pending changelog updated

mikezupper avatar Nov 07 '24 14:11 mikezupper

@thomshutt this pull request has overlap with https://github.com/livepeer/go-livepeer/pull/3246 which followed https://github.com/livepeer/go-livepeer/pull/3052. It can be merged after that one is merged and the pull request is rebased.

rickstaa avatar Nov 22 '24 00:11 rickstaa

@mikezupper @thomshutt what's the plan for this PR? Do we plan to review/merge/productionize it?

I can review and help with that, but I'd like to know what's the plan in the context of the AI Video work.

leszko avatar Jan 02 '25 08:01 leszko

@mikezupper we did merge https://github.com/livepeer/go-livepeer/commit/cc5663f2a38016daa0bb5c05a5ce7ebf47afc69b, which included some of this functionality. Maybe you could rebase this pull request and remove the duplication. I think after that is done it should not be hard to review and merge it in.

rickstaa avatar May 16 '25 23:05 rickstaa