AI jobs getting `senderNonce: too many values`
Describe the bug
Sending AI jobs for expensive models (such as DeepSeek), or sending LLM pipeline requests with a large `max_tokens` parameter (such as 163K tokens), causes a lot of payment tickets to be sent at once.
The Orch then logs this message:
Error receiving ticket sessionID=33_meta-llama/Meta-Llama-3.1-8B-Instruct recipientRandHash=7905016d8d201e4bb0d13f78234e107018b2effe42343325691a55844a1d54cf senderNonce=178: invalid ticket senderNonce: too many values sender=0x5bE44e23041E93CDF9bCd5A0968524e104e38ae1 nonce=178
There is currently a nonce cap of 150. We need the Orch to accept an unlimited number of nonces, or some way to manage this limit.
For instance, if LLM context windows keep growing or prices keep rising, which I believe they will, we will need a higher throughput of tickets to be redeemed.
This also shows up when multiple jobs are sent at the same time: the ticket nonces stack up and reach the limit quickly.
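To make the failure mode concrete, here is a toy model of a per-session nonce cap. It is only a sketch under assumptions: `NonceTracker` and `accept` are hypothetical names, and I am assuming the cap counts distinct nonces seen per `recipientRandHash`. The actual check in the Orch may count or reset differently.

```python
class NonceTracker:
    """Toy model of a nonce cap; NOT the Orch's actual implementation."""

    def __init__(self, max_nonces=150):
        # 150 is the cap quoted in this issue.
        self.max_nonces = max_nonces
        # Assumption: nonces are tracked per recipientRandHash.
        self.seen = {}

    def accept(self, recipient_rand_hash, nonce):
        nonces = self.seen.setdefault(recipient_rand_hash, set())
        if nonce in nonces:
            raise ValueError("invalid ticket senderNonce: replayed nonce")
        if len(nonces) >= self.max_nonces:
            # Every further ticket for this hash is rejected.
            raise ValueError("invalid ticket senderNonce: too many values")
        nonces.add(nonce)
```

Under this model, once a single session has used up its nonces, every further ticket fails until the Gateway gets fresh ticket params, which matches what we see when a burst of tickets arrives at once.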
To Reproduce
Steps to reproduce the behavior:
- Start AI Gateway
- Start Orchestrator with 7 USD per 1 million tokens
- Send LLM request with 163K `max_tokens` parameter
- See error
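A rough way to estimate how many tickets the repro above generates. This is a sketch, not how the Gateway actually computes payments: the function names are made up, the ETH price is a placeholder you have to supply, and it assumes the whole fee is paid in tickets of one fixed ticketEV.

```python
from fractions import Fraction

WEI_PER_ETH = 10**18
GWEI = 10**9


def job_fee_wei(usd_per_million_tokens, tokens, eth_price_usd):
    """Fee for one job in wei. eth_price_usd is an assumed exchange rate."""
    usd = Fraction(usd_per_million_tokens) * tokens / 1_000_000
    return int(usd / Fraction(eth_price_usd) * WEI_PER_ETH)


def tickets_needed(fee_wei, ticket_ev_wei):
    """Number of tickets of a fixed EV needed to cover the fee (rounded up)."""
    return -(-fee_wei // ticket_ev_wei)


def exceeds_cap(fee_wei, ticket_ev_wei, concurrent_jobs=1, cap=150):
    """True if the tickets for all concurrent jobs would exceed the nonce cap."""
    return tickets_needed(fee_wei, ticket_ev_wei) * concurrent_jobs > cap


# Repro numbers: 7 USD per 1M tokens, 163K tokens.
# The ETH price of 3000 USD is a placeholder, not a real quote.
fee = job_fee_wei(7, 163_000, eth_price_usd=3000)
print(tickets_needed(fee, 8 * GWEI), exceeds_cap(fee, 8 * GWEI))
```

The point is just that the ticket count scales with fee / ticketEV and multiplies with concurrent jobs, so any fixed cap gets hit once jobs are expensive enough.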
I believe this is a long-standing bug: when many tickets are sent at once, we sometimes get this error.
We can adjust ticketEV to receive fewer tickets, which is what we've been doing so far.
I think it would be nice for Gateways to be able to use a range of ticket EV instead of one number set by the orchestrator. Gateway operators can set up a separate gateway for each pipeline to adjust which ticketEV parameters are acceptable. Orchestrators have to pick one ticketEV that applies to all pipelines, which leads to significant overpayment on some pipelines and too many tickets on others.
With a range of ticketEV, a large maxTicketEV (say 10000 gwei) could be set to cover the larger payments needed for things like text-to-video. minTicketEV could be 8 gwei, which is the current default ticketEV, to support the smaller payment streaming used in transcoding and live video AI. This would help minimize overpayments if the Gateway can right-size the ticket EV to get within 10-20% of the payment amount needed.
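Something like the following is what I have in mind for right-sizing. It is a sketch only: `choose_ticket_ev` and `overpayment_ratio` are hypothetical helpers, the min/max defaults are the 8 gwei and 10000 gwei figures from above, and a real implementation would also have to respect whatever the Orch advertises.

```python
GWEI = 10**9


def choose_ticket_ev(payment_wei, min_ev_wei=8 * GWEI, max_ev_wei=10_000 * GWEI):
    """Pick the ticket EV closest to the payment, clamped to the allowed range."""
    return max(min_ev_wei, min(payment_wei, max_ev_wei))


def overpayment_ratio(payment_wei, ticket_ev_wei):
    """Fraction overpaid when the payment is rounded up to whole tickets."""
    tickets = -(-payment_wei // ticket_ev_wei)  # ceiling division
    paid = tickets * ticket_ev_wei
    return (paid - payment_wei) / payment_wei


# A payment inside the range is covered by exactly one ticket.
ev = choose_ticket_ev(500 * GWEI)
print(ev // GWEI, overpayment_ratio(500 * GWEI, ev))
```

Payments below minTicketEV still get rounded up to the minimum, and payments above maxTicketEV are split into several tickets, so the 10-20% target only holds when the range is wide enough for the pipeline.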
A next step that would help visualize this is metrics or log lines that indicate the balances remaining when payment/balance sessions are cleaned up.
Hmm, and what about the idea that the ticketEV would be set by the Gateway instead of the Orchestrator? Then we could optimize to "always" send only 1 ticket. Wdyt?
@Titan-Node this was updated to 600 recently. Will this cover your needs?
@leszko I think setting the ticketEV at the gateway sounds like a good idea, but I also think Orchestrators should be able to set a minimum to avoid receiving extremely small tickets.