feat(anthropic): update model params + better max_token handling
Updates model `const`s to link to the latest models (which didn't exist prior). Also adds a function to calculate a default max token size based on the model name when not provided.
Rant
Anthropic unfortunately has two annoying thorns:
- Anthropic's docs about models only list the "latest" snapshots for each of the available models. They also only recently added the `-latest` alias as an override over the specific snapshot number, though only for the top models. This means that the constants can't reliably keep a history of all of the available snapshots, even though the docs recommend using snapshots for stability purposes.
  - The current recommendation by Rig should be to use `-latest` for testing but leverage a specific snapshot of a model for stability purposes in production.
- Anthropic's endpoint for `messages` requires a `max_tokens` argument to be specified, which is unlike other providers. This is even more frustrating since this `max_tokens` argument has a different cap per model being used (and specifying too high a number causes the request to fail).
  - Hardcoding specific models to their specific `max_tokens` is a non-starter since users using specific snapshot models (as the docs recommend) wouldn't match.
  - Using a lower cap like `4096` would cut off half of the available token space for the most common models.
  - Requiring a `max_tokens` argument to be specified at compile time (on `AgentBuilder` and manually when creating `CompletionRequestBuilder`s) is also tough because it would require some really ugly refactoring to enforce that these builders can only build specifically for Anthropic clients (basically a custom `AnthropicAgentBuilder` and an `AnthropicCompletionRequestBuilder`).
    - This is theoretically better, since it accepts code duplication in exchange for the best compile-time DX, but this area might get refactored soon and I'd rather not add more trouble to that implementation.
The solution to the last thorn is to match the beginning of the model string to the model names and hardcode a default value for token size based on that. The user can override this by specifying `max_tokens` on `AgentBuilder`, etc. This would error on `agent.completion` if a default max token cannot be determined, most likely due to an invalid Anthropic model (which probably doesn't exist).
There might be a better solution, but I deemed this "good enough" after going in circles a bit.
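
For reviewers, here is a minimal sketch of the prefix-matching idea described above. It is not the code in this PR: the function names `default_max_tokens` and `resolve_max_tokens`, the `String` error type, and the prefix/cap table are all assumptions made for illustration.

```rust
/// Hypothetical helper: choose a default `max_tokens` from the start of the
/// model name. The prefixes and caps are illustrative assumptions only.
fn default_max_tokens(model: &str) -> Option<u64> {
    // Matching on a prefix lets dated snapshots and `-latest` aliases of the
    // same model family share a single table row.
    const DEFAULTS: &[(&str, u64)] = &[
        ("claude-3-5-sonnet", 8192),
        ("claude-3-opus", 4096),
        ("claude-3-haiku", 4096),
    ];
    DEFAULTS
        .iter()
        .find(|(prefix, _)| model.starts_with(*prefix))
        .map(|&(_, cap)| cap)
}

/// Hypothetical resolution step: an explicit user value always wins, then the
/// prefix-based default, and otherwise the request is rejected with an error.
fn resolve_max_tokens(model: &str, user_max_tokens: Option<u64>) -> Result<u64, String> {
    user_max_tokens
        .or_else(|| default_max_tokens(model))
        .ok_or_else(|| format!("`max_tokens` must be set for Anthropic model `{model}`"))
}
```

With this shape, `resolve_max_tokens("claude-3-5-sonnet-latest", None)` falls back to the table, while an unknown model name only errors when the user has not supplied a value themselves.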