haystack-core-integrations
AmazonBedrockGenerator shouldn't load tokenizer if truncate is set to False.
Is your feature request related to a problem? Please describe.
When the truncate argument is set to False, DefaultPromptHandler still downloads an unnecessary GPT2 tokenizer, which is a problem on a read-only filesystem (e.g. AWS Lambda).
Check https://github.com/deepset-ai/haystack-core-integrations/blob/cf52ce94c9a1ae7f33bbde14c8639e2058262707/integrations/amazon_bedrock/src/haystack_integrations/components/generators/amazon_bedrock/generator.py#L148-L156
Describe the solution you'd like
If truncate is set to False, the GPT2 tokenizer used for token estimation shouldn't be loaded.
Describe alternatives you've considered
Simply add a condition so that the tokenizer is only loaded when truncate is set to True:
# Truncate prompt if prompt tokens > model_max_length - max_length
# (max_length is the length of the generated text)
# we use GPT2 tokenizer which will likely provide good token count approximation
if self.truncate:
    self.prompt_handler = DefaultPromptHandler(
        tokenizer="gpt2",
        model_max_length=model_max_length,
        max_length=self.max_length or 100,
    )
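For illustration, a minimal sketch of how the guarded construction could look in the generator's __init__, together with a guard at the point of use so an unset handler is never dereferenced. The surrounding class, the import path, and the _prepare_prompt helper are assumptions for the sketch; only the constructor arguments shown above come from the source:

# Illustrative stand-in for the relevant part of AmazonBedrockGenerator.__init__.
class BedrockGeneratorSketch:
    def __init__(self, truncate: bool = True, model_max_length: int = 4096, max_length: int = 100):
        self.truncate = truncate
        self.max_length = max_length
        # Leave the handler unset unless truncation is requested, so no tokenizer
        # files are downloaded on read-only filesystems (e.g. AWS Lambda).
        self.prompt_handler = None
        if self.truncate:
            # Import lazily so the tokenizer dependency is only touched when needed.
            # NOTE: import path is an assumption, not confirmed against the repo.
            from haystack_integrations.components.generators.amazon_bedrock.handlers import (
                DefaultPromptHandler,
            )

            self.prompt_handler = DefaultPromptHandler(
                tokenizer="gpt2",
                model_max_length=model_max_length,
                max_length=self.max_length or 100,
            )

    def _prepare_prompt(self, prompt: str) -> str:
        # Hypothetical call site: only truncate when the handler was created.
        # The handler's actual truncation API is not shown in this issue, so it
        # is left out here; the point is the guard on self.prompt_handler.
        if self.truncate and self.prompt_handler is not None:
            pass  # truncate prompt via self.prompt_handler here
        return prompt

Any code path that currently assumes self.prompt_handler always exists would need the same guard.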