haystack-core-integrations
AmazonBedrockGenerator shouldn't load tokenizer if truncate is set to False.
Is your feature request related to a problem? Please describe.
When the truncate argument is set to False, DefaultPromptHandler still downloads an unnecessary GPT2 tokenizer, which is a problem on a read-only filesystem (e.g. AWS Lambda).
Check https://github.com/deepset-ai/haystack-core-integrations/blob/cf52ce94c9a1ae7f33bbde14c8639e2058262707/integrations/amazon_bedrock/src/haystack_integrations/components/generators/amazon_bedrock/generator.py#L148-L156
Describe the solution you'd like
If truncate is set to False, the GPT2 tokenizer used for token estimation shouldn't be loaded.
Describe alternatives you've considered
Simply add a condition so that the tokenizer is only loaded when truncate is set to True:
# Truncate prompt if prompt tokens > model_max_length - max_length
# (max_length is the length of the generated text)
# we use GPT2 tokenizer which will likely provide good token count approximation
if self.truncate:
    self.prompt_handler = DefaultPromptHandler(
        tokenizer="gpt2",
        model_max_length=model_max_length,
        max_length=self.max_length or 100,
    )
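For illustration, a minimal sketch of how the guarded construction could look in the generator's __init__, together with a guard at the point of use so an unset handler is never dereferenced. The surrounding class, the import path, and the _prepare_prompt helper are assumptions for the sketch; only the constructor arguments shown above come from the source:

# Illustrative stand-in for the relevant part of AmazonBedrockGenerator.__init__.
class BedrockGeneratorSketch:
    def __init__(self, truncate: bool = True, model_max_length: int = 4096, max_length: int = 100):
        self.truncate = truncate
        self.max_length = max_length
        # Leave the handler unset unless truncation is requested, so no tokenizer
        # files are downloaded on read-only filesystems (e.g. AWS Lambda).
        self.prompt_handler = None
        if self.truncate:
            # Import lazily so the tokenizer dependency is only touched when needed.
            # NOTE: import path is an assumption, not confirmed against the repo.
            from haystack_integrations.components.generators.amazon_bedrock.handlers import (
                DefaultPromptHandler,
            )

            self.prompt_handler = DefaultPromptHandler(
                tokenizer="gpt2",
                model_max_length=model_max_length,
                max_length=self.max_length or 100,
            )

    def _prepare_prompt(self, prompt: str) -> str:
        # Hypothetical call site: only truncate when the handler was created.
        # The handler's actual truncation API is not shown in this issue, so it
        # is left out here; the point is the guard on self.prompt_handler.
        if self.truncate and self.prompt_handler is not None:
            pass  # truncate prompt via self.prompt_handler here
        return prompt

Any code path that currently assumes self.prompt_handler always exists would need the same guard.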