feat: Amazon SageMaker expand inference input format
Related Issues
Thank you to the contributors who worked on #5155. One of the bullet points in the "Notes for the reviewer" section of #5155 is:

> The request format is tied to Hugging Face models hosted on SageMaker (we might in the future consider adding a customization option to allow any other, arbitrary model format)
I would like to discuss and work on how Haystack can support other inference inputs to models hosted on Amazon SageMaker.
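To make the problem concrete, here is a rough sketch of how the request payloads differ between model families. The field names are inferred from the examples later in this PR and from public API docs, so treat them as illustrative rather than authoritative:

```python
# Illustrative request payloads. Exact field names may differ per model
# version; these shapes are assumptions based on the examples in this PR.
hf_payload = {"inputs": "What is Haystack?", "parameters": {"max_new_tokens": 100}}

j2_payload = {"prompt": "Write a product description ...", "maxTokens": 100, "temperature": 0.9}

ca_payload = {"context": "The tower is 330 metres tall ...", "question": "How tall is the tower?"}

# A serializer hard-coded to the Hugging Face shape ("inputs"/"parameters")
# cannot produce the AI21 formats, which is why per-model (or pass-through)
# handling is needed.
for payload in (hf_payload, j2_payload, ca_payload):
    print(sorted(payload))
```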
Proposed Changes:
- Decouple model parameters from inference layer
- Add model parameters for AI21 Jurassic 2 Complete API
- Add model parameters for AI21 Contextual Answers API
- Add inference layer for AI21 Jurassic 2 Complete hosted on Amazon SageMaker
- Add inference layer for AI21 Contextual Answers hosted on Amazon SageMaker
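One way the first bullet (decoupling model parameters from the inference layer) could look is sketched below. All class and method names here are hypothetical, not the PR's actual code:

```python
from dataclasses import dataclass
from typing import Any, Dict

# Hypothetical sketch: model-specific parameters live in their own classes,
# so the SageMaker invocation layer itself stays model-agnostic.
@dataclass
class J2CompleteParams:
    maxTokens: int = 16          # AI21 Jurassic 2 naming (camelCase)
    temperature: float = 0.7
    numResults: int = 1

    def to_payload(self, prompt: str) -> Dict[str, Any]:
        return {"prompt": prompt, **self.__dict__}

@dataclass
class ContextualAnswersParams:
    def to_payload(self, question: str, context: str) -> Dict[str, Any]:
        return {"question": question, "context": context}

class SageMakerInvocationLayer:
    """Model-agnostic layer: it only serializes whatever payload the
    parameter object builds, then would send it to the endpoint."""
    def __init__(self, endpoint_name: str):
        self.endpoint_name = endpoint_name

    def build_request(self, params, **kwargs) -> Dict[str, Any]:
        return params.to_payload(**kwargs)

layer = SageMakerInvocationLayer("j2-mid")
req = layer.build_request(J2CompleteParams(maxTokens=100), prompt="Hello")
print(req["maxTokens"])  # 100
```

The point of the split is that adding a new model API only requires a new parameter class, not changes to the invocation layer.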
How to use it?
- Deploy AI21 Jurassic 2 Complete or AI21 Contextual Answers model with Amazon SageMaker JumpStart
- https://aws.amazon.com/blogs/machine-learning/use-proprietary-foundation-models-from-amazon-sagemaker-jumpstart-in-amazon-sagemaker-studio/
- https://github.com/AI21Labs/SageMaker
- Initialize and run PromptNode for the AI21 model running on Amazon SageMaker
```python
# Initialize the node using AI21 Jurassic 2 Complete with the endpoint name from Amazon SageMaker:
import os

from haystack.nodes import PromptNode

prompt_node = PromptNode(
    model_name_or_path="j2-mid",
    model_kwargs={
        "aws_access_key_id": os.getenv("AWS_ACCESS_KEY_ID"),
        "aws_secret_access_key": os.getenv("AWS_SECRET_ACCESS_KEY"),
        "aws_session_token": os.getenv("AWS_SESSION_TOKEN"),
        "aws_region_name": "us-east-1",
    },
)

prompt = """Write an engaging product description for a clothing eCommerce site. Make sure to include the following features in the description.
Product: Humor Men's Graphic T-Shirt.
Features:
- Soft cotton
- Short sleeve
- Has a print of Einstein's quote: "artificial intelligence is no match for natural stupidity"
Description:
"""

res = prompt_node(prompt, maxTokens=100, temperature=0.9, numResults=3)
```
or for AI21 Contextual Answers:
```python
# Initialize the node using AI21 Contextual Answers with the endpoint name from Amazon SageMaker:
prompt_node = PromptNode(
    model_name_or_path="contextual-answers",
    max_length=256,
    model_kwargs={
        "aws_access_key_id": os.getenv("AWS_ACCESS_KEY_ID"),
        "aws_secret_access_key": os.getenv("AWS_SECRET_ACCESS_KEY"),
        "aws_session_token": os.getenv("AWS_SESSION_TOKEN"),
        "aws_region_name": "us-east-1",
    },
)

context = "The tower is 330 metres (1,083 ft) tall,[6] about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest human-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure in the world to surpass both the 200-metre and 300-metre mark in height. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct."
question = "What is the height of the Eiffel tower?"
res = prompt_node(question, context=context, question=question)
```
How did you test it?
- Applied the same unit and integration tests as #5155
Notes for the reviewer
- Recreating the model parameters for every model is probably not the most scalable way to expand to different models. An alternative could be to not check the input in the inference layer at all, allowing compatibility with any model hosted on Amazon SageMaker. Let's discuss this and other potential designs. 😊
Checklist
- I have read the contributors guidelines and the code of conduct
- I have updated the related issue with new insights and changes
- I added unit tests and updated the docstrings
- I've used one of the conventional commit types for my PR title: `fix:`, `feat:`, `build:`, `chore:`, `ci:`, `docs:`, `style:`, `refactor:`, `perf:`, `test:`
- I documented my code
- I ran pre-commit hooks and fixed any issue
Hello, @malte-aws, and thanks for the contribution!
The SageMaker support has been heavily reworked and expanded in #5205.
As you can see, now there are several conflicts. Could you rebase your PR to the main?
Hi, @anakin87 I replayed the changes from main.
I still have to work on the documentation for this PR. Before we dive into the implementation details, though, could we please discuss the general design direction first?
Do you have thoughts on my "Notes for the reviewer"? Is adding invocation layers for different models the best design direction for models to which #5205 does not apply because they are not standardized HF models?
Pull Request Test Coverage Report for Build 5449022368
- 0 of 0 changed or added relevant lines in 0 files are covered.
- No unchanged relevant lines lost coverage.
- Overall coverage decreased (-0.04%) to 43.876%
| Totals | |
|---|---|
| Change from base Build 5444867374: | -0.04% |
| Covered Lines: | 10149 |
| Relevant Lines: | 23131 |
💛 - Coveralls
@malte-aws, thanks for these contributions. There seem to be several topics for discussion here. We could continue discussing here or arrange a quick sync call where @anakin87, you and I can debate options and agree on the best direction forward. Let us know what you would prefer.
Hi @vblagoje, Sounds good. I am available for a call today any time after 6 PM UTC or tomorrow after 3 PM UTC.
Yes, let's go for 3pm UTC tomorrow @malte-aws . What's the best way to send you the meeting details?
I have sent you an email.
Hi @anakin87,
How are you?
I started to implement the following changes that we discussed in the sync:
- Consolidate the invocation layers for AI21 models on Amazon SageMaker into one class, so that the algorithm that finds the correct invocation layer does not need to iterate through seven separate layers for the different AI21 model APIs.
- I also removed the parameter filter; the invocation layer now sends all the parameters a developer puts into the request straight to the inference endpoint.
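As a rough illustration of the two changes above (one consolidated class, no parameter filter), with all names hypothetical:

```python
# Hypothetical sketch of the consolidation described above: one invocation
# layer class recognises all AI21 endpoints, instead of seven separate
# layers the layer-selection algorithm would have to iterate through.
AI21_MODELS = {"j2-mid", "j2-ultra", "contextual-answers"}  # illustrative subset

class AI21SageMakerInvocationLayer:
    @classmethod
    def supports(cls, model_name_or_path: str) -> bool:
        # Single dispatch point for every AI21 model API.
        return model_name_or_path in AI21_MODELS

    def invoke(self, **kwargs):
        # No parameter filter: everything the caller passes goes to the endpoint.
        return dict(kwargs)

print(AI21SageMakerInvocationLayer.supports("j2-mid"))        # True
print(AI21SageMakerInvocationLayer.supports("gpt-3.5-turbo")) # False
```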
Could you please review briefly whether this implementation is the direction we want to take? If you give me the green light, I will finish and test the implementation for the other available AI21 models.
Kind regards Malte
Hello, @malte-aws,
@vblagoje and I took a look at the recent changes, and they seem to be going in the right direction.
(I'm still not 100% sure about some of the implementation details.)
But I think you can go ahead, add the tests, and then rely on the CI (mypy, existing tests...) and our future reviews to further validate the approach!
Closing for inactivity, please feel free to re-open if you resume this work.