dspy
Feature: Mistral AI API
I added support for Mistral AI's API. With this new feature, users can easily experiment with the following models: mistral-small-latest, mistral-medium-latest, and mistral-large-latest (a GPT-4 level model).
Here is an example of usage:
import dspy
lm = dspy.Mistral(api_key="your-mistralai-api-key")
dspy.settings.configure(lm=lm)
mod = dspy.Predict("question -> answer")
print(mod(question="Who is Emmanuel Macron?"))
It'd be great if you could also add an .md file to this folder: https://github.com/stanfordnlp/dspy/tree/main/docs/api/language_model_clients
Sure, I will do that! Thanks for the feedback!
Done @insop!
Thank you @fsndzomga. The Python code looks good to me; however, I am not so sure about the other files.
@fsndzomga can you change the link in the docs to api/language_model_clients/Mistral?
@krypticmouse, I am not sure I understand. Isn't that already the case?
@fsndzomga yes, but you just need to specify the path and remove the base URL from the link. So you can remove the https://dspy-docs.vercel.app part.
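For illustration, the suggested change would turn an absolute docs link into a relative one, e.g. (hypothetical link text):

```
[Mistral](https://dspy-docs.vercel.app/api/language_model_clients/Mistral)
```

becomes

```
[Mistral](/api/language_model_clients/Mistral)
```

This keeps the docs portable across deployment domains.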
Okay, I get it. Done! Thanks for the feedback.
Hi @fsndzomga , thanks for the addition!
We want to keep all external imports/dependencies local to the file requirements.
@arnavsinghvi11 @okhat, I took all the feedback into account. It should all be good now.
Tested; I'm not sure it really works. Mistral API models are chat-completion models, so they respond by restating the task when used with Predict (but not with ChainOfThought, which gives good results):
sentence = "it's a charming and often affecting journey." # example from the SST-2 dataset.
# 1) Declare with a signature.
classify = dspy.Predict('sentence -> sentiment')
# 2) Call with input argument(s).
response = classify(sentence=sentence)
# 3) Access the output.
print(response.sentiment)
"Sentence: it's a charming and often affecting journey.\nSentiment: Positive\n\nNote: The sentiment is inferred as positive based on the use of words such as 'charming' and 'affecting' which convey a positive emotion or experience. However, please note that sentiment analysis is not always 100% accurate and can be subjective."
Thanks for the feedback @axelpey. To be clear, you are saying that it works, but the format of the response is sometimes not what you expect, right? As you mentioned, Mistral models tend to change their behavior: what you describe did not happen over the past few weeks while I was testing. It also depends on the model you use; even the same model can respond in different formats. I tested your code and got this: "Sentiment: Positive", which is more concise.
Since DSPy does not handle response formats per se, one way to achieve your goal is to use TypedPredictors, or simply to use mistral-large, for example.
A global solution would be to add a custom parser specifically for Mistral AI, but I am not sure that is in line with the DSPy approach.
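As a rough illustration of what such a parser could do (this is a hypothetical helper, not part of DSPy or this PR), it might strip the restated task and any trailing notes, keeping only the value of the requested output field:

```python
import re

def extract_field(completion: str, field: str) -> str:
    """Extract the value of `field` (e.g. "Sentiment") from a verbose
    chat-model completion that may restate the task or append notes."""
    # Look for a line such as "Sentiment: Positive" anywhere in the text.
    match = re.search(rf"{re.escape(field)}:\s*(.+)", completion, re.IGNORECASE)
    if not match:
        # Fall back to the raw completion if the field label is absent.
        return completion.strip()
    # Keep only the first line of the value, dropping trailing commentary.
    return match.group(1).splitlines()[0].strip()

verbose = (
    "Sentence: it's a charming and often affecting journey.\n"
    "Sentiment: Positive\n\n"
    "Note: sentiment analysis is not always 100% accurate."
)
print(extract_field(verbose, "Sentiment"))  # Positive
```

This kind of post-processing is model-agnostic, which is perhaps why a Mistral-specific parser would not fit the DSPy approach.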
Hmm, which one did you use? I used medium. And I think this problem is not limited to Mistral; it applies to all models that are chat-completion models rather than standard completion models. I still think it's very relevant to have the Mistral API available in DSPy; we just need to be mindful of this. @okhat, I think you've seen these issues before with other ChatCompletion models too, right?
Maybe I am missing something, but we do use the chat completions API from OpenAI, right? Below is a screenshot of the gpt3.py file.
Anyway, if you have a better idea of how to handle it, let me know, or we can just drop this feature.
@fsndzomga @axelpey Thanks for the discussion on the limitation with ChatCompletion models. This is indeed something we've seen come up when running chat models with DSPy. There is more discussion in #662 and #420, as we have some ongoing work to handle this, but this PR is good to merge for now! Thanks @fsndzomga for the contributions!
Oh, so even the OpenAI integration has the same problem... I really wasn't aware of that. Thanks for sharing the links @arnavsinghvi11, and thanks for merging the PR!