Feature Request: Please also support Ollama models
I'd like to run it locally.
Thanks in advance,
Thanks for the suggestion! I agree this would be useful. However, support for new LLMs is not a priority at the moment: there's still a lot of core work to be done (leveraging the current LLM endpoints), and we are optimizing the system to work as well as possible with OpenAI models. I'll leave this issue open as a feature request to keep it on our radar for the time being.
How are we going to experiment with this when we have to pay just to tinker? This feels like it should be a priority.
It should be easy to implement with python-dotenv and a .env file with just two lines:
OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_API_KEY=fake-key
Ollama exposes an OpenAI-compatible API on the local machine, so the calls are simply directed there.
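A minimal sketch of that wiring, assuming Ollama's default port and a model such as llama3.2 already pulled locally (the load_dotenv call and the explicit client construction are just illustration, not existing project code):

# .env (the two lines above)
# OPENAI_BASE_URL=http://localhost:11434/v1
# OPENAI_API_KEY=fake-key

import os
from dotenv import load_dotenv  # pip install python-dotenv
from openai import OpenAI

load_dotenv()  # copies the .env entries into the process environment

# The OpenAI client reads both variables from the environment, so every
# call is transparently routed to the local Ollama server.
client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url=os.environ["OPENAI_BASE_URL"],
)
print(client.chat.completions.create(
    model="llama3.2",  # any model already fetched with `ollama pull`
    messages=[{"role": "user", "content": "Hello from a local model!"}],
).choices[0].message.content)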
Best regards,
It's a quick fix you can make yourself; I did the following.
Added this to the config file:
END_POINT = http://127.0.0.1:1234/v1/
API_KEY = lm-studio
and updated openai_utils.py (either in site-packages, or in the original file before installing) with:
self.client = OpenAI(api_key=config["OpenAI"]["API_KEY"], base_url=config["OpenAI"]["END_POINT"])  # was: api_key=os.getenv("OPENAI_API_KEY")
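Put together, the patched construction looks roughly like this; the standalone configparser loading below is only a sketch for context (openai_utils.py already has its own config object), and LM Studio does not actually validate the API key:

import configparser
from openai import OpenAI

config = configparser.ConfigParser()
config.read("config.ini")  # contains the END_POINT / API_KEY entries above

# Point the client at the local LM Studio server instead of api.openai.com.
client = OpenAI(
    api_key=config["OpenAI"]["API_KEY"],     # "lm-studio" (placeholder, not checked)
    base_url=config["OpenAI"]["END_POINT"],  # http://127.0.0.1:1234/v1/
)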
So far this has worked for the base examples, I am still testing a more complex example that uses all the features.
Edit: Using the simple chat example, the extractor does not seem to work and triggers an unknown error; I'm not sure of the cause. I might wait for official support, or if I have free time I might look into it. ¯\_(ツ)_/¯
Terminal: 2024-11-14 17:07:03,439 - tinytroupe - ERROR - openai_utils.py:216 - [1] Invalid request error, won't retry: Error code: 400 - {'error': '<LM Studio error> Unknown exception during inferencing.. Error Data: n/a, Additional Data: n/a'}
LMStudio server: 2024-11-14 17:07:03 [ERROR] Unknown exception during inferencing.. Error Data: n/a, Additional Data: n/a
Thx @Katzukum
It seems to me that when working with local LLMs you might want to comment out the _count_tokens function, since it's no longer necessary and can interfere with the models in LM Studio (llama 3.2 works fine, but I had a problem with aya-expanse).
Also, if you're using a resource-intensive model, consider setting TIMEOUT=100 or more in config.ini!
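For reference, TIMEOUT is just a value in config.ini, and if you'd rather not delete _count_tokens outright, a softer variant is to swallow tokenizer failures. This is only a sketch; the real helper's signature and tiktoken usage in openai_utils.py may differ:

# config.ini -- give heavyweight local models more breathing room
# TIMEOUT=100

# openai_utils.py -- hypothetical replacement instead of commenting the helper out
import tiktoken

def _count_tokens(messages, model):
    try:
        encoding = tiktoken.encoding_for_model(model)
        return sum(len(encoding.encode(str(m.get("content", "")))) for m in messages)
    except Exception:
        # Local model names (e.g. aya-expanse) are unknown to tiktoken;
        # returning None lets the caller skip the token-count check.
        return None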
Update: we actually have an ongoing PR proposal that would enable this: https://github.com/microsoft/TinyTroupe/pull/47
As these are complex changes, it might take some time to review, but the fact that someone already did a lot of heavy lifting here helps, and I'll try to review it with care.
See https://github.com/microsoft/TinyTroupe/discussions/99
TL;DR
import os
os.environ["OPENAI_API_KEY"] = "ollama"  # Ollama doesn't check the key; the OpenAI client just needs one set
os.environ["OPENAI_BASE_URL"] = "http://localhost:11434/v1"  # Ollama's OpenAI-compatible endpoint
Great to hear, thank you; I'll try it out.
Folks, I finally reviewed the PR, tweaked it a bit and merged. I've put it in the development branch. It is very experimental though. I tried running with gemma3:1b (which is the best I could fit in my overworked machine) and while it did run, the inferences were not great. So further tweaking seems necessary to make it useful. But at least we can call the Ollama local endpoint now and get the completion.
Thanks everyone. If you want to tweak things from there, please share your findings.