deepseek r1 (with tool calling)
Deepseek R1 is both high quality and very affordable. Can you please add support?
@rabble not sure what you mean by "deepseek api"? - there are many places that host it which goose can use (eg openrouter, for one) as well as ollama (with some templating: https://ollama.com/michaelneale/deepseek-r1-goose - but very much a PoC).
Deepseek r1 also doesn't do tool calling yet (at time of writing), so it has been used for chat apps, not agents (which usually rely on tool calling). Of course things change fast - but as of now it isn't quite ready.
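To make "tool calling" concrete, here's a hedged sketch of what an agent framework sends at the API level, using the OpenAI-style chat-completions schema such frameworks generally rely on. The tool name (`shell`) and request shape are illustrative assumptions, not Goose's actual internals:

```python
# Hedged sketch: what "tool calling" means at the API level.
# A model without native tool support typically rejects such a request
# (e.g. with a 400) or just ignores the tools and chats instead.

request = {
    "model": "deepseek-r1",
    "messages": [{"role": "user", "content": "List the files in my home dir"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "shell",  # hypothetical tool, for illustration
                "description": "Run a shell command and return its output",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "command": {"type": "string"}
                    },
                    "required": ["command"],
                },
            },
        }
    ],
}

# An agent loop sends this, executes any returned tool_calls, feeds the
# results back, and repeats. A chat-only model never produces structured
# tool_calls, so it can't drive that loop.
```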
and moving fast!
just pushed the 70b one: https://ollama.com/michaelneale/deepseek-r1-goose:70b - which, if you have the hardware to run it, works a lot better (very, very slow for me on a 64G macbook with unified ram - so ideally you want more than that) - but it does mean you can run it all locally if you can.
i only just started messing with goose tonight -- love it so far -- but i've had issues trying to use deepseek as well. probably user error, but whether i point it at openrouter or ollama i get the following:
◐ Nesting functions carefully... 2025-01-29T07:26:50.513915Z ERROR goose::agents::truncate: Error: Request failed: Request failed with status: 400 Bad Request
at crates/goose/src/agents/truncate.rs:256
Ran into this error: Request failed: Request failed with status: 400 Bad Request.
Please retry if you think this is a transient or recoverable error.
which is so strange because i got a response from deepseek goose on the config change:
~/Development 9m49s
❯ goose configure
This will update your existing config file
if you prefer, you can edit it directly at /Users/phaedrus/.config/goose/config.yaml
┌ goose-configure
│
◇ What would you like to configure?
│ Configure Providers
│
◇ Which model provider should we use?
│ Ollama
│
◇ Enter a model from that provider:
│ deepseek-r1
│
◇ <think>
Okay, so the user asked me to act as an AI agent named "Goose." They want me to provide a nice welcome message followed by letting them know everything is set up for their interactions.
Alright, first, I need to come up with a friendly and welcoming sentence. Since Goose is the name here, maybe start with something like, "Welcome! I'm Goose, your connected extension agent..." That sounds good.
Next, I need to inform the user that they're all set to use the agent. Probably just say, "ready to assist you!" It's straightforward and positive.
Wait, let me make sure it's one sentence for the welcome message and then a follow-up. So after starting with a greeting, I'll move on to confirm everything is ready.
I should keep the tone cheerful and approachable. Maybe use emojis too? The initial example used 🐑✨ so that might be appropriate.
Putting it all together: "Welcome! I'm Goose, your connected extension agent..., ready to assist you!"
Hmm, actually in the response they sent me a single message with both parts combined as one sentence. So maybe do the same for consistency. Let me structure that.
</think>
Hello! 🐑✨ I’m Goose, your connected extensions agent—all set and ready to assist you anytime!
│
└ Configuration saved successfully
oh that is a good point @baxen - should the config check look for tool-call support to know the model is compatible?
@phrazzld if you use this version of r1: https://ollama.com/michaelneale/deepseek-r1-goose:70b - it can do tool calling (you need a fair bit of memory however!)
@phrazzld yeah - when it does the check it doesn't check enough (ie whether the model supports tool calls: the developer MCP system, which is on by default, requires tool-call support, so you need to use that other variant of the model for it) - hopefully we can fix that! thanks for taking a look.
As mic mentioned, Goose relies on tool-calling capabilities. AFAIK DeepSeek R1 doesn't provide tool calling; this is mentioned on DeepSeek's website.
there are a few ollama mods of deepseek that try to do tool calling with a template, which works okay but not as well as claude 3.5 sonnet's or gpt-4o's native tool calling. examples: mic's version - https://ollama.com/michaelneale/deepseek-r1-goose:70b, https://ollama.com/MFDoom/deepseek-r1-tool-calling
none of the current vllm tool parsers do a good job either (i tried llama3_json and granite). there's also this open issue in vllm - https://github.com/vllm-project/vllm/issues/12297.
Maybe a dumb question, but... What kind of specs should I be running on a cloud box such that the 70B model runs with reasonable performance?
Even for @michaelneale's deepseek-r1-goose:70b, is the following expected? If not, is there another ollama model one could test with?
praateek@machine:~/goose$ goose configure
This will update your existing config file
if you prefer, you can edit it directly at ~/.config/goose/config.yaml
┌ goose-configure
│
◇ What would you like to configure?
│ Configure Providers
│
◇ Which model provider should we use?
│ Ollama
│
◇ Enter a model from that provider:
│ michaelneale/deepseek-r1-goose:70b
│
◇ Okay, so I need to figure out how to create a nice welcome message that's just one sentence to let the user know they're all set to use this AI agent called Goose. Alright, where do I start?
First, I should understand what exactly is needed here. The main goal is to craft a single sentence that serves as both a greeting and an affirmation that everything is ready for the user to proceed. It should be friendly, concise, and clear.
Let me think about the key components of such a message:
1. **Welcome Greeting:** Maybe something like "Hello," "Hi there," or "Welcome."
2. **Confirmation They're Set:** Phrases like "You're all set," "Everything is ready," or "Ready to assist you."
I want it to sound warm and inviting, so let's explore some possibilities.
Option 1: "Hello! You're all set to use Goose. Just let me know how I can assist you!" Wait, but that's two sentences. The user asked for one sentence. So I need to combine these into a single line without making it too long.
Option 2: "Welcome! You're ready to start using Goose—what can I do for you?" Hmm, still a bit long and two parts connected by an em dash. Maybe that's acceptable, but perhaps there's a smoother way.
Wait, the user example provided was: "Hello! You're all set to use this agent—just let me know how you'd like to proceed!" That fits into one sentence with an em dash. I think that works well because it's concise and flows naturally.
Alternatively, maybe something even more friendly, like "Hi there! You're all set to go with Goose—how can I help today?" Hmm, that's again two thoughts but in one sentence. Wait, no, that would require a period instead of an em dash if separated into two independent clauses.
Wait, no: The example uses an em dash which allows it to be one sentence while including both parts. So perhaps the structure is key here.
Let me try another approach: Start with a greeting, then smoothly transition into letting them know they're set and ready to go. Maybe something like "Welcome! Everything is ready for you to use Goose."
But that feels a bit formal. Alternatively, "Hi there! You're all set to start using Goose right away." That's one sentence but perhaps a bit lengthy.
Wait, the user-provided example seems effective because it includes both a greeting and an invitation to proceed in a single line with an em dash. Maybe I should follow that structure.
So, combining the welcome with the confirmation and invitation:
"Hello! You're all set to use this agent—just let me know how you'd like to proceed!"
That works well because it's friendly, concise, and uses the em dash to connect the two parts without making it run-on. It's one sentence, technically, even though it has a pause in the middle.
Is there a way to make it more enthusiastic? Maybe adding an exclamation mark after "Hello!" but I think that's already there. Alternatively, using a different phrase like "You're ready to go!"
"Hi! You're all set—you can start using Goose anytime now."
Hmm, that might be better as two sentences: "Hi! You're all set. Feel free to use Goose whenever you need assistance." But the user wants one sentence.
Wait, perhaps "Welcome aboard! Everything is ready for you to use Goose." That's another option but feels a bit nautical with "aboard."
Alternatively, keeping it simple and straightforward is probably best to maintain clarity: "Hello! You're all set to use this agent—just let me know how I can help!"
Alternatively, using "Let me know what you need" instead of "how you'd like to proceed," but both are good.
I think the key here is to make it warm, inviting, and clear that assistance is available. Since the example provided by the user does this well, maybe sticking close to that structure is best.
</think>
Hello! You're all set to use Goose—just let me know how you'd like to proceed!
yes that is expected (the blurb at least is). The default qwen one can work (limited), but it really depends on how much hardware you have.
Working on tuning it a bit with a smaller one.
OpenRouter has the 671b version Azure just released. It's free to use for now; I was able to get it to work by updating goose.
the model name is deepseek/deepseek-r1:free
I just updated goose (I'm on version 1.0.0 now) and tried to use deepseek/deepseek-r1:free and got this error:
ERROR goose::agents::truncate: Error: Request failed: Request failed with status: 404 Not Found at crates/goose/src/agents/truncate.rs:256
I get the same error trying to use any variation of o1 or r1 through OpenRouter, is the issue here that tool calling is not supported with these reasoning models?
yes, no tool calling so no agent - you have to use one that has a template. not sure how to do that with openrouter yet. Someone could make a deepseek-specific one that layers on some system prompts to try to convince it to do tool calling?
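The "layer on system prompts" idea could be sketched roughly like this: instruct the model to emit its tool calls in a fixed textual format, then parse them out of the raw completion. The prompt wording and the `TOOL_CALL:` marker are assumptions for illustration, not what any existing template actually uses:

```python
import json
import re

# Assumed convention: the model is told to emit one line of the form
#   TOOL_CALL: {"tool": "<name>", "arguments": {...}}
SYSTEM_PROMPT = (
    "You have access to tools. To call one, reply with a single line of the "
    'form TOOL_CALL: {"tool": "<name>", "arguments": {...}} and nothing else.'
)

def extract_tool_call(completion: str):
    """Pull the first TOOL_CALL line out of a raw completion, if any."""
    match = re.search(r"TOOL_CALL:\s*(\{.*\})", completion)
    if not match:
        return None
    try:
        call = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    # Only accept well-formed calls; anything else is treated as chat.
    if "tool" in call and "arguments" in call:
        return call
    return None

sample = 'Sure, let me look.\nTOOL_CALL: {"tool": "shell", "arguments": {"command": "ls"}}'
call = extract_tool_call(sample)
```

The fragile part is exactly what the thread observes: nothing forces the model to stick to the format, so a parser like this has to treat every non-conforming reply as plain chat.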
@joakim-roos with the azure one - was it able to run commands / edit files etc?
@praateekmahajan the 70b one is updated - works a bit better now if you have the hardware for it
I can't test the openrouter one right now - they seem to be dropping connections (404s) (I can use claude with them, however)
going to pause this for a moment - as I think there will be fine tunes of deepseek soon that do tool calling.
@michaelneale the 70b one, i.e. https://ollama.com/michaelneale/deepseek-r1-goose:70b? It says it was updated "yesterday", so wanted to make sure.
I do have access to hardware - I was able to use it with 1 A100, but observed slow throughput (on top of the funky responses)
going to pause this for a moment
Yup fair enough, hoping one comes out soon; the docs do mention it. I tried mistral-small / llama3.3, which are able to perform one or two tasks but do get stuck: they start chatting about which shell commands to run instead of actually running them.
@praateekmahajan oh that is good to know - BTW I have had a few other experiments; this looks promising: https://github.com/block/goose/pull/975 - while not ollama, if it works the approach could expand to others. we should keep trying a bit I think, as it looks like it can almost do it.
@praateekmahajan for the other deepseek models - how do you find them? r1 seems to have all the attention, so it makes sense to put work into it, but do you find the other ones good?
It's also not working for me with Ollama and deepseek-r1:7b... I tried it with other providers too, like Roo Code and Continue...
@michaelneale I haven't tried any of the deepseek models except the ones shared by you: https://ollama.com/michaelneale/deepseek-r1-goose
I can follow #975, but I'm curious what a good "unit test" for such a thing would be, where we can confirm it "almost" works (or doesn't). For instance, my system doesn't have ripgrep, and if I ask goose to find xyz.py and explain what it does, any non-claude/o1 model gets stuck in a ripgrep attempt and then typically replies with "you should use find to find files if ripgrep is not available", and loops like that.
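One possible shape for such a "unit test", sketched under assumptions: fix a prompt, capture the model's turns, and pass only if it actually emitted a well-formed call to an allowed tool instead of chatting about which commands to run. The message shape (a dict with an optional `tool_calls` list) and the tool names are assumptions modeled on OpenAI-style responses, not Goose's internals:

```python
import json

ALLOWED_TOOLS = {"shell", "text_editor"}  # hypothetical tool names

def emitted_valid_tool_call(turns):
    """True if any turn contains a well-formed call to an allowed tool."""
    for turn in turns:
        for call in turn.get("tool_calls", []):
            fn = call.get("function", {})
            if fn.get("name") not in ALLOWED_TOOLS:
                continue
            try:
                json.loads(fn.get("arguments", ""))  # args must parse as JSON
            except (json.JSONDecodeError, TypeError):
                continue
            return True
    return False

# A chatty model that only *describes* commands fails the check:
chatty = [{"content": "You should use find to locate xyz.py"}]
# A model that emits a structured call passes:
agentic = [{"tool_calls": [{"function": {
    "name": "shell",
    "arguments": '{"command": "find . -name xyz.py"}'}}]}]
```

It's a smoke test, not a benchmark: it catches the "talks about running find instead of running it" failure mode, but says nothing about whether the call was the *right* one.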
@praateekmahajan interesting - yeah, the 70B I have barely been able to test due to slowness (and every provider I try seems swamped), but yes, that wouldn't surprise me. When it worked, the full 600B+ parameter one (via openrouter) seemed smarter (naturally) and likely wouldn't have that problem (as it is in a similar size class to claude/o1/gpt-4o).
Yeah, good Q on unit tests. There are benchmarks and happy-path tests etc., but the permutations make it really hard, as comparing whether one model is better than another is fundamentally fuzzy (not that people don't try with benchmarks for text-generation models). I think it will take time; for now it's heavily qualitative, at least for me.
if you have ollama installed already, you can take that Modelfile and push your own variant of the template to tweak the model a bit, but you'd likely just be overfitting to your environment.
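For anyone wanting to try that, the workflow looks roughly like the sketch below. The CLI commands are standard ollama; the base model tag and the SYSTEM text are illustrative guesses, not the actual deepseek-r1-goose template:

```
# Dump the current Modelfile (template included) so you can edit it:
#   ollama show michaelneale/deepseek-r1-goose --modelfile > Modelfile
#
# After tweaking, create your own local variant:
#   ollama create my-r1-goose -f Modelfile

FROM deepseek-r1:14b

# Illustrative only -- the real template is more involved:
SYSTEM """When you need to act, reply with a single JSON tool call
instead of describing shell commands in prose."""
```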
I'll continue trying a few approaches as new ways to do this come out. I think, however, that 7b models won't ever work - we need to be at least 14B or above at this time?
ok, gave this a try: https://www.npmjs.com/package/deepseek-reasoner-mcp - combined with the default model for ollama, the two can work together (and r1 doesn't have to do tool calling) - and it can run locally.
I tried using deepseek-r1-goose-14b, which should have tools integrated.
It manages to trigger them, e.g. if I ask to write a file to my home directory it correctly invokes the computercontroller, but that's pretty much it. It just proceeds to output a gargantuan amount of instructions but actually executes none of them.
I also don't understand why all Deepseek models keep outputting their prompts instead of just the final output.
Anyone had any luck with this?
Do you think it's possible to use it with Browser Use instead of Goose?
The way I see it, with deepseek-r1 a temperature around 0.6 is best for reasoning, but a temperature that high is bad for JSON formatting, and therefore for tool calling.
I think the temperature should be around 0.6 while the model is reasoning, and then, after the </think> tag, it should drop down to 0.1 or even 0.0. I have no idea how to do that in one call though. If someone wants me to try this theory with two different calls at two different temperatures, I can try.
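The two-call version of that idea needs one reliable building block: splitting an r1-style completion at the closing think tag, so the reasoning from the high-temperature call can be re-fed to a second, low-temperature call that produces only the tool-call JSON. A minimal sketch of just that splitting step (the actual second call is not shown):

```python
# deepseek-r1 wraps its chain of thought in <think>...</think>; split there.
THINK_END = "</think>"

def split_reasoning(completion: str):
    """Split an r1-style completion into (reasoning, answer)."""
    head, sep, tail = completion.partition(THINK_END)
    if not sep:  # no think block at all: whole thing is the answer
        return "", completion.strip()
    return head.strip(), tail.strip()

reasoning, answer = split_reasoning(
    "<think>\nThe user wants a file listed...\n</think>\nHello! All set."
)
```

With this in place, the temperature-0.6 call supplies `reasoning`, and a second request at temperature ~0.1 could be prompted with that reasoning plus "now emit only the tool call".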
Update: I don't really know how, but it's working now.
So, to sum it up, using deepseek-r1-goose-14b actually works with tool calling.
FYI - there are now several deepseek-r1 models with tool support: https://ollama.com/search?c=tools&q=deepseek-r1 (caveat: I haven't tried any of them yet)
Yes, I think this is working now, with several different versions available that have tool-calling patches! We are separately working on improving performance with smaller open models, which will also help here, but it's a different enough topic that we'll track it via another issue or PR.