
Implement Local AI Integration with Ollama for Offline AI Assistance

Open ThalesAugusto0 opened this issue 1 year ago • 9 comments

Description: We need to enhance Cursor IDE by implementing support for local AI models using Ollama, similar to the Continue extension for VS Code. This will enable developers to use AI-powered code assistance offline, ensuring privacy and reducing dependency on external APIs.

1. Ollama Integration:

  • Integrate Ollama into the Cursor IDE to run AI models locally. This should include the ability to configure the path to Ollama and select specific models for different coding tasks.

  • Ensure that the local AI models are available for features like code completion, refactoring, and contextual code understanding.

2. Model Support:

  • Provide compatibility with a range of models available through Ollama, such as Llama 3 and StarCoder 2, which offer support for Fill-in-the-Middle (FIM) predictions and embeddings.

  • Allow users to customize which models are used for specific tasks (e.g., code completion, embedding generation).

3. Configuration Options:

  • Add options in Cursor's settings to configure and manage local AI models, including the ability to switch between different AI providers, such as Ollama and any cloud-based alternatives.

  • Implement a configuration UI that allows users to easily select and manage their local AI setups.

4. Performance and Usability:

  • Optimize the interaction between Cursor and the local AI models to minimize latency and resource usage.

  • Ensure that the local AI features are as seamless and user-friendly as their cloud-based counterparts, with clear feedback on model performance and any potential issues.
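For example, the local side of such a setup with today's Ollama CLI might look like this (the model tags are illustrative, not a prescribed list):

```bash
# Assumes Ollama is installed (https://ollama.com).
ollama pull llama3            # general chat / refactoring
ollama pull starcoder2:3b     # FIM-capable completion model
ollama pull nomic-embed-text  # embeddings for codebase indexing

# Start the local server; it listens on http://localhost:11434 by default.
ollama serve
```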

ThalesAugusto0 avatar Sep 04 '24 00:09 ThalesAugusto0

Considering the size and effectiveness of the local model and the commercialization of the Cursor product, the likelihood of this proposal coming to fruition is quite small. 😂

Air1996 avatar Sep 05 '24 02:09 Air1996

> Considering the size and effectiveness of the local model and the commercialization of the Cursor product, the likelihood of this proposal coming to fruition is quite small. 😂

The company doesn't need to do this. Since the code is open, why can't we, the development community, do it?

ThalesAugusto0 avatar Sep 05 '24 21:09 ThalesAugusto0

Cursor is not open source. This is an issues-only repo.

vertis avatar Sep 21 '24 09:09 vertis

Upping this anyway. The company can still monetize with the thousands of devs that do not have a powerful GPU.

tcsenpai avatar Sep 24 '24 10:09 tcsenpai

Check this: https://github.com/getcursor/cursor/issues/1380#issuecomment-2371534354

Mateleo avatar Sep 24 '24 14:09 Mateleo

> Check this: #1380 (comment)

The devs broke that as well (likely on purpose); they're in it for the money and don't care about you and me.

spergware avatar Oct 03 '24 09:10 spergware

> > Check this: #1380 (comment)
>
> The devs broke that as well (likely on purpose); they're in it for the money and don't care about you and me.

For me it's working perfectly fine, using Ollama + ngrok. I use the latest version of Cursor.

Mateleo avatar Oct 03 '24 09:10 Mateleo

> > Check this: #1380 (comment)
>
> The devs broke that as well (likely on purpose); they're in it for the money and don't care about you and me.

I doubt the devs would do something as easily visible as breaking support specifically for Ollama (we're using an OpenAI-style endpoint here anyway, so it's pretty generic). In any case, the loss of quality from using 8B models this way is not worth saving 20 bucks a month. They are not in danger.

tcsenpai avatar Oct 03 '24 10:10 tcsenpai

> > > Check this: #1380 (comment)
> >
> > The devs broke that as well (likely on purpose); they're in it for the money and don't care about you and me.
>
> For me it's working perfectly fine, using Ollama + ngrok. I use the latest version of Cursor.

I applied your workaround properly, but I keep getting error 403 from ngrok like many other people. Do I need to forward some port or something?
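Two mitigations commonly suggested for that 403, assuming it comes from Ollama rejecting the forwarded Host/Origin headers rather than from ngrok itself (if ngrok's own free-tier policies are the cause, these won't help):

```bash
# 1. Allow requests from any origin when starting Ollama.
OLLAMA_ORIGINS="*" ollama serve

# 2. Have ngrok rewrite the Host header so Ollama sees a local origin.
ngrok http 11434 --host-header="localhost:11434"
```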

spergware avatar Oct 03 '24 11:10 spergware

Hi team, Qwen 2.5 Coder 32B is here and it's rad. Can anybody give us hope regarding implementing Ollama support? Frankly, it's game-changing/deal-breaking for many, including me. I found 7 open issues about implementing local LLMs in this repo.

Thanks! 🙏🙏🙏

astr0gator avatar Nov 14 '24 01:11 astr0gator

> Hi team, Qwen 2.5 Coder 32B is here and it's rad. Can anybody give us hope regarding implementing Ollama support? Frankly, it's game-changing/deal-breaking for many, including me. I found 7 open issues about implementing local LLMs in this repo.
>
> Thanks! 🙏🙏🙏

It works fine for me. Just use ngrok and point it at your Ollama IP; for example, mine runs on my network at 192.168.1.3:1147, so I point ngrok there on that same computer.

In my Cursor installation, under Cursor settings, I turn off all other models, select the custom OpenAI endpoint, throw in a random API key, and set the OpenAI base URL to

<ngrokurl>/v1

Here's a screenshot of my setup: [image]

I'm also on Windows on both boxes. What I'm trying to figure out, honestly, is how to ensure I'm using the maximum context; Ollama has a nonstandard API for it.
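In outline, that workaround amounts to putting ngrok in front of Ollama and pointing Cursor's custom OpenAI base URL at the tunnel; the Modelfile step at the end is one possible answer to the context-window question, since Ollama's OpenAI-compatible endpoint doesn't expose num_ctx directly (port and model tags are just the ones used in this thread):

```bash
# Tunnel Ollama's default port through ngrok (substitute your own host:port,
# e.g. 192.168.1.3:1147 in the setup described above).
ngrok http 11434

# In Cursor: disable the built-in models, enable the custom OpenAI endpoint,
# enter any non-empty API key, and set the base URL to:
#   https://<your-ngrok-subdomain>.ngrok-free.app/v1

# Optional: bake a larger context window into a model variant, since the
# OpenAI-style endpoint has no per-request num_ctx option.
cat > Modelfile <<'EOF'
FROM qwen2.5-coder:14b
PARAMETER num_ctx 16384
EOF
ollama create qwen2.5-coder-16k -f Modelfile
```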

loktar00 avatar Nov 14 '24 20:11 loktar00

> It works fine for me. Just use ngrok and point it at your Ollama IP […] set the OpenAI base URL to <ngrokurl>/v1

[image]

I am unable to get it to work.

I disabled all the other models so it would authenticate my Qwen model when I clicked "Verify".

I then see it try to verify and fail. Cursor asks me to run `curl https://[MY_HASH].ngrok-free.app/v1/chat/completions -H "Content-Type: application/json" -H "Authorization: Bearer [AUTH_TOKEN]" -d '{ "messages": [ { "role": "system", "content": "You are a test assistant." }, { "role": "user", "content": "Testing. Just say hi and nothing else." } ], "model": "qwen2.5-coder:14b" }'`.

I do this and I receive an HTML response back from ngrok with the following HTML body: […]

I see `10:38:29.576 CET OPTIONS /v1/chat/completions 403 Forbidden` in my ngrok console. (After all the updates I was never able to get ngrok working; it seemed to simply block all my requests, and since I want to connect to it locally anyway, I focused on that.)

Update: After taking a look at the Ollama API docs, it seems that the OPTIONS /chat/completions request is not compatible with the POST /api/chat Ollama endpoint.
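For reference, Ollama's native chat endpoint has its own request shape and is separate from the OpenAI-compatible layer; a quick local sketch (model tag as used above, and the OpenAI-shaped /v1/chat/completions request appears in Update v3 below):

```bash
# Ollama's native API: POST /api/chat with Ollama's own JSON shape.
# It is not interchangeable with OpenAI's /v1/chat/completions.
curl http://localhost:11434/api/chat -d '{
  "model": "qwen2.5-coder:14b",
  "messages": [{"role": "user", "content": "Say hi."}],
  "stream": false
}'
```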

Update v2: I am able to get it to work in the console but not with the OpenAI API layout

[image]

Update v3: According to the Ollama OpenAI compatibility blog you can enable the OpenAI API schema by using /v1 at the end of your URL. I did this and it WORKS in my console, but Cursor throws an error and shows me a curl command that works in my console.

```
> curl http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -H "Authorization: Bearer 123" -d '{
  "messages": [
    {
      "role": "system",
      "content": "You are a test assistant."
    },
    {
      "role": "user",
      "content": "Testing. Just say hi and nothing else."
    }
  ],
  "model": "qwen2.5-coder:14b"
}'
{"id":"chatcmpl-821","object":"chat.completion","created":1732011716,"model":"qwen2.5-coder:14b","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"hi"},"finish_reason":"stop"}],"usage":{"prompt_tokens":28,"completion_tokens":2,"total_tokens":30}}
```

Cursor will not accept this, though, and keeps saying it is an invalid model and my API key doesn't support it. The Continue.dev extension works like plug and play after I edited the config.json file and simply pointed it to Qwen.
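For comparison, the Continue setup mentioned above boils down to one entry in its config.json; a sketch of the relevant fragment, to be merged into ~/.continue/config.json rather than replacing the file (field names follow Continue's documented model config):

```bash
# "provider": "ollama" is Continue's built-in Ollama integration;
# no tunnel or API key is needed for a local server.
cat <<'EOF'
{
  "models": [
    {
      "title": "Qwen 2.5 Coder 14B (local)",
      "provider": "ollama",
      "model": "qwen2.5-coder:14b"
    }
  ]
}
EOF
```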

diadras avatar Nov 19 '24 09:11 diadras

> I am unable to get it to work. […] Cursor will not accept this, though, and keeps saying it is an invalid model and my API key doesn't support it. The Continue.dev extension works like plug and play after I edited the config.json file and simply pointed it to Qwen.

Hmm, can you share your URL setting (an image of your settings)? If you check my image/description, I'm mapping ngrok to my ip:port and using the following URL within Cursor:

https://ngrokwhatever.com/v1

loktar00 avatar Nov 20 '24 02:11 loktar00

Why is ngrok needed? If I point to some server IP, I get the same errors (the curl command succeeds, but it fails to get the model).
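One likely reason, though this is an assumption about Cursor's architecture rather than something confirmed in this thread: custom-endpoint requests appear to be sent from Cursor's backend rather than from your machine, so the URL has to be reachable from the public internet; a LAN IP or localhost satisfies curl on your own box but not Cursor's servers. The Ollama side of a direct LAN setup would still look roughly like this:

```bash
# Make Ollama listen on all interfaces instead of only localhost
# (reachable from other machines on your network, but not from Cursor's
# servers unless you also expose it publicly, e.g. via ngrok).
OLLAMA_HOST=0.0.0.0:11434 OLLAMA_ORIGINS="*" ollama serve

# Sanity check from another machine on the LAN (IP is an example from above):
curl http://192.168.1.3:11434/api/tags
```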

barshag avatar Dec 07 '24 00:12 barshag

@loktar00 Is it still working for you? I just tried and got past verification, but then it won't even make any requests; see https://github.com/getcursor/cursor/issues/2520

arty-hlr avatar Jan 07 '25 17:01 arty-hlr

I think ngrok is unsafe. Is it possible to point directly to a specific server IP?

songqiangchina avatar Jan 09 '25 09:01 songqiangchina

Still not possible

arty-hlr avatar Jan 09 '25 09:01 arty-hlr

> @loktar00 Is it still working for you? I just tried and got past verification, but then it won't even make any requests; see #2520

I'll have to check; I switched back to just using Claude since my Qwen machine is dedicated to other tasks.

loktar00 avatar Jan 13 '25 04:01 loktar00