frob


What does the following return:

```
curl localhost:11434
```

```
Ollama is running
```

ollama is working. If your app doesn't work, that's a problem with the app or your proxy configuration. Try setting:

```
os.environ["no_proxy"] = "127.0.0.1,localhost"
```
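For instance, a minimal Python sketch, assuming your app uses `requests` or another client that honors the proxy environment variables:

```python
import os
import requests  # assumption: your app uses requests or similar

# Exempt local addresses from any proxy; must be set before the request is made
os.environ["no_proxy"] = "127.0.0.1,localhost"

# Same check as `curl localhost:11434`; should print "Ollama is running"
print(requests.get("http://127.0.0.1:11434").text)
```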

Then you need to figure out why your proxy is not routing traffic destined for 127.0.0.1:11434 to 127.0.0.1:11434. ollama doesn't need a proxy; if your app does, then it's not an...
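One way to isolate the problem, sketched with `requests` (an assumption about your client; `trust_env = False` makes a session ignore the proxy environment variables):

```python
import requests

# Direct connection, ignoring HTTP_PROXY/HTTPS_PROXY/NO_PROXY entirely
direct = requests.Session()
direct.trust_env = False
print("direct:", direct.get("http://127.0.0.1:11434").text)  # expect "Ollama is running"

# Connection that honors the environment's proxy settings, like your app does
print("via proxy env:", requests.get("http://127.0.0.1:11434").status_code)
# If "direct" works and this one fails, the proxy configuration is the culprit.
```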

> Does Ollama require users to configure a proxy?

No. If you do have a proxy, you need to configure it to allow clients to connect to the ollama port.

You have no GPU accelerator, and the KunPeng 920 apparently doesn't implement the ARM matrix extensions (SME), so you are relying on brute-force CPU. For LLM inference workloads, it's just...

> Suggestions for how to increase token generation?

Run with GPU acceleration or faster hardware.

You can try running other inference engines on the KunPeng and see if they perform better, and even run them on different hardware platforms as a comparison. That might give...

If your input tokens + output tokens > `num_ctx`, the model will fail due to k-shift. So if you want to use longer prompts (multiple Q&A), you need to increase...
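For example, a sketch of raising `num_ctx` per request through Ollama's REST API (the model name and values here are placeholders):

```python
import requests

resp = requests.post(
    "http://127.0.0.1:11434/api/generate",
    json={
        "model": "llama3",             # placeholder: any model you have pulled
        "prompt": "...your long multi-turn prompt...",
        "stream": False,
        "options": {"num_ctx": 8192},  # must cover input tokens + output tokens
    },
)
print(resp.json()["response"])
```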

Increase `num_ctx` so that it's big enough to hold the input tokens and the output tokens. You still have to control the user's input.
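A rough sketch of that control, assuming a ~4-characters-per-token estimate (a real app would count tokens with the model's own tokenizer):

```python
NUM_CTX = 8192            # must match the num_ctx you run the model with
MAX_OUTPUT_TOKENS = 1024  # headroom reserved for the model's reply

def fits_context(prompt: str) -> bool:
    """Reject prompts that would push input + output past num_ctx."""
    est_prompt_tokens = len(prompt) // 4  # crude heuristic, not a tokenizer
    return est_prompt_tokens + MAX_OUTPUT_TOKENS <= NUM_CTX
```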