Michael Yang
I'm not sure what you mean. The pull example uses streaming to receive progress updates. Each response returns the progress (current and total) in bytes for the current layer being...
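A minimal sketch of how a client might render that streamed progress. The field names `completed` and `total` match ollama's streaming pull responses; the sample data below is made up for illustration, not a real server reply.

```python
# Hedged sketch: computing layer download progress from streamed
# pull responses. The `responses` list simulates a stream; a real
# client would iterate over the HTTP response chunks instead.

def progress_percent(completed: int, total: int) -> float:
    """Return progress for the current layer as a percentage."""
    if total <= 0:
        return 0.0
    return 100.0 * completed / total

# Simulated progress responses for a single layer (illustrative only).
responses = [
    {"status": "pulling layer", "completed": 0, "total": 4096},
    {"status": "pulling layer", "completed": 2048, "total": 4096},
    {"status": "pulling layer", "completed": 4096, "total": 4096},
]

for r in responses:
    pct = progress_percent(r["completed"], r["total"])
    print(f'{r["status"]}: {pct:.0f}%')
```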
Related to #4028. I'm going to close this one as a duplicate since #4028 covers mixtral:8x22b, of which wizardlm:8x22b is a fine-tune.
Can you describe, step by step, what you're doing? I'm not able to reproduce this with the latest example, ollama-python, and ollama.
There's a misunderstanding here. `context` is not intended for embedding outputs. It's intended for the token output of previous conversations, so the model can continue the previous response
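To make the intended use of `context` concrete, here's a hedged sketch of a follow-up `/api/generate` payload that carries the previous response's `context` tokens back to the server. The `previous` dict below is a made-up stand-in for a real (non-streaming) response; the model name is illustrative.

```python
# Hedged sketch: chaining /api/generate calls via `context`.
# Nothing is sent over the network here -- we only build the JSON body
# a follow-up request would carry.

import json

def follow_up_request(model: str, prompt: str, previous: dict) -> str:
    """Build the JSON body for a follow-up generate call, passing the
    previous response's `context` tokens back so the conversation continues."""
    body = {"model": model, "prompt": prompt}
    if "context" in previous:
        body["context"] = previous["context"]
    return json.dumps(body)

# Simulated previous response: `context` holds encoded conversation tokens,
# not embeddings.
previous = {"response": "Hello!", "context": [1, 2, 3, 4]}
payload = follow_up_request("llama2", "And then?", previous)
print(payload)
```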
This should be fixed now (as of >=0.1.20, possibly slightly earlier): creating a model (including adapters!) now works with remote servers using local files. Please reopen the issue...
Using a browser to reach `/api/generate` won't work because there is no `GET /api/generate`. Please use curl, Postman, or a similar tool that lets you call `POST /api/generate` with a JSON body
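To illustrate the GET-vs-POST distinction, here's a hedged sketch that builds (but does not send) the POST request a browser address bar can't make. The URL assumes a default local ollama install on port 11434, and the model/prompt values are illustrative.

```python
# Hedged sketch: constructing a POST /api/generate request with a JSON
# body. We only build and inspect the request object; nothing touches
# the network.

import json
import urllib.request

body = json.dumps({"model": "llama2", "prompt": "why is the sky blue?"}).encode()
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# A browser address bar issues GET; this request is an explicit POST.
print(req.get_method())
```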
@chymian would you be able to retest with the latest (0.1.20) ollama? There have been significant improvements to model running, including for multi-GPU.

```
$ ollama run nexusraven why is the...
```
It looks like you're building and running this on Apple Silicon. With `--platform linux/amd64`, it's possible it's running under Rosetta. The Linux build currently enables AVX, which isn't supported by Rosetta...
Hi @Shihab-Shahriar, can you describe briefly how this server was installed? Is it using a base image from a cloud provider (AWS, GCP, etc.), installed using Ubuntu's ISO, or a...
You shouldn't need to install anything in order to run the install script; however, I need to better understand how the system is set up in order to reproduce this issue. For example,...