nexa serve always using CPU instead of GPU
nexa serve always runs inference on the CPU.
I have tested this with the DeepSeek OCR model.
Inference through the CLI works fine, but when I start the server with nexa serve --host 0.0.0.0:8000 it always uses the CPU.
Is there a fix for this?
NexaSDK Bridge Version: v1.0.31
NexaSDK CLI Version: v0.2.60
Hi, thanks for your feedback. Did you set ngl to 0 in your request? Please try a non-zero value, for example:
curl -X 'POST' \
'http://localhost:8000/v1/chat/completions' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"model": "Qwen/Qwen3-1.7B-GGUF",
"messages": [
{
"role": "user",
"content": "What is the weather like in Boston today?"
}
],
"nctx": 4096,
"max_completion_tokens": 2048,
"ngl": 999,
"image_max_length": 512
}'
Hello, sorry for the late reply.
This is my current payload:
const payload = {
  model: 'NexaAI/DeepSeek-OCR-GGUF',
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'text',
          text: 'free ocr'
        },
        {
          type: 'image_url',
          image_url: {
            url: `data:image/png;base64,${base64Image}`
          }
        }
      ]
    }
  ],
  ngl: 999,
  temperature: 0.7,
  stream: false,
  nctx: 4096,
};
nexa serve is running on Windows. Setting the ngl parameter appears to have fixed the issue.
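For anyone hitting the same problem, here is a minimal sketch of how such a payload might be built and sent from Node.js. It assumes Node 18+ (for the global fetch) and a server reachable at http://localhost:8000, as in the curl example earlier in the thread. The helper names buildOcrPayload and sendOcrRequest are hypothetical, and rejecting ngl values of 0 or below is a choice made for this sketch, not documented server behavior.

```javascript
// Hypothetical helper: builds a chat-completions payload for an OCR request.
// Throws when ngl is not a positive integer, since a value of 0 was the
// suspected cause of CPU-only inference in this thread.
function buildOcrPayload(base64Image, options = {}) {
  const { ngl = 999, nctx = 4096, temperature = 0.7 } = options;
  if (!Number.isInteger(ngl) || ngl <= 0) {
    throw new RangeError('ngl must be a positive integer');
  }
  return {
    model: 'NexaAI/DeepSeek-OCR-GGUF',
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: 'free ocr' },
          {
            type: 'image_url',
            image_url: { url: `data:image/png;base64,${base64Image}` },
          },
        ],
      },
    ],
    ngl,
    temperature,
    stream: false,
    nctx,
  };
}

// Hypothetical helper: POSTs the payload to the chat-completions endpoint
// used in the curl example. The base URL is an assumption.
async function sendOcrRequest(base64Image, baseUrl = 'http://localhost:8000') {
  const response = await fetch(`${baseUrl}/v1/chat/completions`, {
    method: 'POST',
    headers: {
      accept: 'application/json',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(buildOcrPayload(base64Image)),
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

Keeping the payload construction separate from the network call makes it easy to log or inspect exactly what is sent, including the ngl value.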
The new issue is that when performing OCR on the same document, I get vastly different results between the nexa CLI and nexa serve. With nexa serve I get hallucinations, while the nexa CLI always produces good OCR results on the same documents.
I can't understand why I get these differences.
I am using an RX 7900 XTX to run DeepSeek OCR.
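One variable that can be ruled out when comparing the two paths is sampling randomness: the payload above sets temperature to 0.7, so repeated runs on the same document can differ. A minimal sketch follows, assuming nexa serve honors the OpenAI-style temperature field (not confirmed in this thread). The helper name withGreedySampling is hypothetical, and this may not account for the whole difference between the CLI and the server.

```javascript
// Hypothetical helper: returns a copy of a request payload with temperature
// set to 0, leaving the original object untouched. Assumes the server reads
// the OpenAI-style "temperature" field.
function withGreedySampling(payload) {
  return { ...payload, temperature: 0 };
}
```

Sending the same document several times with this variant shows whether the output stabilizes. If it still differs from the CLI result, the cause is more likely a difference in defaults or preprocessing than in sampling.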
Ok just make it work for me plz
Keith cox
On Wed, Nov 26, 2025 at 5:03 PM, Parotek (parotech123) commented on NexaAI/nexa-sdk#885:
https://github.com/NexaAI/nexa-sdk/issues/885#issuecomment-3583345759