
Add Linux NPU & GPU support to Lemonade Server

Open jeremyfowers opened this issue 10 months ago • 4 comments

Adding a discussion issue here to get feedback from the community about what they would use Linux support in Lemonade Server for.

It would help if people commented with their use case, the hardware they are running, the models they are interested in, etc. Having this written here would give us some concrete targets to go after.

jeremyfowers avatar Apr 08 '25 21:04 jeremyfowers

At the moment, I'm using a Ryzen CPU, a 7900 XTX, Ubuntu 24.04 LTS, ROCm 6.3.3.60303-74~24.04, and whatever the latest versions of ollama and openwebui are. Docker does a nice job of keeping my machine from getting too cluttered.

Models used? Whatever seems interesting or useful for text processing (e.g. code generation, document summaries, etc.), is easily imported into ollama, and fits in VRAM, e.g.

  • codegemma
  • codellama
  • codestral
  • deepseek-r1
  • deepcoder
  • starcoder
  • qwen-2.5-coder
  • cogito
  • phi4
  • falcon3
  • gemma3
  • granite3.2
  • granite-code
  • dolphin-mixtral

ckuethe avatar Apr 11 '25 02:04 ckuethe

OpenAI Whisper (voice recognition model). It is not a particularly heavy workload, and I would love a Linux-based local voice assistant that doesn't draw a lot of power.

bogdanbiv avatar Apr 19 '25 12:04 bogdanbiv

Native NPU support in Linux would be a game changer for us.

  • https://github.com/EricLBuehler/mistral.rs/issues/1254
  • https://github.com/amd/gaia/issues/9
  • https://github.com/ollama/ollama/issues/5186
  • https://github.com/AMD-AIG-AIMA/Instella/issues/1
  • https://github.com/ggml-org/llama.cpp/issues/1499

GreyXor avatar Apr 20 '25 22:04 GreyXor

I am currently working on an SME AI appliance and am exploring various hardware options, with the AMD AI 300 APU series being a hot contender. However, the prerequisite for this would be optimal LLM inference performance, as demonstrated by the OGA hybrid workflows.

planatscher avatar Apr 25 '25 10:04 planatscher

FYI folks: Lemonade SDK has moved to a new repository, and I have re-opened this issue there: https://github.com/lemonade-sdk/lemonade/issues/5

jeremyfowers avatar May 16 '25 20:05 jeremyfowers