Add support for AMD Ryzen Strix Point and Strix Halo APUs
Is your feature request related to a problem? Please describe.
The AMD Ryzen AI Max+ 395 APU (aka Strix Halo) doesn't currently seem to be supported. EDIT: I noticed that gfx1150, which corresponds to the Strix Point series of APUs, is also missing from the list of AMDGPU target codes.
Describe the solution you'd like
I believe the GPU codes (gfx1151 and gfx1150) for Strix Halo and Strix Point just need to be added to the list of AMDGPU targets.
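For anyone who wants to experiment before an official change lands, here is a minimal sketch of a local build with the extra targets, assuming the Makefile forwards a GPU_TARGETS list to the hipblas backend build (the variable name is taken from the env vars quoted later in this thread; the separator between multiple targets may differ, so check the Makefile):
git clone https://github.com/mudler/LocalAI && cd LocalAI
# Untested: gfx1150/gfx1151 added on top of whatever targets the tree already lists
make BUILD_TYPE=hipblas GPU_TARGETS="gfx1150 gfx1151" build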
I also tried (with the same CPU/GPU) building the container per these instructions, using these vars (a quick sanity check follows the list):
Environment=HSA_OVERRIDE_GFX_VERSION=11.5.1
Environment=ROCR_VISIBLE_DEVICES=0
Environment=PORT=8080
Environment=DEBUG=true
Environment=REBUILD=true
Environment=BUILD_TYPE=hipblas
Environment=GPU_TARGETS=gfx1151
Environment=LOCALAI_FORCE_META_BACKEND_CAPABILITY=amd
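One way to confirm the ROCm runtime can see the APU inside the container at all, assuming the ROCm CLI tools are present in the image:
# Expect an agent named gfx1151 for Strix Halo:
rocminfo | grep -i gfx
# Confirm the HSA override actually reaches the process environment:
env | grep HSA_OVERRIDE_GFX_VERSION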
In the logs, however, the GPU is found but not used:
===> LocalAI All-in-One (AIO) container starting...
GPU acceleration is not enabled or supported. Defaulting to CPU.
===> Starting LocalAI[cpu] with the following models: /aio/cpu/embeddings.yaml,/aio/cpu/rerank.yaml,/aio/cpu/text-to-speech.yaml,/aio/cpu/image-gen.yaml,/aio/cpu/text-to-text.yaml,/aio/cpu/speech-to-text.yaml,/aio/cpu/vad.yaml,/aio/cpu/vision.yaml
CPU info:
model name : AMD RYZEN AI MAX+ PRO 395 w/ Radeon 8060S
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx_vnni avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid bus_lock_detect movdiri movdir64b overflow_recov succor smca fsrm avx512_vp2intersect flush_l1d amd_lbr_pmc_freeze
CPU: AVX found OK
CPU: AVX2 found OK
CPU: AVX512 found OK
2025/10/10 14:18:54 failed to create modcache index dir: mkdir /.cache: permission denied
2:18PM DBG Setting logging to debug
2:18PM DBG GPUs gpus=[{"address":"0000:c4:00.0","index":1,"pci":{"address":"0000:c4:00.0","class":{"id":"03","name":"Display controller"},"driver":"amdgpu","product":{"id":"1586","name":"unknown"},"programming_interface":{"id":"00","name":"unknown"},"revision":"0xd1","subclass":{"id":"80","name":"Display controller"},"subsystem":{"id":"8d1d","name":"unknown"},"vendor":{"id":"1002","name":"Advanced Micro Devices, Inc. [AMD/ATI]"}}}]
2:18PM DBG GPU vendor gpuVendor=amd
2:18PM DBG Total available VRAM vram=0
2:18PM INF Starting LocalAI using 16 threads, with models path: //models
2:18PM INF LocalAI version: dc2be93 (dc2be934127bdd795c95ec564d8699c0056960c1)
2:18PM DBG CPU capabilities: [3dnowprefetch abm adx aes amd_lbr_pmc_freeze amd_lbr_v2 aperfmperf apic arat avic avx avx2 avx512_bf16 avx512_bitalg avx512_vbmi2 avx512_vnni avx512_vp2intersect avx512_vpopcntdq avx512bw avx512cd avx512dq avx512f avx512ifma avx512vbmi avx512vl avx_vnni bmi1 bmi2 bpext bus_lock_detect cat_l3 cdp_l3 clflush clflushopt clwb clzero cmov cmp_legacy constant_tsc cpb cppc cpuid cqm cqm_llc cqm_mbm_local cqm_mbm_total cqm_occup_llc cr8_legacy cx16 cx8 de decodeassists erms extapic extd_apicid f16c flush_l1d flushbyasid fma fpu fsgsbase fsrm fxsr fxsr_opt gfni ht hw_pstate ibpb ibrs ibrs_enhanced ibs invpcid irperf lahf_lm lbrv lm mba mca mce misalignsse mmx mmxext monitor movbe movdir64b movdiri msr mtrr mwaitx nonstop_tsc nopl npt nrip_save nx ospke osvw overflow_recov pae pat pausefilter pclmulqdq pdpe1gb perfctr_core perfctr_llc perfctr_nb perfmon_v2 pfthreshold pge pku pni popcnt pse pse36 rapl rdpid rdpru rdrand rdseed rdt_a rdtscp rep_good sep sha_ni skinit smap smca smep ssbd sse sse2 sse4_1 sse4_2 sse4a ssse3 stibp succor svm svm_lock syscall tce topoext tsc tsc_adjust tsc_scale umip user_shstk v_spec_ctrl v_vmsave_vmload vaes vgif vmcb_clean vme vmmcall vnmi vpclmulqdq wbnoinvd wdt x2avic xgetbv1 xsave xsavec xsaveerptr xsaveopt xsaves xtopology]
2:18PM DBG GPU count: 1
2:18PM DBG GPU: card #1 @0000:c4:00.0 -> driver: 'amdgpu' class: 'Display controller' vendor: 'Advanced Micro Devices, Inc. [AMD/ATI]' product: 'unknown'
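The suspicious line above is vram=0: Strix Halo has unified memory rather than a dedicated VRAM pool, so if the capability detection keys off reported VRAM it could plausibly conclude there is no usable GPU (an assumption about LocalAI's logic, not a confirmed reading of it). What the kernel actually reports for the card can be checked via sysfs; the card1 index below is taken from the log, adjust as needed:
cat /sys/class/drm/card1/device/mem_info_vram_total   # "dedicated" VRAM bytes per the amdgpu driver
cat /sys/class/drm/card1/device/mem_info_gtt_total    # GTT: system RAM the GPU can address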
Yes, I'd like to see this too. I think ROCm 7.something is supposed to finally support these APUs; 7.0, however, doesn't yet. So there's a chance that as the ROCm version in use is upgraded, support "magically" shows up 🤞
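A quick way to check whether a given ROCm install actually ships kernels for these chips (paths assume the default /opt/rocm layout) is to look for gfx1151 entries in the rocBLAS library directory:
cat /opt/rocm/.info/version   # installed ROCm version
ls /opt/rocm/lib/rocblas/library | grep -i gfx1151 || echo "no gfx1151 rocBLAS kernels in this build"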
I've been trying to build this with Strix Halo support for days, and no luck with either vulkan or hipblas (I did add the gfx1151 arch to the Makefile), on Ubuntu 24.04 with ROCm 7.1 (ROCm isn't relevant for the vulkan build). It builds, but llama.cpp just doesn't see the APU, tries to load models into VRAM, and of course fails.
Frustrating, because I'm able to build vanilla llama.cpp with Strix Halo support in an identical image, and it works without a hitch (both vulkan and hipblas). But the llama.cpp build that LocalAI implements, with the integrated gRPC server, fails to use vulkan.
I'll keep hammering at it and update on progress.
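For comparison, a rough sketch of the kind of vanilla llama.cpp build that works in that setup; the flag names match recent upstream llama.cpp docs, but they are not necessarily the exact commands used above:
# HIP (ROCm) build targeting Strix Halo:
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
  cmake -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1151 -DCMAKE_BUILD_TYPE=Release
cmake --build build -j
# Vulkan build (no gfx target list needed):
cmake -B build-vk -DGGML_VULKAN=ON
cmake --build build-vk -j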
Can confirm this outcome as well. No matter how I build the gRPC server, it will not detect the APU. No issues with straight llama.cpp on ROCm 7.x.