Philipp Lutz
I have now set `cwsr_enable=0` and rebooted. Additionally, I trimmed VRAM down to 512 MB, as recommended, and increased GTT memory to 6 GiB. After a little bit of testing...
> The result from radeontop looks fine. The difference came from TTM's calculation of used / total. Platform code will also reserve some space if configured to 512M, leading to...
> I forgot to check with you on the version of MES firmware you're using. Kindly provide the output of: `cat /sys/kernel/debug/dri//amdgpu_firmware_info` > > If the version is > 0x80,...
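For anyone who wants to script the firmware check requested above, here is a minimal Python sketch. It assumes each line of `amdgpu_firmware_info` has the shape `<NAME> feature version: <n>, firmware version: 0x<hex>`, and that the scheduler firmware is listed under the name `MES`; both are assumptions to verify against your own output. The helper names (`parse_firmware_info`, `mes_version_above`) and the sample text are made up for illustration, and the sketch takes the file contents as a string rather than guessing the debugfs path.

```python
import re

# Assumed line shape: "<NAME> feature version: <n>, firmware version: 0x<hex>"
_FW_LINE = re.compile(
    r"^(?P<name>\S+) feature version: (?P<feature>\d+), "
    r"firmware version: (?P<fw>0x[0-9a-fA-F]+)"
)


def parse_firmware_info(text):
    """Map each firmware name to its firmware version as an int."""
    versions = {}
    for line in text.splitlines():
        match = _FW_LINE.match(line.strip())
        if match:
            versions[match.group("name")] = int(match.group("fw"), 16)
    return versions


def mes_version_above(text, threshold=0x80):
    """Return True/False for the MES entry, or None if no MES line is found."""
    mes = parse_firmware_info(text).get("MES")
    return None if mes is None else mes > threshold


# Made-up sample for illustration only, not output from a real system.
SAMPLE = """\
MES_KIQ feature version: 6, firmware version: 0x0000006c
MES feature version: 1, firmware version: 0x00000083
"""

if __name__ == "__main__":
    print(mes_version_above(SAMPLE))
```

To use it on a real machine, paste the output of the `cat` command into a string (or read the file as root) and pass it to `mes_version_above`. A result of `None` means the assumed line format did not match and the output should be checked by hand.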
> I can confirm it always fails (for me) with the latest firmware (20251125). It does not happen with 20251111. > > Memory access fault by GPU node-1 (Agent handle: 0x5570f8146df0)...
> Looks like a memory issue happened first in `svm_range_restore_work`. I believe this is paging in/out of GTT shared space into swap. This is what happens using llama-cpp with models >96MB...
> Just to confirm, with CWSR=0 **and** a sleep/wake cycle, can you still reproduce it? I haven't been able to since my last update (new kernel, latest amdgpu firmware and CWSR=0 set). So I'd...