mrgaolei
The size of my SD card is 1GB. I formatted it as FAT32 via the Mac terminal: ``` diskutil eraseVolume FAT32 name /Volumes/SDCARD ``` But I will buy a bigger SD card, are you...
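For reference, a minimal sketch of erasing the whole card rather than just the mounted volume; the device identifier `/dev/disk2` is an assumption, so confirm it with `diskutil list` first. Note that diskutil expects FAT32 labels to be uppercase and at most 11 characters.

```bash
# Find the SD card's device identifier (e.g. /dev/disk2) before erasing anything
diskutil list

# Erase the whole card as FAT32 with an MBR partition map;
# "SDCARD" is the new (uppercase) volume label
diskutil eraseDisk FAT32 SDCARD MBRFormat /dev/disk2
```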
When should I insert the Nintendo Super Star Brow DISC? I don't have this DISC yet; does the game list require it?
Got it. I changed to a different SD card and everything works now.
> Is it possible that the bottleneck is the GPU now? The 3090 is quite close to the V100 in terms of FLOPs. Just a guess. However, aren't ktransformers entirely computed by...
And this is `rpc.log`:
```
[2025-08-12 17:21:55.161] [info] [scheduler.cpp:31] Number of available GPUs: 1, want 1
[2025-08-12 17:21:55.161] [info] [scheduler.cpp:66] Each GPU Total: 2196MiB, Model Params: 0MiB, KVCache: 2196MiB, Left:...
```
A further note: after switching to git tag v0.3.2, starting without balance_serve works fine, while starting with balance_serve reports this error. If I git pull to HEAD, neither works, but it looks like HEAD already defaults to the balance_serve engine. Does that mean balance_serve requires 512GB of RAM? The only change on this machine is that I downgraded the RAM from 512GB to 256GB, yet I'm running the Q2 model, which should theoretically fit, since the default kt engine can start it.
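One way to tell whether this is really memory pressure on the 256GB box rather than a hard engine requirement is to watch RAM while balance_serve loads the model and check for OOM kills after a failed start; a rough sketch:

```bash
# Refresh memory usage every second while the model is loading
watch -n 1 free -h

# After a failed start, check whether the kernel's OOM killer stepped in
sudo dmesg | grep -i -E "out of memory|oom-kill"
```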
> First, AMX only accelerates the prefill stage. Second, under the same configuration, if BF16 generation were faster than Q4 you could go claim a Turing Award. So with a Q4 model, even the prefill stage isn't accelerated, is that right?
> The issue is that you used `--kt-num-gpu-experts 16`, which specifies that each layer has 16 experts on the GPU. The 24GB VRAM can’t handle that, so try lowering this...
> I see. The native Kimi-K2-Thinking model uses BF16-precision (non-expert) weights on the GPU side, so it consumes more VRAM than DeepSeek-V3/R1. A 24 GB GPU isn’t sufficient. You may...
```bash
free -h
```
Look at buffer/cache; that's where most of the memory is going.
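A short sketch of how to read and, if needed, release that cache; the page cache is reclaimed automatically under memory pressure, so dropping it is only useful for measurement:

```bash
# In the free output, "buff/cache" is reclaimable page cache and "available"
# is the realistic free figure
free -h

# Flush dirty pages, then drop clean page cache, dentries and inodes (root required)
sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
```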