Molly Sophia issues

Results: 9 issues of Molly Sophia

rvv int8 Convolution/ConvDw/Quantize/Requantize/Dequantize

Boo-hoo-hoo (sobbing)

riscv

test

llama : support RWKV v6 models

This should fix #846.

## Added:

### ggml:

- Added unary operation ``Exp``
- Added ``rwkv_wkv`` operation with CPU impl
- Added ``rwkv_token_shift`` operation with CPU impl to handle multiple...

python

ggml

[WIP] Apple AMX optimization

qwq (?) The tests pass, but it is slower than the version nihui wrote. I'm such a noob.

core

arm

cmake

[Bug] RWKV v6 models fail to compile with latest mlc_llm

## 🐛 Bug RWKV v6 models fail to compile with the latest mlc_llm. Edit: It also seems that there is currently only an RWKV v5 compilation test in CI. Should rwkv v6 be...

bug

Update chat roles of QA mode

...which matches the chat template in https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_CHAT.py for RWKV v6

Add support for loading RWKV v6 GGUF files

GGUF files are converted using the llama.cpp convert_hf_to_gguf.py script (https://github.com/ggerganov/llama.cpp/pull/8980)

Changes in this PR: - Added a patch on llama.cpp with upstream commits: [llama.cpp: 10433e8](https://github.com/ggerganov/llama.cpp/commit/10433e8b457c4cfd759cbb41fc55fc398db4a5da), [4ff7fe1](https://github.com/ggerganov/llama.cpp/commit/4ff7fe1fb36b04ddd158b2de881c348c5f0ff5e4), and [11d4705](https://github.com/ggerganov/llama.cpp/commit/11d47057a51f3d9b9231e6b57d0ca36020c0ee99). These fix the problem that RWKV GGUF models cannot be loaded,...

llama: Add support for RWKV v7 architecture

@BlinkDL's explanation of RWKV v7: [RWKV-7 as a meta-in-context learner](https://x.com/BlinkDL_AI/status/1861753903886561649) There are also plenty of tests on trained models (currently 0.1B and 0.4B) posted on his X account. Larger...

testing

Nvidia GPU

Vulkan

python

ggml

SYCL

Apple Metal