Garbled output when using Simplified Chinese with Qwen2.5-0.5b
When I enter '你是谁?' ("Who are you?"), the app crashes with an 'invalid parameter' error:
tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
Abort message: 'terminating with uncaught exception of type std::invalid_argument: invalid character'
    x0  0000000000000000  x1  0000000000004145  x2  0000000000000006  x3  00000078e0e7d350
    x4  736f646277641f73  x5  736f646277641f73  x6  736f646277641f73  x7  7f7f7f7f7f7f7f7f
    x8  00000000000000f0  x9  0000007a1d6a1de8  x10 0000000000000001  x11 0000007a1d71dbf0
    x12 0000007a2b0f5020  x13 000000007fffffff  x14 00000000011286c8  x15 00000dbc7419ca04
    x16 0000007a1d78c9f8  x17 0000007a1d766500  x18 00000078ccb06000  x19 00000000000040e2
    x20 0000000000004145  x21 00000000ffffffff  x22 00000078e0e7d480  x23 00000078e0e7d4c0
    x24 00000078e0e7d5a0  x25 00000078e0e7d4e8  x26 7ffffffffffffffc  x27 3fffffffffffffff
    x28 b4000078eba03850  x29 00000078e0e7d3d0
    lr  0000007a1d70e238  sp  00000078e0e7d330  pc  0000007a1d70e264  pst 0000000000001000
14 total frames
backtrace:
  #00 pc 0000000000094264 /apex/com.android.runtime/lib64/bionic/libc.so (abort+164) (BuildId: 84a42637b3a421b801818f5793418fca)
  #01 pc 0000000000096ec0 /data/app/~~piJYmDgAu_UwLO6JK-2djA==/me.cw2.cw2bit-4tarhaK3ny9LyOBnpcbCGA==/base.apk!libggml-base.so (offset 0x993000) (BuildId: 94690fa40272afe21601839ff3d3bb7c30b0c2ab)
  #02 pc 00000000000970ec /data/app/~~piJYmDgAu_UwLO6JK-2djA==/me.cw2.cw2bit-4tarhaK3ny9LyOBnpcbCGA==/base.apk!libggml-base.so (offset 0x993000) (BuildId: 94690fa40272afe21601839ff3d3bb7c30b0c2ab)
  #03 pc 0000000000096f78 /data/app/~~piJYmDgAu_UwLO6JK-2djA==/me.cw2.cw2bit-4tarhaK3ny9LyOBnpcbCGA==/base.apk!libggml-base.so (offset 0x993000) (BuildId: 94690fa40272afe21601839ff3d3bb7c30b0c2ab)
  #04 pc 0000000000096518 /data/app/~~piJYmDgAu_UwLO6JK-2djA==/me.cw2.cw2bit-4tarhaK3ny9LyOBnpcbCGA==/base.apk!libggml-base.so (offset 0x993000) (BuildId: 94690fa40272afe21601839ff3d3bb7c30b0c2ab)
  #05 pc 0000000000096470 /data/app/~~piJYmDgAu_UwLO6JK-2djA==/me.cw2.cw2bit-4tarhaK3ny9LyOBnpcbCGA==/base.apk!libggml-base.so (offset 0x993000) (__cxa_throw+124) (BuildId: 94690fa40272afe21601839ff3d3bb7c30b0c2ab)
  #06 pc 000000000013b0b0 /data/app/~~piJYmDgAu_UwLO6JK-2djA==/me.cw2.cw2bit-4tarhaK3ny9LyOBnpcbCGA==/base.apk!libllama.so (offset 0xa9b000) (unicode_cpt_from_utf8(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, unsigned long&)+548) (BuildId: 63567ff7dd323d4ef1b9a99e30ec4687744cc13f)
  #07 pc 000000000013b4b0 /data/app/~~piJYmDgAu_UwLO6JK-2djA==/me.cw2.cw2bit-4tarhaK3ny9LyOBnpcbCGA==/base.apk!libllama.so (offset 0xa9b000) (unicode_cpts_from_utf8(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)+196) (BuildId: 63567ff7dd323d4ef1b9a99e30ec4687744cc13f)
  #08 pc 000000000013cbec /data/app/~~piJYmDgAu_UwLO6JK-2djA==/me.cw2.cw2bit-4tarhaK3ny9LyOBnpcbCGA==/base.apk!libllama.so (offset 0xa9b000) (unicode_regex_split(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, std::__ndk1::vector<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, std::__ndk1::allocator<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > > > const&)+496) (BuildId: 63567ff7dd323d4ef1b9a99e30ec4687744cc13f)
  #09 pc 000000000011d220 /data/app/~~piJYmDgAu_UwLO6JK-2djA==/me.cw2.cw2bit-4tarhaK3ny9LyOBnpcbCGA==/base.apk!libllama.so (offset 0xa9b000) (llm_tokenizer_bpe_session::tokenize(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, std::__ndk1::vector<int, std::__ndk1::allocator<int> >&)+76) (BuildId: 63567ff7dd323d4ef1b9a99e30ec4687744cc13f)
  #10 pc 000000000011bdf8 /data/app/~~piJYmDgAu_UwLO6JK-2djA==/me.cw2.cw2bit-4tarhaK3ny9LyOBnpcbCGA==/base.apk!libllama.so (offset 0xa9b000) (llama_tokenize_internal(llama_vocab const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, bool, bool)+1504) (BuildId: 63567ff7dd323d4ef1b9a99e30ec4687744cc13f)
  #11 pc 000000000011f974 /data/app/~~piJYmDgAu_UwLO6JK-2djA==/me.cw2.cw2bit-4tarhaK3ny9LyOBnpcbCGA==/base.apk!libllama.so (offset 0xa9b000) (llama_tokenize_impl(llama_vocab const&, char const*, int, int*, int, bool, bool)+184) (BuildId: 63567ff7dd323d4ef1b9a99e30ec4687744cc13f)
  #12 pc 00000000000bb8f4 /data/app/~~piJYmDgAu_UwLO6JK-2djA==/me.cw2.cw2bit-4tarhaK3ny9LyOBnpcbCGA==/base.apk!libllama.so (offset 0xa9b000) (llama_tokenize+16) (BuildId: 63567ff7dd323d4ef1b9a99e30ec4687744cc13f)
  #13 pc 0000000000008264 [anon:dart-code]
When I enter '2*2=?', the output is garbled, for example: ï¼ 3 ï¼
I am now trying to find the root cause.
I compiled llama.cpp from this release: https://github.com/ggerganov/llama.cpp/releases/tag/b4138 and I use llama_cpp_dart ^0.0.8.
In Flutter, I chat with the model as shown below. I think Qwen understands my prompt and that it replies in Japanese, but the text I see is garbled, both in the Flutter Text widget and in the Android Studio console.
// Let the user pick the GGUF model file.
final filePath = await FilePicker.platform.pickFiles(type: FileType.any);
if (filePath == null) return;
final file = filePath.paths.single;

final loadCommand = LlamaLoad(
  path: '$file',
  modelParams: ModelParams(),
  contextParams: ContextParams(),
  samplingParams: SamplerParams(),
  format: ChatMLFormat(),
);

final llamaParent = LlamaParent(loadCommand);
await llamaParent.init();

// Print each streamed piece and append it to the text shown in the UI.
llamaParent.stream.listen((response) {
  print(response);
  llama_content.value = llama_content.value + response;
});

// Prompt: "Hello, please reply to me in Japanese"
llamaParent.sendPrompt("こんにちは、私に日本語で返信してください");
Output:
I/flutter (10106): &ZoneGuarded&print: ãã
I/flutter (10106): &ZoneGuarded&print: ã
I/flutter (10106): &ZoneGuarded&print: å
I/flutter (10106): &ZoneGuarded&print: 人ã®
I/flutter (10106): &ZoneGuarded&print: ãã¨ã
I/flutter (10106): &ZoneGuarded&print: 大å
I/flutter (10106): &ZoneGuarded&print: ã«
I/flutter (10106): &ZoneGuarded&print: ããã
I/flutter (10106): &ZoneGuarded&print: ã¨ãã
I/flutter (10106): &ZoneGuarded&print: æ³
I/flutter (10106): &ZoneGuarded&print: ã
I/flutter (10106): &ZoneGuarded&print: ãããã¾ãã
I/flutter (10106): &ZoneGuarded&print: ã
I/flutter (10106): &ZoneGuarded&print: æè¿
I/flutter (10106): &ZoneGuarded&print: ã®
I/flutter (10106): &ZoneGuarded&print: å
I/flutter (10106): &ZoneGuarded&print: 人ã®
I/flutter (10106): &ZoneGuarded&print: æ§
I/flutter (10106): &ZoneGuarded&print: ç¸
I/flutter (10106): &ZoneGuarded&print: ã
I/flutter (10106): &ZoneGuarded&print: ã
I/flutter (10106): &ZoneGuarded&print: é常ã«
I/flutter (10106): &ZoneGuarded&print: è¯ã
I/flutter (10106): &ZoneGuarded&print: ã
I/flutter (10106): &ZoneGuarded&print: ã
I/flutter (10106): &ZoneGuarded&print: å¿
I/flutter (10106): &ZoneGuarded&print: ãã
I/flutter (10106): &ZoneGuarded&print: ã
I/flutter (10106): &ZoneGuarded&print: 主人
I/flutter (10106): &ZoneGuarded&print: ã«
I/flutter (10106): &ZoneGuarded&print: ç³ã
I/flutter (10106): &ZoneGuarded&print: è¨
Some progress on the errors:
- Entering '你是谁?' fails with 'invalid parameter': the character ? cannot be understood by llama_cpp_dart or llama.so.
I think I have found the cause of the garbled responses and fixed it!
In the file llama.dart:
// around line 314
final buf = malloc<Char>(128);
try {
  int n = lib.llama_token_to_piece(model, newTokenId, buf, 128, 0, true);
  if (n < 0) {
    throw LlamaException("Failed to convert token to piece");
  }
  String piece = String.fromCharCodes(buf.cast<Uint8>().asTypedList(n));
What I found:
- buf.cast<Uint8>().asTypedList(n) yields the raw UTF-8 bytes of the piece.
- String.fromCharCodes does not decode UTF-8. It treats every byte as its own character code, which is effectively an ISO-8859-1 decode, so multi-byte characters come out garbled.
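The difference can be reproduced without llama.cpp at all. The following is a minimal standalone sketch, not code from llama_cpp_dart; the sample character is only an illustration.

```dart
import 'dart:convert';

void main() {
  // '你' is encoded as three UTF-8 bytes: E4 BD A0.
  final bytes = utf8.encode('你');

  // Each byte becomes a separate character, so one character turns into three.
  final wrong = String.fromCharCodes(bytes);

  // Decoding the same bytes as UTF-8 restores the original character.
  final right = utf8.decode(bytes);

  print(wrong.length); // 3
  print(right.length); // 1
  print(right == '你'); // true
}
```

This matches the symptom in the log above, where each multi-byte character shows up as two or three Latin-1 characters.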
So I now decode the bytes in buf as UTF-8, like this:
final buf = malloc<Char>(128);
try {
  int n = lib.llama_token_to_piece(model, newTokenId, buf, 128, 0, true);
  if (n < 0) {
    throw LlamaException("Failed to convert token to piece");
  }
  // Before: String piece = String.fromCharCodes(buf.cast<Uint8>().asTypedList(n));
  String piece = '';
  try {
    // Decode the bytes as UTF-8 (needs: import 'dart:convert';).
    piece = utf8.decode(buf.cast<Uint8>().asTypedList(n));
  } catch (e) {
    // An undecodable piece is logged and returned as an empty string.
    print("Error decoding UTF-8 sequence: $e");
  }
  _tokenPtr.value = newTokenId;
  batch = lib.llama_batch_get_one(_tokenPtr, 1);
  bool isEos = newTokenId == lib.llama_token_eos(model);
  return (piece, isEos);
} finally {
  malloc.free(buf);
}
Finally, I also noticed the following:
Qwen-0.5b sometimes returns a piece that contains only 2 bytes of a multi-byte UTF-8 character when the output gets close to the prediction limit (contextParams: ContextParams()..nPredit = 32; the default is 32). utf8.decode cannot decode such a piece, and Dart throws: Error decoding UTF-8 sequence: FormatException: Unfinished UTF-8 octet sequence (at offset 2). With contextParams: ContextParams()..nPredit = 64 it hardly ever happens, so I think it may be a mistake made by the SLM itself.
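Another possible explanation, offered as an assumption rather than a confirmed diagnosis: byte-level BPE tokenizers can split one multi-byte character across two tokens, so a single piece may legitimately end in the middle of a character. If that is what happens here, holding back the incomplete tail until the next piece arrives would avoid the FormatException. The sketch below shows the idea; Utf8PieceBuffer is a hypothetical helper and is not part of llama_cpp_dart.

```dart
import 'dart:convert';

/// Hypothetical helper: collects raw bytes from successive pieces and only
/// decodes the part that ends on a complete UTF-8 character.
class Utf8PieceBuffer {
  final List<int> _pending = [];

  /// Adds the bytes of one piece and returns the text that can be decoded so far.
  String add(List<int> bytes) {
    _pending.addAll(bytes);
    final cut = _completePrefixLength(_pending);
    final text = utf8.decode(_pending.sublist(0, cut), allowMalformed: true);
    _pending.removeRange(0, cut);
    return text;
  }

  /// Returns how many leading bytes form only complete UTF-8 sequences.
  static int _completePrefixLength(List<int> b) {
    // Scan back over at most the last 3 bytes to find the last lead byte.
    for (var i = b.length - 1; i >= 0 && i >= b.length - 3; i--) {
      final byte = b[i];
      if ((byte & 0xC0) == 0x80) continue; // continuation byte, keep scanning
      final need = byte >= 0xF0 ? 4 : (byte >= 0xE0 ? 3 : (byte >= 0xC0 ? 2 : 1));
      // If the sequence starting at i is cut off, hold it back.
      return (b.length - i < need) ? i : b.length;
    }
    return b.length;
  }
}

void main() {
  final buffer = Utf8PieceBuffer();
  final bytes = utf8.encode('こん'); // two characters, three bytes each

  // Simulate a piece boundary in the middle of the first character.
  final first = buffer.add(bytes.sublist(0, 2));
  final second = buffer.add(bytes.sublist(2));

  print(first.isEmpty); // true
  print(first + second); // こん
}
```

In llama.dart this would mean passing buf.cast<Uint8>().asTypedList(n) to such a buffer instead of calling utf8.decode on each piece directly. Bytes still pending when generation stops would need separate handling.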
I have tested Simplified Chinese, English, and Japanese, and I think other languages should work too. I hope this helps.
I will open a PR later, or you can fork master and make the change yourself.
If you only use English, this probably does not affect you.
@Chaos-woo have you run DeepSeek using this library?
@sadaqatdev I haven't tried DeepSeek. I think whether it can run depends only on the CPU or GPU you are using, so you could use a distilled DeepSeek to run a smaller model. You can try it out.
@sadaqatdev But you need to find a GGUF model on HF to run.
Yes, I have one, but I am unable to run it. It throws an error when initializing.
You can check #58.
@Chaos-woo This is the model: https://github.com/fabiomatricardi/Deepseek-R1-qwen1.5B?tab=readme-ov-file