rainred
### Motivation

Currently tokenizer_manager only supports initializing the tokenizer through transformers utils. But many models are trained with a different tokenizer model (e.g. MiniCPM, tiktoken). On the other hand, GenerateReqInput supports 'input_ids', which...
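The `input_ids` path already suggests a client-side workaround: tokenize outside the server and ship token IDs directly. A minimal sketch, assuming a sglang server on its default port 30000 whose `/generate` endpoint accepts an `input_ids` field (the payload shape is an assumption based on this issue, not verified against a specific sglang version; the tiktoken encoding is a stand-in for the model's real tokenizer):

```python
import requests
import tiktoken

# Tokenize with a non-transformers tokenizer (tiktoken here) on the client side.
enc = tiktoken.get_encoding("cl100k_base")  # stand-in encoding for illustration
ids = enc.encode("Describe the rules of chess in one sentence.")

# Send pre-tokenized input instead of raw text.
resp = requests.post(
    "http://localhost:30000/generate",
    json={"input_ids": ids, "sampling_params": {"max_new_tokens": 32}},
)
print(resp.json())
```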
https://github.com/ggerganov/llama.cpp/blame/1d1ccce67613674c75c9c7e3fa4c1e24e428ba48/examples/llava/clip.cpp#L1630

A core dump happens in bicubic_resize. Debugging the core file reports:

```
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000055e68efaa5be in bicubic_resize (img=..., dst=..., target_width=target_width@entry=364, target_height=target_height@entry=546)
    at /usr/include/c++/11/bits/stl_vector.h:1061
1061      operator[](size_type...
```
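The faulting frame is `std::vector::operator[]`, i.e. an out-of-range element access rather than a wild pointer, which fits bicubic interpolation reading a 4x4 neighborhood that runs past the image border. As a language-neutral illustration (a Python sketch of the clamping pattern, not the actual clip.cpp code), the neighbor coordinates need pinning into the image bounds:

```python
def clamp(v, lo, hi):
    """Pin v into [lo, hi] so border pixels reuse the edge row/column."""
    return max(lo, min(v, hi))

def bicubic_neighborhood(pixels, w, h, x, y):
    """Fetch the 4x4 neighborhood around (x, y) with clamped coordinates,
    so accesses near the border cannot index past the pixel buffer."""
    return [
        pixels[clamp(y + dy, 0, h - 1) * w + clamp(x + dx, 0, w - 1)]
        for dy in range(-1, 3)
        for dx in range(-1, 3)
    ]
```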
Installing collected packages: onnx2torch
Successfully installed onnx2torch-1.5.15

```python
import onnx
from onnx2torch import convert

model_path = 'silero_vad.onnx'

def export_model(model_path):
    onnx_model = onnx.load(model_path)
    pytorch_model = convert(onnx_model)
```

Conversion failed due to the operation 'If' not...
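Before converting, it can help to list which op types the graph actually contains, to confirm the unsupported node. A small check using only the standard onnx API:

```python
import onnx

onnx_model = onnx.load('silero_vad.onnx')
# Every distinct op type in the graph; 'If' showing up here matches
# the error, since onnx2torch has no converter registered for it.
print(sorted({node.op_type for node in onnx_model.graph.node}))
```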
I'm also trying to use vLLM to support guided decoding for my requests. Currently its default guided-decoding backend is set to xgrammar (which I thought had better performance). However, in my...
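For reference, recent vLLM versions expose a per-request way to pick the guided backend via GuidedDecodingParams. A minimal offline sketch (the model name is a placeholder, and the `backend` field and exact parameter names may differ across vLLM releases):

```python
from vllm import LLM, SamplingParams
from vllm.sampling_params import GuidedDecodingParams

llm = LLM(model="Qwen/Qwen2.5-1.5B-Instruct")  # placeholder model

schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}},
    "required": ["name"],
}

params = SamplingParams(
    max_tokens=64,
    # Constrain output to the JSON schema; backend selects xgrammar explicitly.
    guided_decoding=GuidedDecodingParams(json=schema, backend="xgrammar"),
)

out = llm.generate(["Give me a JSON object describing a user:"], params)
print(out[0].outputs[0].text)
```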
As the report says it's more efficient than llama.cpp + grammar, I want to write some benchmarks in C++ to compare, but I couldn't find any examples of how to use xgrammar's C...
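xgrammar's Python bindings wrap the same C++ core, so a rough per-step cost can be prototyped from Python before writing a C++ harness. A sketch assuming the current Python API (`TokenizerInfo.from_huggingface`, `GrammarCompiler`, `GrammarMatcher`, `allocate_token_bitmask`); the model name is a placeholder, and note it re-fills the mask for the same matcher state, so it measures mask generation, not a full decode loop:

```python
import time
import xgrammar as xgr
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")  # placeholder
info = xgr.TokenizerInfo.from_huggingface(tokenizer)
compiler = xgr.GrammarCompiler(info)
compiled = compiler.compile_builtin_json_grammar()

matcher = xgr.GrammarMatcher(compiled)
bitmask = xgr.allocate_token_bitmask(1, info.vocab_size)

n = 1000
start = time.perf_counter()
for _ in range(n):
    matcher.fill_next_token_bitmask(bitmask)  # the hot per-token operation
elapsed = time.perf_counter() - start
print(f"avg fill_next_token_bitmask: {elapsed / n * 1e6:.1f} us")
```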
```
cat image_prompt.txt
\../tests/data/cat.png\描述图片内容
./build/llm_demo ~/Downloads/models/MiniCPM-V-4-MNN/config.json ./image_prompt.txt
The device supports: i8sdot:1, fp16:1, i8mm: 1, sve2: 0, sme2: 1
config path is /Users/lzhang/Downloads/models/MiniCPM-V-4-MNN/config.json
main, 258, cost time: 2333.435059 ms
Prepare for tuning...
```

(The prompt `描述图片内容` is Chinese for "describe the image content".)