aropb
### Description

LLamaSharp 0.23.0, CUDA, Windows.
LLM: Qwen3-8B-Q5_K_M.gguf

Errors:

```
llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'qwen3'
llama_model_load_from_file_impl: failed to load model
```

llama.cpp: https://github.com/ggml-org/llama.cpp/pull/12828
### Description

Hi,

Presenting the results of refactoring the StatelessExecutor code. This looks strange and suboptimal (the context can be many gigabytes):

```csharp
Context = _weights.CreateContext(_params, logger);
Context.Dispose();
```

public...
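The create-then-immediately-dispose pattern above suggests the context is only needed transiently. A minimal standalone sketch of the alternative, scoping the context to a single call so the large allocation never outlives it — `Context` and `Executor` here are stand-ins, not LLamaSharp's actual types:

```csharp
using System;

// "Context" stands in for the real (potentially multi-gigabyte) LLamaContext.
class Context : IDisposable
{
    public Context() => Console.WriteLine("context allocated");
    public void Dispose() => Console.WriteLine("context freed");
}

class Executor
{
    public string Infer(string prompt)
    {
        // Allocate the context only for the duration of this call;
        // it is disposed automatically when the method scope exits.
        using var ctx = new Context();
        return $"echo: {prompt}"; // real inference would happen here
    }
}

class Program
{
    static void Main() => Console.WriteLine(new Executor().Infer("hello"));
}
```

This keeps peak memory bounded by the duration of one inference call instead of tying it to the executor's lifetime.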
### Description

I run several requests (3-4) at the same time, which are executed sequentially by LLamaEmbedder.GetEmbeddings() and StatelessExecutor.InferAsync(). The models used for these calls are different. For Infer (one instance...
Models:

- https://huggingface.co/benxh/Qwen2.5-VL-7B-Instruct-GGUF
- https://huggingface.co/KBlueLeaf/llama3-llava-next-8b-gguf (from https://github.com/SciSharp/LLamaSharp/discussions/897)
- https://huggingface.co/second-state/Llava-v1.5-7B-GGUF

Error: External component has thrown an exception.

```
System.Runtime.InteropServices.SEHException (0x80004005): External component has thrown an exception.
   at LLama.Native.SafeLlavaModelHandle.clip_model_load(String mmProj, Int32 verbosity)
   at LLama.Native.SafeLlavaModelHandle.LoadFromFile(String modelPath,...
```
### Context / Scenario

This is where the NormalizeNewlines() call is needed:

- https://github.com/microsoft/kernel-memory/blob/508ac0b1236a6f4700c093909282e8c83a385c97/service/Core/DataFormats/Text/TextDecoder.cs#L46
- https://github.com/microsoft/kernel-memory/blob/508ac0b1236a6f4700c093909282e8c83a385c97/service/Core/DataFormats/Text/TextDecoder.cs#L60
- https://github.com/microsoft/kernel-memory/blob/508ac0b1236a6f4700c093909282e8c83a385c97/service/Core/DataFormats/Text/MarkDownDecoder.cs#L43
- https://github.com/microsoft/kernel-memory/blob/508ac0b1236a6f4700c093909282e8c83a385c97/service/Core/DataFormats/Text/MarkDownDecoder.cs#L57
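For context, newline normalization conventionally means collapsing Windows (`\r\n`) and bare-`\r` line endings to `\n` so downstream chunking sees consistent text. A hedged sketch of such a helper — this is an illustration of the idea, not kernel-memory's actual NormalizeNewlines() implementation:

```csharp
using System;

static class TextNormalizer
{
    // Order matters: replace "\r\n" first so it does not become "\n\n".
    public static string NormalizeNewlines(string text) =>
        text.Replace("\r\n", "\n").Replace("\r", "\n");
}

class Program
{
    static void Main()
    {
        // Mixed Windows and old-Mac endings normalize to plain "\n".
        Console.WriteLine(TextNormalizer.NormalizeNewlines("a\r\nb\rc") == "a\nb\nc"); // True
    }
}
```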
There is no need to remove spaces here; they are needed if they remain after the decoder.

https://github.com/microsoft/kernel-memory/blob/1c424ed35738342ea2b3f8cae4091cb649071c49/service/Core/Handlers/TextExtractionHandler.cs#L220

It can be a PDF where the sentence goes to the next...
### Context / Scenario

It looks like this is a Cyrillic problem.

MaxTokensPerParagraph=1000
OverlappingTokens=200

I check all the text after the decoder, and everything is fine, for example (xlsx, the...
### Description

Reranker error:

```
decode: cannot decode batches with this context (calling encode() instead)
D:\a\LLamaSharp\LLamaSharp\ggml\src\ggml-cpu\ops.cpp:5116: GGML_ASSERT(i01 >= 0 && i01 < ne01) failed
```

Model: jina-reranker-v2-base-multilingual-FP16.gguf

Now, when this error occurs,...
### Description

Please keep in mind that embeddings for chunks are created many times for a single document. You wrote the code so that the context is constantly being recreated....
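The point above is the cost pattern: one document produces many chunks, so a per-chunk context recreation multiplies a large allocation. A standalone sketch of the reuse the report is asking for — `EmbeddingContext` is a stand-in for the real LLama context, and the creation counter only exists to make the pattern observable:

```csharp
using System;
using System.Collections.Generic;

class EmbeddingContext : IDisposable
{
    public static int Creations; // counts allocations, for demonstration only

    public EmbeddingContext() => Creations++;

    // Toy "embedding": the real one would run the model.
    public float[] Embed(string chunk) => new float[] { chunk.Length };

    public void Dispose() { }
}

class Program
{
    static void Main()
    {
        var chunks = new[] { "chunk one", "chunk two", "chunk three" };
        var results = new List<float[]>();

        // One context for the whole document, reused for every chunk.
        using (var ctx = new EmbeddingContext())
        {
            foreach (var chunk in chunks)
                results.Add(ctx.Embed(chunk));
        }

        Console.WriteLine(EmbeddingContext.Creations); // 1, not chunks.Length
    }
}
```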
The tokens list cannot be changed; it is the input list of tokens!

Error line:
https://github.com/SciSharp/LLamaSharp/blob/9fe066dffa8aeb53d47bdbd9d8d6e97988055b14/LLama/LLamaStatelessExecutor.cs#L137

llama.cpp source:

- https://github.com/ggml-org/llama.cpp/blob/60325fa56f61c228464c9f065db3aa6a61f2156e/examples/main/main.cpp#L334
- https://github.com/ggml-org/llama.cpp/blob/60325fa56f61c228464c9f065db3aa6a61f2156e/examples/main/main.cpp#L542

I'm trying to fix and rewrite it correctly, but so...
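The general fix for "executor mutates its input token list" is to copy the list before appending and to accept the input as read-only so the compiler enforces it. A minimal sketch under that assumption — `RunStep` and its signature are hypothetical, not LLamaSharp's actual code:

```csharp
using System;
using System.Collections.Generic;

class Program
{
    // Accepting IReadOnlyList<int> guarantees the caller's list is not mutated;
    // the sampled token is appended to a private copy instead.
    public static IReadOnlyList<int> RunStep(IReadOnlyList<int> inputTokens, int sampled)
    {
        var working = new List<int>(inputTokens) { sampled }; // copy, then append
        return working;
    }

    static void Main()
    {
        var input = new List<int> { 1, 2, 3 };
        var output = RunStep(input, 42);

        Console.WriteLine(input.Count);  // still 3: input was left untouched
        Console.WriteLine(output.Count); // 4
    }
}
```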