GNU Support

Results 69 comments of GNU Support

```
(TTS) lco@lco2:~/Programming/LLM/DeepSeek$ python Janus-Pro-1B.py
Traceback (most recent call last):
  File "/home/data1/protected/TTS/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1073, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
                   ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/data1/protected/TTS/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 775, in __getitem__
    raise KeyError(key)
KeyError: 'multi_modality'
...
```
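The `KeyError: 'multi_modality'` means the installed transformers release has no entry for that model type in its AutoConfig registry. The mechanism can be sketched in plain Python; the mapping and names below are illustrative stand-ins, not the real transformers internals:

```python
# Minimal sketch of how an AutoConfig-style registry raises KeyError for a
# model_type it does not know about. Names are illustrative, not real internals.

CONFIG_MAPPING = {
    "llama": "LlamaConfig",
    "mistral": "MistralConfig",
}

def config_for(model_type: str) -> str:
    # Mirrors CONFIG_MAPPING[config_dict["model_type"]] in the traceback:
    # an unregistered type raises KeyError, just as 'multi_modality' does.
    return CONFIG_MAPPING[model_type]

try:
    config_for("multi_modality")
except KeyError as exc:
    print(f"KeyError: {exc}")

# The usual fixes are upgrading transformers to a version that registers the
# model type, or loading with trust_remote_code=True so the repository's own
# config class gets registered at load time (analogous to this manual insert):
CONFIG_MAPPING["multi_modality"] = "MultiModalityConfig"
print(config_for("multi_modality"))
```

This is only a sketch of the failure mode; the actual fix depends on which transformers version ships support for the Janus model family.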

I would like to serve the loaded model on an endpoint of some kind; what is the recommended way to do that?
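For illustration, here is a minimal sketch of putting a loaded model behind a local HTTP endpoint using only the standard library; the `generate` function is a hypothetical stub standing in for whatever model is actually loaded:

```python
# Minimal sketch: expose a loaded model behind a local HTTP endpoint.
# Standard library only; generate() is a hypothetical stub for the model call.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def generate(prompt: str) -> str:
    # Placeholder for the real model inference call.
    return f"echo: {prompt}"

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        reply = generate(payload.get("prompt", ""))
        body = json.dumps({"reply": reply}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To serve, uncomment:
# HTTPServer(("127.0.0.1", 8080), Handler).serve_forever()
```

In practice a dedicated serving stack (e.g. an OpenAI-compatible server) is more robust, but the shape is the same: accept a prompt over HTTP, call the model, return the completion.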

```
lco@rtx:~/Programming/git/fast-llama$ bash ./build.sh
g++ -o ./main ./src/utils/console.cpp ./src/utils/utility.cpp ./src/utils/ftdebug.cpp \
    ./src/platforms/arch/x86_simd.cpp ./src/platforms/arch/arm_simd.cpp ./src/main.cpp \
    ./src/blas/tf_operators.cpp ./src/blas/quant_operators.cpp ./src/transformer/tokenizer.cpp \
    ./src/transformer/sampler.cpp ./src/transformer/transformer.cpp \
    ./src/model_loaders/llama2c_loader.cpp ./src/model_loaders/flm_loader.cpp \
    ./src/model_loaders/gguf_loader.cpp ./src/model_loaders/model_loader.cpp \
    ./src/components/tensor.cpp \
    -std=c++20 -mavx2 -D_GNU_SOURCE -DDISABLE_NUMA -Wall -lpthread -lm
...
```

I think I have the same issue. I can hear weird-sounding, overly long files, but at first it was working well.

I see these errors when running with `--cuda`:

```
2024-12-21 22:03:25.924069708 [W:onnxruntime:, transformer_memcpy.cc:74 ApplyImpl] 28 Memcpy nodes are added to the graph main_graph for CUDAExecutionProvider. It might have negative impact...
```

Thanks.

### For my configuration below

> CPU: Intel(R) Core(TM) i5-4430S (4) @ 2.70 GHz
> GPU: NVIDIA GeForce GTX 1050 Ti [Discrete]

I confirm that `piper` now uses...

I cannot install TTS in a venv on Debian Bookworm; the installation fails.

I think IBM fine-tuned a Mistral model with the Granite template, so it works ~~good~~ a little better with

```sh
llama-server --chat-template granite
```

but maybe llama.cpp should recognize the template automatically?

I would also like to disable thinking in Qwen3.
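As far as I understand, Qwen3 offers a "soft switch" for this: appending `/no_think` to a user message disables the `<think>…</think>` block for that turn (and, when using transformers, `tokenizer.apply_chat_template(..., enable_thinking=False)` disables it for the whole conversation). A minimal sketch of building such a message, without pulling in transformers:

```python
# Sketch of Qwen3's per-turn "soft switch" for disabling thinking:
# appending /no_think to a user message asks the model to skip the
# <think>...</think> block; /think re-enables it on a later turn.

def user_message(text: str, thinking: bool = True) -> dict:
    """Build a chat message, optionally disabling Qwen3 thinking for this turn."""
    if not thinking:
        text = f"{text} /no_think"
    return {"role": "user", "content": text}

msg = user_message("Summarise this log file.", thinking=False)
print(msg["content"])  # ends with "/no_think"
```

Whether llama-server honours the switch depends on the chat template in the GGUF, so this should be verified against the specific model build.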