
Add support for llama.cpp

Open SumDevv opened this issue 1 year ago • 6 comments

Add support for llama.cpp for local AI inference.

SumDevv avatar Apr 10 '23 04:04 SumDevv

#134 added that, but we haven't released it yet because I haven't been able to test it. Do you think you could test it using the dev branch?
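If anyone else wants to help test before the release, a minimal sketch of installing from the dev branch with pip; the repository URL and branch name here are assumptions, so check the project's GitHub page first:

```shell
# Assumed repo URL and branch name; adjust to the project's actual GitHub home.
pip install git+https://github.com/logspace-ai/langflow.git@dev

# Then start the server as usual:
langflow
```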

ogabrielluiz avatar Apr 10 '23 14:04 ogabrielluiz

Langflow stays stuck on 'thinking' even after 5 minutes with the latest 0.56 build. Also, I don't know why it unsuccessfully runs llama.cpp two times and then gets stuck on the third time (see the log below).

siddhesh@desktop:~/Desktop$ langflow
[16:39:52] INFO [16:39:52] - INFO - Logger set up with log level: 20(info)  logger.py:28
           INFO [16:39:52] - INFO - Log file: logs/langflow.log             logger.py:30
[2023-04-14 16:39:52 +0530] [12703] [INFO] Starting gunicorn 20.1.0
[2023-04-14 16:39:52 +0530] [12703] [INFO] Listening at: http://127.0.0.1:7860 (12703)
[2023-04-14 16:39:52 +0530] [12703] [INFO] Using worker: uvicorn.workers.UvicornWorker
[2023-04-14 16:39:52 +0530] [12715] [INFO] Booting worker with pid: 12715
[2023-04-14 16:39:52 +0530] [12715] [INFO] Started server process [12715]
[2023-04-14 16:39:52 +0530] [12715] [INFO] Waiting for application startup.
[2023-04-14 16:39:52 +0530] [12715] [INFO] Application startup complete.
llama_model_load: loading model from '/home/siddhesh/Desktop/vicuna.bin' - please wait ...
llama_model_load: n_vocab = 32001
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 5120
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 40
llama_model_load: n_layer = 40
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 13824
llama_model_load: n_parts = 2
llama_model_load: type    = 2
llama_model_load: ggml map size = 7759.84 MB
llama_model_load: ggml ctx size = 101.25 KB
llama_model_load: mem required  = 9807.93 MB (+ 3216.00 MB per state)
llama_model_load: loading tensors from '/home/siddhesh/Desktop/vicuna.bin'
llama_model_load: model size = 7759.40 MB / num tensors = 363
llama_init_from_file: kv self size = 800.00 MB
AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |

[the same llama_model_load output repeats verbatim twice more]

lolxdmainkaisemaanlu avatar Apr 14 '23 11:04 lolxdmainkaisemaanlu

Mine behaves the same way, but it isn't stuck; it just takes that long to execute for me.

nsvrana avatar Apr 14 '23 15:04 nsvrana

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar May 29 '23 19:05 stale[bot]

Which model do I use for the LlamaCpp LLM? I have tried several. Where is the documentation for using Langflow?

TaoAthe avatar May 30 '23 01:05 TaoAthe

Could you try what I mentioned in #233? It works here. We've also released a new version that might help with this.
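For context on what the LlamaCpp component does: as far as I understand, it is a thin wrapper over LangChain's LlamaCpp LLM, which expects a local ggml-quantized model file. Below is a minimal sketch of the equivalent direct call, useful for checking that a given model file works at all; the model path and parameter values are placeholders, not Langflow defaults:

```python
# Minimal sketch: load a local ggml model the way Langflow's LlamaCpp
# component does under the hood (via LangChain). The path is a placeholder.
from langchain.llms import LlamaCpp

llm = LlamaCpp(
    model_path="/path/to/ggml-vicuna-q4_0.bin",  # placeholder path
    n_ctx=512,        # context window size
    temperature=0.7,
)

print(llm("Q: Name one use of a local LLM.\nA:"))
```

If this answers but Langflow doesn't, the problem is likely in the flow; if it fails here too, the model file or the llama-cpp-python install is the more likely culprit.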

ogabrielluiz avatar May 30 '23 21:05 ogabrielluiz

Sorry, I have been away trying to find a better GPU. I will try the latest version. Thank you for responding.

TaoAthe avatar Jun 02 '23 23:06 TaoAthe

Has anyone figured out how to run Llama with Langflow? I have tried many approaches and I am still struggling. I have a llama-2-13b model that I converted, built, and quantized with llama.cpp, and it runs well in llama.cpp itself (ggml-model-q4_0.gguf; I also tried ggml-vic7b-q4_0.bin). I created a models directory in the project root and tried both the LlamaCpp and CTransformers components, but I never got a response from the LLM. Can someone please help me?
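One way to narrow this down is to load the same file with llama-cpp-python directly, outside Langflow; if this works but the flow stays silent, the issue is in the component wiring rather than the model. Note that .gguf files require a llama-cpp-python build recent enough to support the GGUF format, while older builds only read the legacy ggml .bin files. A hedged sketch, with the model path as a placeholder:

```python
# Sanity check outside Langflow: load the quantized model directly with
# llama-cpp-python. The path below is a placeholder for your local file.
from llama_cpp import Llama

llm = Llama(model_path="models/ggml-model-q4_0.gguf", n_ctx=512)

out = llm("Q: What is 2 + 2?\nA:", max_tokens=16)
print(out["choices"][0]["text"])
```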

berradakamal avatar Sep 15 '23 10:09 berradakamal