graphrag: Which LLM models are supported?

Are other LLM models supported, such as ChatGLM and Qwen?

Jul 03 '24 09:07 yyyhainan

I have the same question. Where can I see the main source code?

Jul 03 '24 10:07 Lbaiall

+1

Jul 03 '24 11:07 dinobot22

+1

Jul 03 '24 11:07 andysingal

Also, can we use locally deployed LLMs instead of going through API keys?

Jul 04 '24 07:07 young169

same question

Jul 04 '24 08:07 zzk2021

+1

Jul 04 '24 12:07 gallypette

Hi! During our research we got the best quality out of gpt-4, gpt-4-turbo and gpt-4o, which is why we include support for these out of the box in both OpenAI and Azure environments.

Regarding local hosting, there's a very interesting conversation going on in thread #339.

Jul 04 '24 21:07 AlonsoGuevara

I have tested gemma2 and llama3 with success. The only thing that does not work locally is the embeddings. There needs to be a fix to accept the style of response coming from ollama when querying embeddings... Once that is fixed you will be able to run this 100% locally on a personal computer, but you will probably need an NVIDIA GPU with 24GB of VRAM, like a 3090, or an Mx Mac with 32GB of RAM.
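To illustrate the kind of mismatch described here, the following is a minimal sketch, assuming the problem is the response shape: Ollama's native embeddings endpoint returns a bare `{"embedding": [...]}` object, while OpenAI-style clients expect the vector nested under `data[0].embedding`. The function name `ollama_to_openai_embedding`, the sample payload, and the model name are hypothetical; this is not GraphRAG's actual fix.

```python
from typing import Any


def ollama_to_openai_embedding(ollama_response: dict[str, Any], model: str) -> dict[str, Any]:
    """Wrap an Ollama-style {"embedding": [...]} payload in an OpenAI-style envelope.

    Hypothetical adapter for illustration only; not part of GraphRAG or Ollama.
    """
    vector = ollama_response["embedding"]
    return {
        "object": "list",
        "data": [{"object": "embedding", "index": 0, "embedding": vector}],
        "model": model,
    }


# Made-up payload, shaped like a reply from Ollama's native embeddings endpoint.
sample = {"embedding": [0.1, 0.2, 0.3]}
converted = ollama_to_openai_embedding(sample, model="example-embedding-model")

# An OpenAI-style client would read the vector from data[0].embedding.
print(converted["data"][0]["embedding"])
```

A small proxy applying this kind of conversion in front of the local server is one possible workaround until the client accepts the other format directly.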

Jul 04 '24 23:07 bmaltais

Can we use local embeddings?

Jul 05 '24 04:07 zzk2021

Can you help me run llama 3 locally, please?

Jul 05 '24 17:07 vamshi-rvk

Here's my .env file. Put it under the ./ragtest dir; hope this helps:

```
GRAPHRAG_LLM_API_KEY=DEFAULTS
GRAPHRAG_LLM_TYPE=openai_chat
GRAPHRAG_LLM_API_BASE=http://127.0.0.1:5081/v1
GRAPHRAG_LLM_MODEL=Hermes-2-Pro-Llama-3-Instruct-Merged-DPO
GRAPHRAG_LLM_REQUEST_TIMEOUT=700
GRAPHRAG_LLM_MODEL_SUPPORTS_JSON=True
GRAPHRAG_LLM_THREAD_COUNT=16
GRAPHRAG_LLM_CONCURRENT_REQUESTS=16
GRAPHRAG_EMBEDDING_TYPE=openai_embedding
GRAPHRAG_EMBEDDING_API_BASE=http://127.0.0.1:9997/v1
GRAPHRAG_EMBEDDING_MODEL=bce-embedding-base_v1
GRAPHRAG_EMBEDDING_BATCH_SIZE=64
GRAPHRAG_EMBEDDING_BATCH_MAX_TOKENS=512
GRAPHRAG_EMBEDDING_THREAD_COUNT=16
GRAPHRAG_EMBEDDING_CONCURRENT_REQUESTS=16
GRAPHRAG_INPUT_FILE_PATTERN=".*.txt$"
```
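For anyone adapting the file above, here is a rough sketch of how to sanity-check such settings before indexing. It parses `KEY=VALUE` lines and derives the URLs that an OpenAI-compatible client would typically call by appending `/chat/completions` and `/embeddings` to the base URL. The helpers `parse_env` and `endpoint_url` are hypothetical and are not how GraphRAG itself loads configuration; only a few of the variables are repeated here.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments (illustrative helper)."""
    settings: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding double quotes, as used for the file pattern value.
        settings[key.strip()] = value.strip().strip('"')
    return settings


def endpoint_url(api_base: str, route: str) -> str:
    """Join an OpenAI-style base URL with a route such as 'chat/completions'."""
    return api_base.rstrip("/") + "/" + route.lstrip("/")


# A subset of the settings from the .env file in this thread.
example_env = """
GRAPHRAG_LLM_TYPE=openai_chat
GRAPHRAG_LLM_API_BASE=http://127.0.0.1:5081/v1
GRAPHRAG_EMBEDDING_TYPE=openai_embedding
GRAPHRAG_EMBEDDING_API_BASE=http://127.0.0.1:9997/v1
"""

settings = parse_env(example_env)
chat_url = endpoint_url(settings["GRAPHRAG_LLM_API_BASE"], "chat/completions")
embedding_url = endpoint_url(settings["GRAPHRAG_EMBEDDING_API_BASE"], "embeddings")
print(chat_url)
print(embedding_url)
```

If the local servers do not answer on these two URLs, the indexing run is unlikely to work either, so checking them first with a plain HTTP client can save time.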