
how to connect to llama-2-7b.ggmlv3.q4_K_S.bin

Open deepuak opened this issue 1 year ago • 4 comments

Hi, I am trying to connect to llama-2-7b.ggmlv3.q4_K_S.bin from DSPy code, and I am not able to find the right way to do this. Can someone please explain, or point me to a document on how to do this? For what it's worth, I am able to connect using OpenAI.

Regards, Deepak.

deepuak avatar Feb 23 '24 15:02 deepuak

You might find it helpful to check out the vLLM or Ollama communities for this kind of assistance. I'd suggest giving Ollama a try. It works really well with DSPy, like this:

```python
# Here are a few models you could use:
# 'codellama-70b-code-q4_k_m-16k:latest'
# 'mistral:7b-instruct-v0.2-q4_K_M'
# 'deepseek-coder-33b-instruct-q4_k_m-16k:latest'
# 'sqlcoder:70b-alpha-q4_K_M'
# 'qwen1_5-7b-chat-q5_k_m:latest'
# '8k-gemma-7b-instruct-q4_k_m:latest'
# 'fixed_qwen1_5-14b-chat-q5_k_m:latest'

backend_llm = dspy.OllamaLocal(
    base_url='http://10.7.0.13:11434',  # Where the Ollama server is listening.
    model='fixed_qwen1_5-14b-chat-q5_k_m:latest',  # The model to use.
    num_ctx=16000,  # How much context the model can consider.
    temperature=0.6,  # How creative the model's responses are.
    timeout_s=180,  # How long to wait for a response before giving up.
    max_tokens=2048,  # The maximum length of the model's responses.
    stop=['---', 'Explanation:', '<|im_start|>', '<|im_end|>']  # Strings that signal the model to stop generating.
)
```

lzjever avatar Feb 23 '24 17:02 lzjever

Thank you very much! I am able to run with OllamaLocal. However, I am facing a `TimeoutError: timed out`. Here is the detailed error: `urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='localhost', port=11434): Read timed out. (read timeout=180)`. I am using llama2, and I run the model from Windows PowerShell with `ollama run llama2`. Is there a configuration setting that I need to change, or am I missing one? Note that this error only happens when I read a CSV file and pass its rows one by one; if I pass only a single article as a query to DSPy, it works.
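One workaround for intermittent per-row timeouts when looping over a CSV is to wrap each model call in a retry with backoff. This is a generic stdlib-only sketch, not a DSPy feature; the `flaky` function below is a stand-in for whatever callable actually invokes your LM:

```python
import time

def with_retries(fn, attempts=3, backoff_s=5.0):
    """Call fn(); on timeout-like errors, sleep and retry with linear backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except (TimeoutError, OSError):
            if attempt == attempts:
                raise  # Out of attempts: let the caller see the error.
            time.sleep(backoff_s * attempt)

# Demo with a stand-in that times out twice, then succeeds:
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("timed out")
    return "answer"

result = with_retries(flaky, attempts=3, backoff_s=0.0)
```

Raising `timeout_s` in the `OllamaLocal` constructor (it is 180 in the snippet above, matching the `read timeout=180` in the traceback) may also help if individual generations are simply slow.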

deepuak avatar Feb 26 '24 07:02 deepuak

How can I use an open-source Llama model with DSPy? Please let me know. @lzjever

anushka192001 avatar Apr 27 '24 19:04 anushka192001

I'm hitting the same error, but you can try working with this:

```python
backend_llm = dspy.OpenAI(
    api_base='http://10.7.0.13:11434/v1/',  # Ollama's OpenAI-compatible endpoint.
    api_key='ollama',
    model='fixed_qwen1_5-14b-chat-q5_k_m:latest',
    stop='\n\n',
    model_type='chat'
)
```

ArgiSanchez avatar May 10 '24 18:05 ArgiSanchez