Difficulty getting started!
LMQL looks very promising (I've played with Guidance), so I want to make this work, but I'm running into issues from the get-go trying to run it locally. I'm really hoping I can get some help.
IMMEDIATE GOAL: what is the simplest way to make this work?
Context: I have several gguf models on my machine that I want to run on my MacBook Pro (pre-M, Intel), i.e. on CPU. I've run them many times before from Python code, though slowly.
I want to:
1. Run the model directly in Python code.
2a. Run the model by exposing it via an API, e.g. localhost:8081 (rough sketch below).
2b. (Can't on my Mac, but can on a PC) run the gguf via LM Studio, expose ip:port on the PC, and have Python code on the Mac tap into it.
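For 2a, my rough plan, based on skimming the serving docs, is something like the following. Big caveat: the --port flag, the default endpoint behaviour, and the endpoint= argument are my assumptions from the docs, not something I've verified, and I have no idea whether the same trick can point at LM Studio's OpenAI-style server for 2b.

# on the machine that hosts the weights, serve the gguf over HTTP
# (flag name is my guess from the docs):
#   lmql serve-model llama.cpp:/Users/mchung/Desktop/proj-ai/models/codeqwen-1_5-7b-chat-q8_0.gguf --port 8081

import lmql

# without the "local:" prefix LMQL should, as I understand it, talk to a
# served endpoint instead of loading the weights in-process
remote_model = lmql.model(
    "llama.cpp:/Users/mchung/Desktop/proj-ai/models/codeqwen-1_5-7b-chat-q8_0.gguf",
    endpoint="localhost:8081",  # or "<pc-ip>:8081" if the server runs on another box
)
# then pass model=remote_model to @lmql.query(...) the same way as in the code below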
Code (for goal 1, running directly in Python):
import lmql

model_path = "/Users/mchung/Desktop/proj-ai/models/"
# model = "wizardcoder-python-13b-v1.0.Q4_K_S.gguf"
model = "codeqwen-1_5-7b-chat-q8_0.gguf"
# model = "mistral-7b-instruct-v0.2.Q5_K_M.gguf"

m = f"local:llama.cpp:{model_path + model}"
print(m)

@lmql.query(model=lmql.model(m, verbose=True))
def query_function():
    '''lmql
    """A great dad joke. A indicates the punchline
    Q:[JOKE]
    A:[PUNCHLINE]""" where STOPS_AT(JOKE, "?") and \
        STOPS_AT(PUNCHLINE, "\n")
    '''
    return "What's the best way to learn Python?"

response = query_function()
print(response)
Thanks in advance.
For now, I'm getting this error:

raise TokenizerNotAvailableError("Failed to locate a suitable tokenizer implementation for '{}' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)".format(model_identifier))
lmql.runtime.tokenizer.TokenizerNotAvailableError: Failed to locate a suitable tokenizer implementation for 'huggyllama/llama-7b' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)
Hello @mcchung52, I had similar issues. Please check this issue: #350
Hi, I had the same error when trying to run a model with the llama.cpp loader. It is not really clear from the documentation, but to run a model with llama.cpp you have to install the package lmql[hf] (instead of plain lmql), plus llama-cpp-python to provide the inference backend.
Does that help?
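Concretely, something like the sketch below worked for me. If I remember the docs correctly, the tokenizer= argument of lmql.model is what stops LMQL from falling back to the default 'huggyllama/llama-7b' tokenizer you see in the error; the exact Hugging Face repo to use for your CodeQwen gguf is my guess, so double-check it.

# install the HF extras plus the llama.cpp inference backend
#   pip install "lmql[hf]" llama-cpp-python

import lmql

# explicitly name a Hugging Face tokenizer that matches the gguf;
# "Qwen/CodeQwen1.5-7B-Chat" is my guess for codeqwen-1_5-7b-chat-q8_0.gguf
m = lmql.model(
    "local:llama.cpp:/Users/mchung/Desktop/proj-ai/models/codeqwen-1_5-7b-chat-q8_0.gguf",
    tokenizer="Qwen/CodeQwen1.5-7B-Chat",
    verbose=True,
)

@lmql.query(model=m)
def query_function():
    '''lmql
    """A great dad joke. A indicates the punchline
    Q:[JOKE]
    A:[PUNCHLINE]""" where STOPS_AT(JOKE, "?") and \
        STOPS_AT(PUNCHLINE, "\n")
    '''

print(query_function())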