Chi Kim

Results 161 comments of Chi Kim

Here are three different commands and their outputs: 1. without --model or --model-dir, 2. with --model, 3. with --model-dir. !python server.py --notebook --chat --share --wbits 4 --model_type LLaMA --groupsize 128...

Also, here are the outputs for: 1. --model llama7b 2. --model llama7b-4bit-128g-4bit.pt !python server.py --notebook --chat --share --wbits 4 --model_type LLaMA --groupsize 128 --model llama7b ``` 2023-04-06 12:22:11.523412: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT...

I tried both commands, and I got the following output. The first command gave me the same error, but the second one gave me an error that I haven't seen...

Can you guys mark Llava 1.6 as partial support? It's not fully supported in llama.cpp. People assume it's fully supported, and it's not there yet. https://github.com/ggerganov/llama.cpp/pull/5267 The...

https://github.com/ggerganov/llama.cpp/pull/5267 only applies to llava-cli, which Ollama doesn't use. The fix for the server is in https://github.com/ggerganov/llama.cpp/pull/5553, and it's merged. I'm not sure if Ollama uses the update that includes the...

A couple of weeks ago, I opened #2795 asking to incorporate https://github.com/ggerganov/llama.cpp/pull/5553, but it looks like it hasn't been done yet. At least there's no further activity on that...

https://github.com/ggerganov/llama.cpp/pull/5267 alone doesn't work for Ollama. Ollama also needs https://github.com/ggerganov/llama.cpp/pull/5553 and https://github.com/ggerganov/llama.cpp/pull/5896 for everything to work properly.

Thanks @dosu. How do you use Ollama as the llm for QASummaryQueryEngineBuilder? It wants to use OpenAI instead even though I specified the llm parameter. ```python llm = Ollama(model="llama3", request_timeout=600, base_url=host, additional_kwargs=options)...

Thanks, that's extremely helpful! What about DocumentSummaryIndex? How would I make my own loop and create the index?

```python
response_synthesizer = get_response_synthesizer(response_mode="tree_summarize", use_async=True)
index = DocumentSummaryIndex.from_documents(documents, response_synthesizer=response_synthesizer)
```

Thanks so much!