Igor Schlumberger
When you use an LLM for an experiment, you can run it with Docker and pin both the Ollama version and the LLM version so the experiment can be reproduced. Be...
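A minimal sketch of what I mean, assuming the ollama/ollama Docker image; the version tags below are only examples, the point is to pin both the image tag and the model tag instead of relying on "latest":

```sh
# Pin the Ollama version via the Docker image tag (0.1.36 is just an example).
docker run -d --name ollama -v ollama:/root/.ollama -p 11434:11434 ollama/ollama:0.1.36

# Pin the model version by pulling an explicit tag rather than the default one.
docker exec -it ollama ollama pull mixtral:8x22b-instruct-v0.1-q4_0
```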
Hi @joliss I have a Mac Station M1 with 192 GB of RAM and can launch the ollama run mixtral:8x22b-instruct-v0.1-q4_0 'Hi!' command without any trouble. I'm using version 0.1.36 of...
Hi @Elminsst, it is working on macOS. Can you try with version 0.1.37?
@lyfuci Ollama only supports open-source models. Before a model is added to Ollama, it is often already available on Hugging Face.
Hi @frankhart2018 I can do that on macOS: I open two Terminal windows and type ollama pull llama3 in the first one and ollama pull llama2 in the second one....
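If you prefer a single window, a small sketch of the same thing from one shell (assuming ollama is on the PATH):

```sh
# Start both downloads in parallel, then wait for them to finish.
ollama pull llama3 &
ollama pull llama2 &
wait
```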
@frankhart2018 do you want to increase or decrease the number of threads? For what purpose?
Hi @lestan SQLcoder2 seems to be a valid model. It's bigger than SQLCoder (9 GB instead of 4.1 GB). It could be a copy of sqlcoder:15b, which has the same size....
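A rough way to check that locally, assuming both tags have been pulled: compare the reported sizes and the Modelfiles each tag resolves to.

```sh
ollama list | grep -i sqlcoder          # compare the sizes Ollama reports
ollama show sqlcoder2 --modelfile       # inspect what each tag is built from
ollama show sqlcoder:15b --modelfile
```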
For me it's working without specifying json in the prompt. (I have 32 GB on an M1 Pro MacBook)
(base) igor@macIgor ~ % ollama run llama2 --format json
>>> give me 10 emojis...
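For comparison, a minimal sketch of the same request through the REST API (assuming the default endpoint on localhost:11434); the format field is set at the request level rather than in the prompt:

```sh
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "give me 10 emojis",
  "format": "json",
  "stream": false
}'
```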
I have the same issue with autocomplete on macOS (Safari). Mac tries to paste the +33 666600000 phone number, but instead of recognizing the +33 area code, the +66 area...
I purchased a 192 GB RAM MacStation to run Ollama with various LLMs. I concur that it would be advantageous to keep them resident in memory if possible.