
every model seems to die early

Open slithernix opened this issue 4 months ago • 0 comments

No matter which model I use or what prompt I give it, the response is cut off partway through. The behavior is the same with --no-stream. Example:

⮕  llm prompt --no-stream -m wiz 'write me three paragraphs about something interesting'
Sure, here are three paragraphs about the Great Barrier Reef in Australia. The Great Barrier Reef is one of the most fascinating natural wonders on Earth and a UNESCO World Heritage Site. It's located off the coast of Queensland, Australia, and it's the largest coral reef system in the world, spanning over 2,300 kilometers. The Great Barrier Reef is home to an incredible array of marine life, including more than 1,500 species of fish, 600 types of coral, and countless other creatures like sea turtles, dolphins, whales, sharks, and rays.

The Great Barrier Reef is not just a beautiful sight to behold; it's also an important ecosystem that supports the livelihoods of many people in Australia. It attracts millions of tourists each year who come

I measured the run time and the byte count of the output. Here are a few measurements:

  time bash -c "source ~/python-environments/llm/bin/activate && llm prompt --no-stream -m wiz 'write me 3 paragraphs about something interesting' | wc --bytes"
840

real    0m41.316s
user    2m41.475s
sys     0m4.073s
[snake@einstein][][:~/python-environments/llm][][41.3s][0]
⮕  time bash -c "source ~/python-environments/llm/bin/activate && llm prompt --no-stream -m wiz 'write me 9 paragraphs about something interesting' | wc --bytes"
910

real    0m40.816s
user    2m39.069s
sys     0m4.053s
[snake@einstein][][:~/python-environments/llm][][40.8s][0]
⮕  time bash -c "source ~/python-environments/llm/bin/activate && llm prompt --no-stream -m starcoder 'write me a go CLI that demonstrates the use of goroutines and channels. make it have several arguments such as number of goroutines and some other tweaks.' | wc --bytes"
698

real    0m46.489s
user    3m3.234s
sys     0m3.836s
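
The byte counts above are close to each other even though the prompts ask for very different amounts of text. As a rough sanity check, here is a minimal sketch that converts them to approximate token counts. The figure of 4 bytes per token is a common rule of thumb for English text and is an assumption here, not something reported by llm or by either model; `approx_tokens` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Assumption: English text averages roughly 4 bytes per token.
# This is a rule of thumb, not a property of any specific model.
BYTES_PER_TOKEN=4

# approx_tokens BYTES: print an integer estimate of the token count
# (integer division, so the result is rounded down).
approx_tokens() {
  echo $(( $1 / BYTES_PER_TOKEN ))
}

# Byte counts copied from the three runs above.
for bytes in 840 910 698; do
  echo "$bytes bytes is about $(approx_tokens "$bytes") tokens"
done
```

Under that assumption the three runs come out at roughly 210, 227, and 174 tokens. Numbers that close together would be consistent with some fixed limit on output length, though the measurements alone do not confirm one.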

The same thing happens if I use 'chat' instead of 'prompt'. This is on a machine with 64GB of RAM and an RTX A1000 with 12GB of VRAM. Am I doing something wrong here?

slithernix avatar Feb 13 '24 04:02 slithernix