
Inference code for Llama models

Results: 412 llama issues, sorted by recently updated

Is there a way to interrupt a generation programmatically? (not pressing Ctrl+C)
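The generator in this repo doesn't expose a stop hook, so one workaround is to generate in small chunks and check a flag between chunks. This is a sketch only: `generate_chunk` is a hypothetical wrapper around whatever `generate()` call you are actually using, not part of the library.

```python
import threading
from typing import Callable, Optional

def interruptible_generate(
    generate_chunk: Callable[[str, int], str],  # hypothetical: returns prompt + n new tokens
    prompt: str,
    max_new_tokens: int = 512,
    chunk_size: int = 32,
    stop_event: Optional[threading.Event] = None,
) -> str:
    """Generate up to max_new_tokens, stopping early once stop_event is set."""
    if stop_event is None:
        stop_event = threading.Event()
    text = prompt
    produced = 0
    while produced < max_new_tokens and not stop_event.is_set():
        step = min(chunk_size, max_new_tokens - produced)
        text = generate_chunk(text, step)  # feed the running text back in as the next prompt
        produced += step
    return text
```

Another thread (for example a web handler or a keyboard listener) can call `stop_event.set()` to interrupt between chunks.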

Here is a brief description of some ways to turn this into a simple question-answering chatbot. Tested on the 7B model. (This is also a good way to benchmark the...
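A minimal sketch of such a chatbot loop, assuming a `generator` built as in the repo's example.py with a `generate(prompts, max_gen_len=..., temperature=..., top_p=...)` method that echoes the prompt in its output; the few-shot preamble and sampling values are illustrative only.

```python
# Sketch only: `generator` and the generate() signature are assumptions
# modelled on example.py; adjust to whatever API you actually have.
PREAMBLE = (
    "Answer each question concisely.\n"
    "Q: What is the capital of France?\n"
    "A: Paris.\n"
)

def qa_chat(generator, max_gen_len: int = 128) -> None:
    history = PREAMBLE
    while True:
        question = input("Q: ").strip()
        if not question:
            break
        history += f"Q: {question}\nA:"
        [completion] = generator.generate(
            [history], max_gen_len=max_gen_len, temperature=0.7, top_p=0.95
        )
        # Assume the completion echoes the prompt; keep only the new text,
        # and cut it off at the next "Q:" the model starts to invent.
        answer = completion[len(history):].split("Q:")[0].strip()
        print("A:", answer)
        history += f" {answer}\n"
```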

Mandatory loss reference: Hi, I can't find the training loop or objective function in the codebase. Have I missed it, or is it... lost? 😳 The paper does mention...

For those not yet able to get the model working, I thought I'd share some fun replies to the classic stand-up-comedy opening line "The problem with men is". Apparently our...

Probably a bit of a dumb question, but what is the best way to make it continue for, say, another 256 tokens? Say your prompt is 30 tokens and your output...
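One low-tech answer, sketched under the same assumptions about `generator.generate()` as above: feed the first output back in as the new prompt and ask for more tokens, as long as prompt plus continuation stays within the `max_seq_len` the model was loaded with.

```python
def continue_generation(generator, prompt: str, extra_tokens: int = 256) -> str:
    # First pass: normal generation from the original prompt.
    [first_pass] = generator.generate(
        [prompt], max_gen_len=256, temperature=0.8, top_p=0.95
    )
    # Second pass: the whole first output (prompt + completion) becomes the
    # new prompt, so the model simply keeps extending the same passage.
    [second_pass] = generator.generate(
        [first_pass], max_gen_len=extra_tokens, temperature=0.8, top_p=0.95
    )
    return second_pass
```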

The current implementation of download.sh does not check whether a particular shard of the weights has already been downloaded and re-downloads it anyway, wasting time and bandwidth. I have updated... (the skip logic is sketched below)

CLA Signed
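The actual fix lives in download.sh, but the skip logic described above boils down to the following Python illustration (a sketch only; the checklist.chk filename and its md5sum-style format are assumptions about the usual layout of a downloaded model directory): if a shard is already present and its MD5 matches the checklist entry, don't fetch it again.

```python
import hashlib
from pathlib import Path

def shards_to_fetch(model_dir: str) -> list[str]:
    """Return shard filenames whose local copy is missing or fails its MD5 check."""
    model_path = Path(model_dir)
    missing = []
    # checklist.chk is assumed to hold one "<md5>  <filename>" entry per line.
    for line in (model_path / "checklist.chk").read_text().splitlines():
        expected_md5, name = line.split()
        target = model_path / name
        if not target.exists():
            missing.append(name)
            continue
        md5 = hashlib.md5()
        with target.open("rb") as f:  # hash in chunks: weight shards are huge
            for chunk in iter(lambda: f.read(1 << 20), b""):
                md5.update(chunk)
        if md5.hexdigest() != expected_md5:
            missing.append(name)      # corrupt or partial download, refetch it
    return missing
```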

Trying to run the 65B model on a vast.ai machine, but facing an error. Can anyone help me figure out what could be going wrong? Error log: ```...

Changed wget to curl. Added set -e so the script exits if any download fails, and passed -f to curl so failed downloads return an error instead of saving the error page.

CLA Signed

I can run `llama`, but I get a `CUDA out of memory` error. Has anyone managed to run the 7B model on `Windows 11` with an `RTX 3080 Ti`? Other projects don't seem to have Windows...