llama-dfdx

LLaMa 7b with CUDA acceleration, implemented in Rust. Minimal GPU memory needed!

Results: 5 llama-dfdx issues

```
ubuntu@instance-20230508-1136:~/repos/llama-dfdx$ ./target/release/llama-dfdx --model llama-7b-hf --disable-cache generate "Why is pi round?"
Detected model folder as LLaMa 7b.
Model size: 13476 MB
13476 MB of model parameters will be held in...
```

Blocking questions:
1. Is it safe to `std::mem::forget` the tensor? Is that all we need to do? What about the other fields of the tensor?
2. Is the `Vec::from_raw_parts_mut` usage safe?
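For context, here is a minimal self-contained sketch of the ownership hand-off the questions above are about. It uses `ManuallyDrop` (which achieves the same "don't run the destructor" effect as `std::mem::forget`, while keeping the value accessible) paired with `Vec::from_raw_parts`; the `Vec<f32>` is a stand-in for a tensor's buffer, not llama-dfdx's actual tensor type:

```rust
use std::mem::ManuallyDrop;

fn main() {
    // Stand-in for a tensor's heap-allocated buffer.
    let data: Vec<f32> = vec![1.0, 2.0, 3.0];

    // Suppress the destructor so the buffer is not freed here; we now
    // own the allocation manually. `ManuallyDrop` makes this intent
    // explicit, where `std::mem::forget` would consume the value.
    let mut data = ManuallyDrop::new(data);
    let (ptr, len, cap) = (data.as_mut_ptr(), data.len(), data.capacity());

    // SAFETY: `ptr`, `len`, and `cap` all come from the same `Vec<f32>`,
    // and the original vector's destructor will never run, so ownership
    // of the allocation is transferred exactly once.
    let rebuilt: Vec<f32> = unsafe { Vec::from_raw_parts(ptr, len, cap) };
    assert_eq!(rebuilt, [1.0, 2.0, 3.0]);
}
```

The invariant either way is single ownership: exactly one handle (the forgotten original or the reconstructed `Vec`) may ever free the buffer, which is why question 1 also has to consider the tensor's other fields.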

- llama is a generation model, so it can't really be used for chat
- vicuna is a chatbot
- alpaca is an instruction model

If not able to determine the "mode", a user could...
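As a rough sketch of how such mode detection could look (the `PromptMode` enum and `detect_mode` helper below are hypothetical, not part of llama-dfdx):

```rust
/// Hypothetical prompt mode, inferred from the model family.
#[derive(Debug, PartialEq)]
enum PromptMode {
    Generation,  // plain text continuation (LLaMA)
    Chat,        // conversational turns (Vicuna)
    Instruction, // instruction/response pairs (Alpaca)
}

/// Guess the mode from the model directory name; `None` means the
/// caller should fall back to asking the user.
fn detect_mode(model_dir: &str) -> Option<PromptMode> {
    let name = model_dir.to_lowercase();
    if name.contains("vicuna") {
        Some(PromptMode::Chat)
    } else if name.contains("alpaca") {
        Some(PromptMode::Instruction)
    } else if name.contains("llama") {
        Some(PromptMode::Generation)
    } else {
        None
    }
}

fn main() {
    assert_eq!(detect_mode("vicuna-7b"), Some(PromptMode::Chat));
    assert_eq!(detect_mode("llama-7b-hf"), Some(PromptMode::Generation));
    assert_eq!(detect_mode("mystery-model"), None);
}
```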

Alpaca 7b should have the exact same structure, so as long as you can convert the weights into the same format with `convert.py`, it should be runnable out of the...

Use cases:
1. You can fit the whole model into GPU RAM
2. You can fit part of the model into GPU RAM
3. You need to keep all the model...
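A rough sketch of what use case 2 can look like: stream one layer at a time through the GPU, so peak device memory stays near a single layer's weights. The `upload` and `forward` methods below are hypothetical stand-ins, not llama-dfdx's real API:

```rust
/// Host-resident weights for one transformer layer (stand-in).
struct Layer {
    weights: Vec<f32>,
}

/// Stand-in for a layer whose weights live in a GPU buffer.
struct DeviceLayer {
    weights: Vec<f32>,
}

impl Layer {
    /// Hypothetical host-to-device copy.
    fn upload(&self) -> DeviceLayer {
        DeviceLayer { weights: self.weights.clone() }
    }
}

impl DeviceLayer {
    /// Placeholder compute; a real layer would launch CUDA kernels.
    fn forward(&self, x: Vec<f32>) -> Vec<f32> {
        x.iter().map(|v| v + self.weights[0]).collect()
    }
}

fn forward_streaming(layers: &[Layer], mut x: Vec<f32>) -> Vec<f32> {
    for layer in layers {
        // Only one layer occupies the GPU at a time.
        let on_gpu = layer.upload();
        x = on_gpu.forward(x);
        // `on_gpu` drops here, freeing its buffer before the next
        // layer is uploaded, so peak usage is ~one layer's weights.
    }
    x
}

fn main() {
    let layers = [
        Layer { weights: vec![0.0; 4] },
        Layer { weights: vec![0.0; 4] },
    ];
    let out = forward_streaming(&layers, vec![1.0, 2.0]);
    assert_eq!(out, [1.0, 2.0]);
}
```

The trade-off is bandwidth: every forward pass re-copies weights to the device, which use case 1 avoids by keeping the whole model resident.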