Evgeny Igumnov
I have 4x RTX 3080 GPUs, 40 GB of memory in total (10 GB per GPU). I am trying to load the Mistral 7B model, a file of about 15 GB, but I get an error: ```...
``` C:\Users\igumn\candle\candle-examples\examples\mistral>cargo run --example mistral --features accelerate --release -- --prompt "Here is a sample quick sort implementation in rust " --quantized -n 400 Compiling cc v1.0.90 Compiling serde v1.0.197 Compiling...
Hello Sir or Madam, do you plan to add a gemma2:2b example? This model is very small and smart. Best regards, Evgeny
``` C:\Users\igumn\candle\candle-examples\examples\quantized>cargo run --features=cuda --example quantized --release -- --model=gemma-2-2b-it.q4_k_m.gguf --prompt "def fibonacci(n): " Finished `release` profile [optimized] target(s) in 0.48s Running `C:\Users\igumn\candle\target\release\examples\quantized.exe --model=gemma-2-2b-it.q4_k_m.gguf --prompt "def fibonacci(n): "` avx: true, neon:...
**Description:** We need to integrate the Gemini API into our application. Google Gemini API offers free usage with rate limits, making it an attractive option for enhancing our service's capabilities....