Animesh Jain
Results: 2 issues of Animesh Jain
Hello. I'm trying to fine-tune Code Llama for a multifile code generation task on my private repository. The goal is to have the LLM generate code for some common bugs...
fine-tuning
Slow Inference on Llama 3.1 405B using ollama.generate with Large Code Snippets on multi-H100 GPUs
1
I'm experiencing very slow inference times when using the ollama.generate function on a multi-H100 GPU machine. Specifically, it is taking up to 5 minutes per inference, even though the...