llama2-webui
Test log | Feedback and discussion welcome
# Test log
Cloud platform: matpool.com
Machine used: NVIDIA A40
Model used: Llama-2-13b-chat-hf
GPU memory used after the model is loaded: about 26 GB
GPU memory used during inference: about 26 GB
GPU utilization during inference: about 80%
System memory (RAM) usage: about 2 GB
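
The roughly 26 GB of GPU memory reported above is consistent with a back-of-the-envelope estimate for the model weights. The sketch below is only an illustration and rests on two assumptions that the log does not state: that Llama-2-13b-chat-hf has about 13 billion parameters, and that the weights were loaded in 16-bit precision (2 bytes per parameter). The function name `estimate_weight_memory_gb` is hypothetical. Activations and the KV cache are ignored, so real usage during long generations may be somewhat higher.

```python
def estimate_weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed for model weights, in GB (1 GB = 10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: about 13 billion parameters for a 13B model.
LLAMA2_13B_PARAMS = 13e9

# Assumption: weights held in 16-bit precision (2 bytes per parameter).
fp16_gb = estimate_weight_memory_gb(LLAMA2_13B_PARAMS, 2)

# For comparison: 8-bit weights would take about half of that.
int8_gb = estimate_weight_memory_gb(LLAMA2_13B_PARAMS, 1)

print(f"16-bit weights: about {fp16_gb:.0f} GB")
print(f"8-bit weights:  about {int8_gb:.0f} GB")
```

Under these assumptions the 16-bit estimate comes to about 26 GB, which fits within the 48 GB of an NVIDIA A40.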