gpu_poor
Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
Hi, thanks for your great work on calculating tokens/s. I read your App.js code and found some magic numbers. Could you please add comments explaining them? Just list out...
I like this, great work. Your page mentions that the code is open source, but I could not find a license (such as MIT or BSD3,...
Changing the batch size doesn't seem to affect GPU memory usage when set in INFERENCE MODE. That doesn't seem right. Is this normal?
Hi! I want to add some GPU specs to gpu_configs.json. What does the compute field in that file mean? Is it the TFLOPS at a certain precision?
Hi, great work! I would like to use this in a terminal environment, so I am wondering if you could expose an API or add a command-line interface. Thanks!
Can you add the A100 GPUs? They are available with 40GB and 80GB of VRAM. Thanks for the good website.