bloomz.cpp

Quantizing and running inference on bloom-176B required some changes

Open barsuna opened this issue 1 year ago • 0 comments

  • Most of the issues arise because the 250880x14336 embedding layer has 3,596,615,680 elements, which is too many to count with a signed 32-bit integer (maximum 2,147,483,647).
  • This affects main, quantize, and the ggml code.
  • The second issue is that main seems to underestimate the amount of memory it needs.
  • That one is not properly fixed: I have just added 5 GB for the weights and doubled the size of the context used for model evaluation.

I am far from proficient in C++, so these changes need to be cleaned up by someone experienced with ggml and C++.
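To make the overflow concrete, here is a minimal sketch of the arithmetic. It assumes `int` is 32 bits wide, as on common platforms. All names in it (`element_count`, `fits_in_int32`, `checked_narrow`, `kVocab`, `kEmbd`) are hypothetical and chosen for illustration. None of them come from bloomz.cpp or ggml, and this is not the actual patch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Embedding matrix dimensions as quoted in this issue.
constexpr int64_t kVocab = 250880;
constexpr int64_t kEmbd  = 14336;

// Element count computed in 64 bits, so this product cannot overflow
// for dimensions of this magnitude.
constexpr int64_t element_count(int64_t rows, int64_t cols) {
    return rows * cols;
}

// True if n can be stored in a 32-bit signed integer without loss.
constexpr bool fits_in_int32(int64_t n) {
    return n >= std::numeric_limits<int32_t>::min() &&
           n <= std::numeric_limits<int32_t>::max();
}

// Narrow to 32 bits only after checking the value fits (hypothetical helper).
// Note that assert is compiled out when NDEBUG is defined.
inline int32_t checked_narrow(int64_t n) {
    assert(fits_in_int32(n));
    return static_cast<int32_t>(n);
}

// Compile-time checks of the numbers discussed above.
static_assert(element_count(kVocab, kEmbd) == 3596615680LL,
              "250880 * 14336 elements");
static_assert(!fits_in_int32(element_count(kVocab, kEmbd)),
              "element count exceeds INT32_MAX");
```

The point of the sketch is that any size, offset, or loop counter derived from this tensor has to be held in a 64-bit type (`int64_t` or `size_t`) all the way through. A single intermediate `int` is enough to truncate the value.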

barsuna • Apr 02 '23 15:04