Kunwar Raj Singh
Is weight loading completely fixed after the latest commit? @titu1994
@geohot I can try adding LLM.int8() quantization into tinygrad - referring to https://arxiv.org/pdf/2208.07339.pdf and https://github.com/TimDettmers/bitsandbytes
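For context, a rough sketch of the row-wise absmax int8 quantization the paper builds on (plain numpy, names are mine; the mixed-precision outlier decomposition that makes LLM.int8() work at scale is omitted):

```python
import numpy as np

def quantize_absmax_int8(w: np.ndarray):
    # per-row absmax scale so each row maps into [-127, 127]
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)  # guard against all-zero rows
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    # recover an approximation of the original float weights
    return q.astype(np.float32) * scale

w = np.random.randn(4, 8).astype(np.float32)
q, scale = quantize_absmax_int8(w)
print(np.abs(w - dequantize_int8(q, scale)).max())  # small quantization error
```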
Update: I figured out the issue; it was the way I was casting the tensors. I've got FP16 inference with Stable Diffusion working. Here's an output for the prompt "a...
> NOTE: all the math and intermediates for stable diffusion are still float32, changing that will require more work. But float16 weights (save memory / memory bandwidth) work. @geohot Agreed,...
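To make the split concrete, something like this is what I have in mind (a sketch with made-up shapes; the dtypes import path differs across tinygrad versions):

```python
from tinygrad.tensor import Tensor
from tinygrad.helpers import dtypes  # newer versions: from tinygrad import dtypes

# weights stored in half precision to save memory / memory bandwidth
w = Tensor.randn(768, 768).cast(dtypes.float16).realize()

def linear(x: Tensor, w: Tensor) -> Tensor:
    # cast back up so the math and intermediates stay float32
    return x @ w.cast(dtypes.float32)

out = linear(Tensor.randn(1, 768), w)
```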
@geohot What do you think about having an env var to control the default tensor type? Or a singleton class like DEBUG that can be used to set it at...
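Concretely, something along these lines (a hypothetical sketch; HALF is a made-up env var name and the dtypes import path varies across tinygrad versions):

```python
import os
from tinygrad.helpers import dtypes  # newer versions: from tinygrad import dtypes

# read the env var once at import time, like DEBUG does
DEFAULT_TYPE = dtypes.float16 if os.getenv("HALF", "0") == "1" else dtypes.float32

class Defaults:
    # singleton-style holder so the default can also be flipped at runtime
    dtype = DEFAULT_TYPE

# anything creating tensors without an explicit dtype would fall back to Defaults.dtype
```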
@python273 Good point. Added the change, and moved tensor.realize() from load_single_weight to post_process, along with the typecasts to HALF
@python273 Added tests for loading in a specific dtype. Since load_single_weight is called multiple times, I replaced the t.realize() calls with post_process calls, so it should not cause any issues. Also,...
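The tests are roughly this shape (hypothetical and simplified; the real test goes through the weight-loading path rather than a direct cast):

```python
import numpy as np
from tinygrad.tensor import Tensor
from tinygrad.helpers import dtypes  # newer versions: from tinygrad import dtypes

def test_load_weight_as_half():
    src = np.random.randn(16, 16).astype(np.float32)
    # stand-in for loading a single weight with an explicit target dtype
    loaded = Tensor(src).cast(dtypes.float16).realize()
    assert loaded.dtype == dtypes.float16
    # values should round-trip within half-precision tolerance
    np.testing.assert_allclose(loaded.numpy().astype(np.float32), src, atol=1e-2)
```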
@python273 Made the reviewed changes; I'll create a separate PR for the dropout change.
@marcellofuschi Hey, I think we're trying to do something similar in tensor.dropout https://github.com/geohot/tinygrad/pull/864
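For reference, the idea in that PR is roughly the standard inverted-dropout formulation (a sketch, not the code under review):

```python
from tinygrad.tensor import Tensor

def dropout(x: Tensor, p: float = 0.5, training: bool = True) -> Tensor:
    # identity at eval time or when p == 0
    if not training or p == 0.0:
        return x
    # Bernoulli mask, scaled by 1/(1-p) so the expected activation is unchanged
    mask = Tensor.rand(*x.shape) >= p
    return x * mask * (1.0 / (1.0 - p))
```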
> I started the same project today but you are ahead of me. Maybe you need to drop the last fc layer of the backbone right? Yes, they can be...