Aaron Pham

403 comments by Aaron Pham

This has been addressed by #3358 and will be included in the coming release.

This is now included in the 1.0.13 release.

Sorry for the late response, I have been caught up with work. I forgot to specify the Sphinx requirements. Should I add a `docs-requirements.txt`?

I'm currently disabling Falcon on MPS, since I run out of memory just trying to run the model on a Mac.

Not sure if this is still valid. I have since tested PyTorch on MPS a lot, and it is often slower. Will probably investigate MLC vs. GGUF for...

Can you send the output of `pip list | grep tensorflow`?
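
If `grep` isn't available (for example on Windows), something like the following should give roughly the same information. It is only a sketch using the standard library's `importlib.metadata`; the helper name `find_distributions` is made up for illustration and is not part of pip.

```python
from importlib.metadata import distributions


def find_distributions(keyword: str) -> list[tuple[str, str]]:
    """Return sorted (name, version) pairs for installed packages whose name contains keyword."""
    matches = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        # Broken installs can lack a Name field; skip those.
        if name and keyword.lower() in name.lower():
            matches.add((name, dist.version))
    return sorted(matches)


if __name__ == "__main__":
    # Rough equivalent of `pip list | grep tensorflow`.
    for name, version in find_distributions("tensorflow"):
        print(f"{name} {version}")
```

Run it with the same interpreter that fails, so it reports the environment actually in use.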

Falcon requires a lot of resources to run, even during inference. This has to do with the model having to compute all of the matrices through the attention layer. On...

I was only able to run Falcon on a g5.24xlarge, which has 96 GB of GPU memory and 384 GB of RAM :)
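
As a back-of-the-envelope check on why that much GPU memory is needed, the weights-only footprint can be estimated from the parameter count and the dtype width. The sketch below assumes the 40B-parameter Falcon variant (the comments above don't say which variant was run) and ignores activations, the KV cache, and framework overhead, so real usage will be higher. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GB (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    params = 40e9  # assumption: Falcon-40B; not stated in the thread
    for dtype, width in [("float32", 4), ("bfloat16", 2), ("int8", 1)]:
        print(f"{dtype}: ~{weight_memory_gb(params, width):.0f} GB")
```

Under that assumption the bfloat16 weights alone come to about 80 GB, which leaves little headroom within 96 GB.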