Aaron Pham
This has been addressed by #3358 and will be included in the coming release.
This has now been included in the 1.0.13 release.
Sorry for the late response; I have been caught up with work. I forgot to specify the Sphinx requirements. Should I add a `docs-requirements.txt`?
I'm currently disabling Falcon on MPS, since I run out of memory just trying to run the model on a Mac.
Not sure if this is still valid. I have since tested a lot with PyTorch on MPS, and it is often slower. Will probably investigate MLC vs. GGUF for...
Can you send the output of `pip list | grep tensorflow`?
Hmm, do you have Keras installed?
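If `grep` is not available (for example on Windows), roughly the same check can be done from Python. This is only a sketch using the standard library's `importlib.metadata`; the helper name `find_installed` is made up for illustration and is not part of any project mentioned here.

```python
from importlib import metadata


def find_installed(substrings):
    """Return {name: version} for installed distributions whose
    name contains any of the given substrings (case-insensitive)."""
    wanted = [s.lower() for s in substrings]
    found = {}
    for dist in metadata.distributions():
        # .get() avoids errors on distributions with broken metadata.
        name = dist.metadata.get("Name")
        if name and any(s in name.lower() for s in wanted):
            found[name] = dist.version
    return found


if __name__ == "__main__":
    # Covers both questions: is tensorflow installed, and is keras?
    for name, version in sorted(find_installed(["tensorflow", "keras"]).items()):
        print(f"{name}=={version}")
```

An empty output means neither package is visible to the interpreter that ran the script, which may differ from the one `pip` points at.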
Falcon requires a lot of resources to run, even during inference. This has to do with the model having to compute all of the matrices through the attention layer. On...
Got it, I will take a look.
I was only able to run Falcon on g5.24xlarge, which has 96GB GPU mem, 384GB RAM :)
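For a rough sense of why that much GPU memory is needed, here is a back-of-the-envelope estimate of the memory taken by the weights alone. The 40-billion parameter count and 2 bytes per parameter (fp16/bf16) are assumptions for illustration; the comments above do not say which Falcon variant or precision was used, and the estimate ignores activations and the KV cache.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumed: a 40B-parameter model stored in 16-bit precision.
print(weight_memory_gb(40e9))  # 80.0
```

Under those assumptions the weights alone take about 80GB, which would leave little headroom on 96GB of total GPU memory.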