mushitori

2 comments by mushitori

> > Most of the potential performance is not yet available. T3 inference uses the 0.5B Llama 3 via _transformers_, which is where the performance issue lies. It has a...

Wow, these are super amazing results. Kudos to @rsxdalv for this. I have been looking for this too. My current best is 36-38 on Google Colab with a T4 GPU enabled....