Use small models such as LaMini, Flan-T5-783M, etc.

Open gitknu opened this issue 2 years ago • 0 comments

Hello, sorry if this question is dumb or already answered. I couldn't find a solution in the issues, and searching and experimenting on my own didn't get me anywhere.

Details: I have a low-end PC that can run Alpaca and Vicuna (both 7B), but quite slowly. While trying different models, I found that models under 1B parameters run quite well. Most of them are based on Flan-T5. They give good results for my machine and are fast enough (about 3-5 tokens per second). They also do well when given a text to work from. For example, when I ask "based on this text, answer ...", I get an almost perfect answer. But supplying the text by hand every time is impractical for me, mainly because of the time it takes.
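To make the manual workflow above concrete, here is a minimal Python sketch. It is not tied to any particular tool or model, and the names `tokenize`, `pick_best_chunk`, and `build_prompt`, as well as the sample texts, are hypothetical. It assumes simple word overlap is enough to choose which piece of text to paste into the prompt; a real tool would more likely use embeddings for that step.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def pick_best_chunk(question, chunks):
    """Return the chunk sharing the most words with the question.

    Assumption: plain word overlap stands in for a real retrieval step.
    """
    q_words = tokenize(question)
    return max(chunks, key=lambda c: len(q_words & tokenize(c)))

def build_prompt(question, chunks):
    """Wrap the best-matching chunk in a 'based on this text' prompt."""
    context = pick_best_chunk(question, chunks)
    return (
        "Based on this text, answer the question.\n\n"
        f"Text: {context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Hypothetical sample data, standing in for pieces of a longer document.
chunks = [
    "The warranty covers parts and labor for two years.",
    "The device charges over USB-C and takes about three hours.",
]
prompt = build_prompt("How long does the warranty last?", chunks)
print(prompt)
```

The resulting `prompt` string is what I currently assemble by hand before passing it to a small model.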

Short question: Is there any way to use this tool with any of these models?

LaMini-Flan-T5-783M
Flan-T5-Alpaca (770M or something)
RWKV (under 1.5B)
(any other good small models under 1B parameters)

If you could give a detailed manual, I would be very grateful! Solutions other than privateGPT are also welcome!

Thank you for your understanding and your answers, and sorry for any inconvenience!

Jul 02 '23 18:07 gitknu