DanielWe2
Interesting. Auto-translating using the OpenAI API?
@johnsmith0031 You mean for inference? Then you can just use the export HF scripts from this repo and quantize the result with GPTQ-for-LLaMa, for example, and then...
@xieydd For GPTQ LLaMA in 4-bit it would be about 5 GB for 7b, 8.4 GB for 13b (8 GB of VRAM is not enough), and 20.5 GB for 30b.
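Those numbers can be sanity-checked with a back-of-the-envelope estimate: 4 bits is 0.5 bytes per parameter, plus some overhead for activations, cache, and quantization metadata. A minimal sketch, where `overhead_factor` is an assumed fudge factor picked to roughly match the figures above, not a measured constant:

```python
def estimate_gptq_4bit_vram_gb(n_params_billion, overhead_factor=1.35):
    """Rough VRAM estimate for a 4-bit GPTQ-quantized model.

    4 bits = 0.5 bytes per parameter; overhead_factor (an assumption)
    covers activations, KV cache, and group-wise quantization metadata.
    """
    return n_params_billion * 0.5 * overhead_factor

# e.g. a 7b model lands near 5 GB, 13b near 9 GB, 30b near 20 GB
```

Actual usage depends on context length, group size, and the loader, so treat this only as a rough lower bound.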
> add more Spanish language examples to the model. I was wondering if anyone else has a similar idea and what tools they are using to accomplish this. Someone else...
Thanks. I have played around testing its German capabilities. It's better than the 7b version, which is great. Not sure if that is caused by the bigger dataset with...
Ok, is it possible that a commit is missing in app.py? I can't find the place where it adds the summarize and continue buttons.
> But do you think this is what ChatGPT does? Hard-code the previous Q/A into the new input? > > Because otherwise, how can it remember context for each different chat...
> so a similar approach to what LangChain does, because it is very straightforward LangChain can also store it in a database (as a summary or as a knowledge graph), which is...
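The summary variant of that idea can be sketched as follows. This is a minimal illustration, not LangChain's actual API; `summarize_fn` is a hypothetical stand-in for an LLM call that folds the latest exchange into the running summary:

```python
class SummaryMemory:
    """Keeps a running summary of the chat instead of the full transcript."""

    def __init__(self, summarize_fn):
        self.summary = ""
        # Hypothetical callable: (old_summary, user_msg, assistant_msg) -> new_summary,
        # in practice an LLM prompt that compresses the conversation
        self.summarize_fn = summarize_fn

    def add_turn(self, user_msg, assistant_msg):
        # Fold the latest exchange into the running summary
        self.summary = self.summarize_fn(self.summary, user_msg, assistant_msg)

    def build_prompt(self, new_question):
        # Only the compact summary is prepended to the next input,
        # so the prompt stays short even for long chats
        return (f"Summary of conversation so far: {self.summary}\n"
                f"User: {new_question}\nAssistant:")
```

The trade-off versus hard-coding the raw Q/A history is that the summary stays bounded in size but may lose details the summarizer drops.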
For small inputs the 4-bit GPTQ version is faster. But for bigger contexts there is some kind of quadratic algorithm in the stack that needs to be optimized...
Nice. Seems really good. The best Alpaca I have tried so far. It seems the 30b model is the way to go. Not easy to run on consumer hardware, though.