
Local Models

Open Kerushii opened this issue 2 years ago • 2 comments

Hi, any thoughts on Galactica (OPT pretrained on PubMed and wiki) and LLaMA (no domain-specific training yet)? And are there any plans to release pretrained domain-specific models for local inference? Thanks

Kerushii avatar Mar 25 '23 08:03 Kerushii

Just took a look at Galactica on Hugging Face. It seems very easy to implement, but like Visual-ChatGPT, the local package dependencies behind it are very heavy. (The Visual-ChatGPT models almost blew up my hard drive.)

However, being able to switch between models is a good idea. We'll keep looking into models such as Galactica, GPT-4, 文心一言 (ERNIE Bot), and LLaMA, as well as their academic usage.
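The model-switching idea above could be sketched as a registry that maps a user-facing model name to a loader function. This is a hypothetical illustration, not gpt_academic's actual API; the model names and loader bodies are placeholders (a real loader would call something like `transformers.OPTForCausalLM.from_pretrained("facebook/galactica-1.3b")`).

```python
# Hypothetical sketch of a pluggable model registry (illustrative names only).
MODEL_REGISTRY = {}

def register_model(name):
    """Decorator that records a loader callable under a model name."""
    def wrap(loader):
        MODEL_REGISTRY[name] = loader
        return loader
    return wrap

@register_model("galactica-1.3b")
def load_galactica():
    # Placeholder: a real implementation would load the Hugging Face weights here.
    return "galactica-1.3b loaded"

@register_model("llama-7b")
def load_llama():
    # Placeholder loader for a second backend, to show the switching mechanism.
    return "llama-7b loaded"

def get_model(name):
    """Instantiate the requested model, or fail with the list of known names."""
    try:
        return MODEL_REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown model: {name}; choices: {sorted(MODEL_REGISTRY)}")
```

The registry pattern keeps each backend's heavy dependencies behind its own loader, so importing the app does not pull in every model's packages.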

binary-husky avatar Mar 25 '23 09:03 binary-husky


[Screenshot from 2023-03-25 02-45-08]

I have a demo here using Galactica 30B (feel free to try it). The model is very capable, though still a bit slow running on 3×P40s. The hardware is still within a reasonable price range, and it would be fantastic if something useful could be built on such a model. (The UI above was rushed together in a few nights, nothing serious, but I would be really excited if Galactica could be made "more useful" through prompt engineering or further fine-tuning.) As for LLaMA, it is supposed to have better embeddings and attention layers than Galactica (which is OPT-based), and it is the best open-sourced LLM available; however, it is not trained on academic papers the way Galactica is.
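As a back-of-the-envelope check on the 3×P40 setup mentioned above: the P40 has 24 GB of VRAM, so a rough weights-only memory estimate shows why three cards are needed for a 30B model in fp16. This sketch ignores activation and KV-cache overhead, so it is a lower bound, not a sizing guarantee.

```python
# Rough VRAM estimate for model weights alone (assumption: fp16, 2 bytes/param;
# activations and KV cache are ignored, so real usage is higher).

def weight_vram_gb(n_params_billion, bytes_per_param=2):
    """GB needed just to hold the weights."""
    return n_params_billion * 1e9 * bytes_per_param / 1e9

def fits(n_params_billion, n_gpus, gb_per_gpu=24, bytes_per_param=2):
    """True if the weights alone fit across the given GPUs (P40 = 24 GB)."""
    return weight_vram_gb(n_params_billion, bytes_per_param) <= n_gpus * gb_per_gpu

# Galactica 30B in fp16: 60 GB of weights vs 72 GB across three P40s.
```

By this estimate, two P40s (48 GB) would not hold the fp16 weights, while three (72 GB) leave some headroom, which matches the hardware described in the comment.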

Kerushii avatar Mar 25 '23 09:03 Kerushii


@Kerushii Hello, I have successfully run galactica-1.3b locally on the morellm branch here, but the model keeps generating strange text after it has answered the question. Have you encountered a similar issue?
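The runaway-generation symptom described above (the model answering, then rambling into a hallucinated next turn) is commonly mitigated in two ways: passing generation controls such as `max_new_tokens`, `eos_token_id`, or `repetition_penalty` to the model's generate call, or post-processing the decoded text by cutting it at the first stop marker. The helper below is a minimal, model-independent sketch of the second approach; the stop strings are assumptions about the prompt format, not anything fixed by gpt_academic.

```python
# Minimal post-processing sketch: truncate the decoded output at the earliest
# occurrence of any stop sequence (e.g. a hallucinated "Question:" turn).

def truncate_at_stop(text, stop_sequences):
    """Return `text` cut at the first stop sequence found; unchanged if none occur."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()
```

For example, with a prompt format that labels turns, `truncate_at_stop(output, ["\nQuestion:"])` would keep only the first answer and drop the model's self-continued dialogue.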

binary-husky avatar Apr 01 '23 16:04 binary-husky