baize-chatbot
How to collect data for other topics?
How can I collect data for other topics, such as "china history"? Please explain in detail how to modify the code. Thanks!
Change the following lines to add your own data: https://github.com/project-baize/baize-chatbot/blob/6790946f638d60fcaf397574189124f15792f35a/collect.py#L17-L41
@JetRunner Thanks for your reply! I trained the model following your instructions:
python finetune.py 7b 16 0.0002 quora
and got the following files:
checkpoints/
└── 7b
    ├── adapter_config.json
    ├── adapter_model.bin
    ├── checkpoint-200
    │   ├── optimizer.pt
    │   ├── pytorch_model.bin
    │   ├── rng_state.pth
    │   ├── scaler.pt
    │   ├── scheduler.pt
    │   ├── trainer_state.json
    │   └── training_args.bin
    ├── checkpoint-400
    ……
Now, how do I load the local model when I run demo/app.py?
You need to copy the checkpoint weights over the adapter file (cp checkpoints/7b/checkpoint-200/pytorch_model.bin checkpoints/7b/adapter_model.bin) and set the lora_model path to ../checkpoints/7b
@guoday Thanks for your reply! So I should change
python app.py decapoda-research/llama-7b-hf project-baize/baize-lora-7B
to
python app.py decapoda-research/llama-7b-hf ../checkpoints/7b/
Is that right?
yes.
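The copy step confirmed above can also be scripted, which may help when switching between checkpoints. This is only a minimal sketch: the helper name promote_checkpoint is hypothetical and not part of the repository, and it assumes the directory layout shown earlier in this thread (a checkpoint-<step> folder containing pytorch_model.bin next to adapter_model.bin).

```python
import shutil
from pathlib import Path


def promote_checkpoint(run_dir, step):
    """Copy checkpoint-<step>/pytorch_model.bin over adapter_model.bin.

    Hypothetical helper; assumes the layout shown in this thread.
    Returns the path of the overwritten adapter file.
    """
    run_dir = Path(run_dir)
    src = run_dir / f"checkpoint-{step}" / "pytorch_model.bin"
    dst = run_dir / "adapter_model.bin"
    if not src.is_file():
        raise FileNotFoundError(f"no checkpoint weights at {src}")
    # Equivalent to the cp command given in the reply above.
    shutil.copyfile(src, dst)
    return dst


if __name__ == "__main__":
    # Run from the repository root, then start the demo with
    # ../checkpoints/7b/ as the LoRA path, as discussed above.
    print(promote_checkpoint("checkpoints/7b", 200))
```

Copying rather than moving keeps checkpoint-200 intact, so the same call with a different step can be used to try checkpoint-400 later.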
@guoday @JetRunner If I want to collect data for another topic such as "Chinese history" (中国历史), do I need to write the question list in the code myself? It's too hard to write tens of thousands of questions by hand😭.
https://github.com/project-baize/baize-chatbot/blob/6790946f638d60fcaf397574189124f15792f35a/collect.py#L57-L58
You don't have to collect questions. Entities such as "Tang dynasty" (唐朝) or "Li Bai" (李白) also work well as seeds.
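One way to act on this suggestion is to keep the entities in a plain text file, one per line, and load them as the seed list. The sketch below is an assumption about how that could look: the helper load_seeds and the file name seeds.txt are hypothetical, and how the resulting list is passed into collect.py depends on the lines linked above, which are not reproduced here.

```python
from pathlib import Path


def load_seeds(path):
    """Read one seed (question or entity) per line.

    Hypothetical helper: skips blank lines and duplicates and
    preserves the order in which seeds first appear.
    """
    seen = set()
    seeds = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        seed = line.strip()
        if seed and seed not in seen:
            seen.add(seed)
            seeds.append(seed)
    return seeds


if __name__ == "__main__":
    # seeds.txt is an assumed file name; it could contain lines
    # such as 唐朝 and 李白, as suggested in the reply above.
    for seed in load_seeds("seeds.txt"):
        print(seed)
```

A list of entity names is much easier to assemble than tens of thousands of hand-written questions, for example from an index of an encyclopedia or a textbook glossary.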