acadaiaca
Error when setting up the environment to reproduce the code
When reproducing the code, whether it runs inside the Docker container or in a freshly created environment, it always fails at the step `bert_model = load_trained_model_from_checkpoint(paths.config, paths.checkpoint, seq_len=None)` with `AttributeError: 'tuple' object has no attribute 'layer'`. All dependencies match what requirements.txt asks for, and I tried installing both tensorflow_gpu==1.14 and 1.13; the same error appears either way. What could be going on? Error output: `>>> from keras_bert import load_vocabulary, load_trained_model_from_checkpoint, Tokenizer, get_checkpoint_paths Using TensorFlow backend. >>> bert_model_path = '/opt/zhuiyi/tianchi_nl2sql-master/model/chinese_wwm_L-12_H-768_A-12' >>>...
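For context, a minimal sketch of the loading path involved, assuming the standard keras-bert API; the version pins in the comment are an assumption (this particular AttributeError is often reported together with a Keras/keras-bert version mismatch rather than a problem with the checkpoint itself) and are not taken from the issue or the repo's requirements.txt:

```python
# Minimal reproduction sketch for the keras-bert loading step.
# Assumed (illustrative) environment, not from the repo:
#   pip install tensorflow-gpu==1.14.0 keras==2.2.4 keras-bert
from keras_bert import (
    load_vocabulary,
    load_trained_model_from_checkpoint,
    Tokenizer,
    get_checkpoint_paths,
)

bert_model_path = '/opt/zhuiyi/tianchi_nl2sql-master/model/chinese_wwm_L-12_H-768_A-12'

# get_checkpoint_paths resolves the config/checkpoint/vocab files in that folder.
paths = get_checkpoint_paths(bert_model_path)

# The failing call from the issue; seq_len=None keeps the sequence length dynamic.
bert_model = load_trained_model_from_checkpoint(paths.config, paths.checkpoint, seq_len=None)

token_dict = load_vocabulary(paths.vocab)
tokenizer = Tokenizer(token_dict)
print(bert_model.output_shape)
```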
### ChatGLM: how to continue pre-training and fine-tuning?
This is the same question as in this issue: https://github.com/THUDM/ChatGLM-6B/issues/3 But after reading that information I still don't know how to use HF...
I tried setting target_modules=["query_key_value"] as well as target_modules=["query_key_value", "dense", "dense_h_to_4h", "dense_4h_to_h"], and fine-tuned for a few or a few dozen epochs, but the model does not seem to have learned anything. How do you judge whether fine-tuning has actually taken effect? Put differently, what can the model learn through fine-tuning? Should I take an example identical to one in the fine-tuning data and check whether the fine-tuned model's answer matches the training answer? Or is it that fine-tuning changes the model's answering style, e.g. if the answers in the fine-tuning data are short, the fine-tuned model tends to give short answers?
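For reference, a minimal sketch of the two `target_modules` settings mentioned above, written against the Hugging Face peft API and assuming ChatGLM-6B as the base model; the rank, alpha, and dropout values are illustrative assumptions, not values from the issue:

```python
# Sketch of the two target_modules variants discussed above, using peft.
# Assumptions: peft and transformers are installed; THUDM/chatglm-6b is the base model.
from transformers import AutoModel, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

model_name = "THUDM/chatglm-6b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(model_name, trust_remote_code=True).half()  # move to GPU as needed

# Variant 1: only the fused attention projection.
lora_attn_only = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,              # illustrative rank
    lora_alpha=32,    # illustrative scaling
    lora_dropout=0.05,
    target_modules=["query_key_value"],
)

# Variant 2: attention projection plus the MLP projections.
lora_attn_and_mlp = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value", "dense", "dense_h_to_4h", "dense_4h_to_h"],
)

model = get_peft_model(model, lora_attn_and_mlp)
# Sanity check: only a small fraction of parameters should be trainable.
model.print_trainable_parameters()
```

One simple check on whether the adaptation did anything at all is to compare the model's output on a prompt taken verbatim from the fine-tuning data before and after attaching the trained adapter.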
After distributed training, the saved model consists of: zero_pp_rank_0_mp_rank_00_model_states.pt, zero_pp_rank_0_mp_rank_00_optim_states.pt, zero_pp_rank_1_mp_rank_00_model_states.pt, zero_pp_rank_1_mp_rank_00_optim_states.pt. How can these be converted into a model format usable by llama_inference? https://github.com/fengyh3/llama_inference
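Those file names match DeepSpeed ZeRO checkpoint shards, so one possible route (a sketch under that assumption; the checkpoint directory name is hypothetical, and llama_inference may still require its own conversion step afterwards) is to consolidate the shards into a single state dict with DeepSpeed's zero_to_fp32 utility:

```python
# Consolidate DeepSpeed ZeRO shards (zero_pp_rank_*_mp_rank_*_model_states.pt, ...)
# into one fp32 state dict. The checkpoint directory below is a hypothetical example.
import torch
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

ckpt_dir = "output/checkpoint-1000"  # directory containing the zero_pp_rank_* files

# Reconstructs the full fp32 weights from the ZeRO-partitioned shards.
state_dict = get_fp32_state_dict_from_zero_checkpoint(ckpt_dir)

# Save a plain PyTorch checkpoint; downstream tools such as llama_inference may
# still expect their own key names/layout, so a further conversion may be needed.
torch.save(state_dict, "pytorch_model.bin")
```

DeepSpeed also typically drops a standalone zero_to_fp32.py script into the checkpoint directory that performs the same consolidation from the command line.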
TencentPretrain is a great project that makes things convenient for model users! Are there any plans to support the ChatGLM-6B model?
I tried expanding the vocabulary, randomly initializing the new embeddings, freezing the weights of all other layers, and pre-training the embeddings on unlabeled data, but the result is actually worse than not expanding the vocabulary at all. Looking forward to you sharing the method and code you used for training the embeddings, thanks!
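A minimal sketch of the setup described above (expand the vocabulary, randomly initialize the new rows, freeze everything else), assuming a Hugging Face transformers causal LM; the model name and the added tokens are illustrative placeholders, not details from the issue:

```python
# Sketch: grow the vocabulary, randomly initialize the new embedding rows,
# and train only the embedding weights. Model name and tokens are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-560m"  # illustrative base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical new tokens added to the tokenizer.
new_tokens = ["<domain_token_1>", "<domain_token_2>"]
tokenizer.add_tokens(new_tokens)

# Grow the embedding matrix; the new rows are randomly initialized by transformers.
model.resize_token_embeddings(len(tokenizer))

# Freeze every parameter, then unfreeze only the input embeddings.
# With tied input/output embeddings (as in BLOOM), the LM head shares this tensor.
embedding_weight = model.get_input_embeddings().weight
for param in model.parameters():
    param.requires_grad = False
embedding_weight.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")
```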
Are there plans to support the Falcon LLM? Thanks! https://huggingface.co/tiiuae/falcon-40b https://huggingface.co/tiiuae/falcon-40b-instruct
When fine-tuning BLOOM with LoRA or QLoRA, is multi-GPU training supported?