P-tuning
A novel method to tune language models. Code and datasets for the paper "GPT Understands, Too".
Can you renew the download link for LAMA?
I recently read your paper "GPT Understands, Too" and am unsure about this passage; I hope you can help explain it: "1) Discreteness: the original word embedding e of M has already become highly discrete after pre-training. If h is initialized with random distribution and then optimized with stochastic...
```
  File "/anaconda3/envs/python37/lib/python3.7/contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "/anaconda3/envs/python37/lib/python3.7/site-packages/omegaconf/omegaconf.py", line 669, in open_dict
    prev_state = config._get_node_flag("struct")
AttributeError: 'Namespace' object has no attribute '_get_node_flag'
```
Hello, my understanding of P-tuning is that the downstream large language model is frozen and only the prompt embedding model in front of it is tuned. But in your implementation (the optimizer part of https://github.com/THUDM/P-tuning/blob/main/PT-Fewshot/pet/wrapper.py), the parameters of the language model are fine-tuned as well. Is my understanding wrong?
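The frozen-backbone setup the question describes can be sketched in a few lines of PyTorch. This is an illustrative toy, not the repo's code: `backbone` stands in for the pretrained LM and `prompt_emb` for the continuous prompt, both assumptions for the sake of the example.

```python
import torch
import torch.nn as nn

backbone = nn.Linear(8, 8)       # stands in for the frozen pretrained LM
prompt_emb = nn.Embedding(4, 8)  # trainable continuous prompt embeddings

# Freeze every backbone parameter so gradients are never computed for it.
for p in backbone.parameters():
    p.requires_grad = False

# Hand the optimizer only the parameters that should move.
optimizer = torch.optim.Adam(prompt_emb.parameters(), lr=1e-3)

# One dummy step: backbone weights stay fixed, the prompt is updated.
weights_before = backbone.weight.clone()
loss = backbone(prompt_emb(torch.arange(4))).sum()
loss.backward()
optimizer.step()
print(torch.equal(weights_before, backbone.weight))  # backbone unchanged
```

If the optimizer in `wrapper.py` is instead given all model parameters, the LM is fine-tuned alongside the prompt, which matches the behavior the question observes.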
https://cloud.tsinghua.edu.cn/f/21b9dcf05cc44adfad25/?dl=1 is broken. Could anyone fix it? Thanks.
I am confused about this sentence in your paper "GPT Understands, Too": **Moreover, in the inference, we only need the output embedding h and can discard the LSTM head.**...
Take the CB dataset as an example: the P-tuning paper reports ACC 89.2 and F1 92.1 on bert-base-cased. The paper then says: "MP zero-shot and MP fine-tuning report results of a single pattern, while anchors for P-tuning are selected from the same prompt." Does this mean that MP zero-shot, MP fine-tuning, and P-tuning all report results using the same pattern?...
```
if 'gpt' not in self.args.model_name and 'megatron' not in self.args.model_name:
    # BERT-style sentence construction: starts with [CLS], ends with [SEP]
    return [[self.tokenizer.cls_token_id]          # [CLS]
            + prompt_tokens * self.template[0]
            + [self.tokenizer.mask_token_id]       # head entity
            + ...