
Large-scale pretraining for dialogue

Results 64 DialoGPT issues

Hey, I got the Hugging Face GPT-2 Large model of Dialog (pretrained, I guess?) and I have tried asking it questions, and it seems like it's not really...

I am not satisfied with the responses that DialoGPT produces -- for the most part, they seem pretty random and AI-ish to me. I fine-tuned the model with my dataset...

https://github.com/microsoft/DialoGPT/blob/fa0c0c53a0e6d75b6541e50faa2d77ba480b27d9/LSP_train.py#L281 Since it is an LMHeadModel, tokens `1` through `n` are used to predict token `n+1` during training, so why not introduce attention_mask to mask tokens `n+2` through `n+m`? Without attention_mask,...
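
For context on the masking question above: GPT-2 style language models apply a causal (lower-triangular) mask inside self-attention, so a position cannot attend to later positions even when no `attention_mask` is passed. In the Hugging Face API, `attention_mask` is typically used to hide padding tokens. The following is a minimal pure-Python sketch of such a causal mask; the helper names `causal_mask` and `visible_positions` are hypothetical and are not taken from `LSP_train.py`.

```python
def causal_mask(seq_len):
    """Return a seq_len x seq_len matrix of 0/1 flags.

    Entry [i][j] is 1 when query position i may attend to key position j,
    which for a causal language model means j <= i.
    """
    return [[1 if j <= i else 0 for j in range(seq_len)] for i in range(seq_len)]


def visible_positions(mask, i):
    """List the key positions that query position i is allowed to attend to."""
    return [j for j, allowed in enumerate(mask[i]) if allowed]


# Illustrative use: print the mask for a 4-token sequence, one row per query position.
mask = causal_mask(4)
for row in mask:
    print(row)
```

Each row shows what one position can see; every entry to the right of the diagonal is zero, which is the sense in which later tokens are already hidden during training.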

After running `python demo.py`, the file `models/output_model/GPT2.1e-05.64.0gpu.2020-07-19022953/GP2-pretrain-step-10000.pkl` was created. Question: how do I create a simple interactive program that uses GP2-pretrain-step-10000.pkl? Something like the example at https://huggingface.co/microsoft/DialoGPT-small: `from transformers import AutoModelWithLMHead, AutoTokenizer import torch tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")...
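
One possible shape for such an interactive program is sketched below. It is only a sketch under stated assumptions: it assumes the `.pkl` file holds a PyTorch state dict whose keys and sizes are compatible with the `microsoft/DialoGPT-small` architecture, which may not hold for a given training checkpoint. The names `build_prompt` and `chat` are hypothetical, and `AutoModelForCausalLM` is used in place of the older `AutoModelWithLMHead` from the snippet.

```python
EOS = "<|endoftext|>"  # GPT-2 end-of-text token; DialoGPT uses it to separate turns


def build_prompt(turns, eos=EOS):
    """Join dialogue turns into one string, ending every turn with the EOS token."""
    return "".join(turn + eos for turn in turns)


def chat(checkpoint_path=None, max_turns=5):
    """Hypothetical interactive loop; heavy imports are deferred to call time."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
    model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")
    if checkpoint_path is not None:
        # Assumption: the file is a state dict matching this architecture.
        # strict=False skips non-matching keys silently, so verify the result.
        state = torch.load(checkpoint_path, map_location="cpu")
        model.load_state_dict(state, strict=False)
    model.eval()

    turns = []
    for _ in range(max_turns):
        turns.append(input(">> User: "))
        # Encode the whole dialogue history so the model sees earlier turns.
        input_ids = tokenizer.encode(build_prompt(turns), return_tensors="pt")
        with torch.no_grad():
            output_ids = model.generate(
                input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id
            )
        # Keep only the newly generated tokens, not the echoed history.
        reply = tokenizer.decode(
            output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True
        )
        print("DialoGPT:", reply)
        turns.append(reply)
```

Calling `chat()` uses the published weights; passing a path, as in `chat("path/to/checkpoint.pkl")`, attempts to load a local checkpoint instead. If the checkpoint's key names differ from the Hugging Face model's, they would need to be remapped before loading.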

Hi, many thanks for your contribution! I tried to use `python demo.py --data full` to download the Reddit data. Since I don't want to train the model now, I didn't...

Sample and full training and testing data contain tokenized sentences (by TweetTokenizer I suppose): `what are you doing for a living ? i am a admin .` instead of not...

Hi, Thank you for the implementation. While running the demo.py file, I encountered an error saying "No such file or directory". Can you help with the same? TIA.

Running `python demo.py --data small` fails with the error b'gzip: ./train.tsv.gz: No such file or directory\n' in the Makefile. I think train.tsv.gz should be produced in reddit_extract/data/out/. Am I wrong? Thank you.

Hi, I have read the DialoGPT paper. It says the model is trained on multi-turn data, but I can't find the statistics about the average and maximum turn...

I have seen that there is a "weight" as part of the input, 1.0 or 0.0 for each sentence. But in LSP_train.py this feature is not used. Where and how is it used?