DialoGPT

Large-scale pretraining for dialogue

Results 64 DialoGPT issues

I am reading through the code base and paper and am trying to understand where in the code the MMI criterion is implemented and used. My guess is that during...

Thank you for this model, amazing results. I would like to take the pretrained model and fine-tune it on r/advice so that it generates responses more similar to conversations in...

I was retraining the model on my own dataset with a single GPU. Training command: ``` export CUDA_VISIBLE_DEVICES=3 python LSP_train.py --model_name_or_path ./models/small --init_checkpoint None --init_weights true --train_input_file ./data/train_opensub_qa_dialogpt.128len.db --eval_input_file...

Hello. The large model trained from scratch has the wrong config, resulting in the errors below: `RuntimeError: Error(s) in loading state_dict for GPT2LMHeadModel: Missing key(s) in state_dict: "transformer.h.36.ln_1.weight", "transformer.h.36.ln_1.bias", ... , "transformer.h.47.mlp.c_proj.weight", "transformer.h.47.mlp.c_proj.bias"....
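One reading of the error above is that the model was built with more transformer blocks than the checkpoint supplies, since the missing keys cover `transformer.h.36` through `transformer.h.47`. The sketch below is only an illustration of how to check this from key names; `count_transformer_layers` is a hypothetical helper, not part of the DialoGPT code base, and the key lists are made up to mirror the naming in the error message.

```python
import re

# Matches key names such as "transformer.h.36.ln_1.weight" and captures the block index.
_BLOCK_KEY = re.compile(r"^transformer\.h\.(\d+)\.")


def count_transformer_layers(state_dict_keys):
    """Return the number of transformer blocks implied by the key names.

    Hypothetical helper: it assumes blocks are named "transformer.h.<index>."
    and are numbered contiguously from 0, as in the error message above.
    """
    indices = set()
    for key in state_dict_keys:
        match = _BLOCK_KEY.match(key)
        if match:
            indices.add(int(match.group(1)))
    return max(indices) + 1 if indices else 0


# Illustrative keys only: a checkpoint with blocks 0..35 and a model expecting 0..47.
checkpoint_keys = [f"transformer.h.{i}.ln_1.weight" for i in range(36)]
model_keys = [f"transformer.h.{i}.ln_1.weight" for i in range(48)]

checkpoint_layers = count_transformer_layers(checkpoint_keys)
model_layers = count_transformer_layers(model_keys)
if checkpoint_layers != model_layers:
    print(f"checkpoint has {checkpoint_layers} blocks, model expects {model_layers}")
```

With a real checkpoint, the keys of the loaded state dict could be passed in place of `checkpoint_keys`, and the result compared against the layer count in the config file used to build the model.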

https://github.com/microsoft/DialoGPT/blob/b85558dea5391f83b20120d6c93b9f79fcc72311/reddit_extractor/src/reddit.py#L108-L112

The link in the README ("available on azure blobstorage here."), which points to "https://convaisharables.blob.core.windows.net/lsp", produces this: `This XML file does not appear to have any style information associated with it. The...

Hello DialoGPT team (@mattetti @mtodd @sverrejoh @radical )! Thank you for your work on DialoGPT. This project is interesting, and we think that it would be a great addition to...

I have a dataset like A: XX B: YY A: ZZ B: SS. How do I train and evaluate on a dataset like this?
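A minimal sketch of one way to approach the question above: turn an alternating A/B conversation into (context, response) pairs, one pair per reply. The function name `turns_to_pairs`, the `" EOS "` turn separator, and the tab-separated line layout are assumptions made for illustration; the exact input format expected by the repository's preprocessing script should be checked against its README before use.

```python
def turns_to_pairs(turns, max_context_turns=None):
    """Build (context, response) pairs from an ordered list of (speaker, text) turns.

    Every turn after the first becomes a response; its context is the preceding
    turns, optionally limited to the last `max_context_turns` of them.
    """
    pairs = []
    for i in range(1, len(turns)):
        start = 0 if max_context_turns is None else max(0, i - max_context_turns)
        context = [text for _, text in turns[start:i]]
        pairs.append((context, turns[i][1]))
    return pairs


def pairs_to_lines(pairs, turn_separator=" EOS "):
    """Format each pair as 'context<TAB>response' (assumed layout, see lead-in)."""
    return [turn_separator.join(context) + "\t" + response for context, response in pairs]


# The conversation from the question, written as (speaker, text) tuples.
dialogue = [("A", "XX"), ("B", "YY"), ("A", "ZZ"), ("B", "SS")]
pairs = turns_to_pairs(dialogue)
lines = pairs_to_lines(pairs)
for line in lines:
    print(line)
```

Holding out a fraction of whole conversations, rather than individual pairs, for the evaluation file avoids leaking context from training dialogues into evaluation.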

Hello, I would be grateful if someone could answer this question clearly: **Can DialoGPT be fine-tuned on a model other than GPT-2, and if so, how?** I tried to fine-tune this model on GPT-J,...

It seems that https://files.pushshift.io/reddit/ is not accessible. Is there any way to download the full dataset? DialoGPT-master/reddit_extractor/logs# cat RC_2006-01.zst.log --2023-06-05 15:26:33-- https://files.pushshift.io/reddit/comments/RC_2006-01.zst Resolving files.pushshift.io (files.pushshift.io)... 104.21.83.88, 172.67.219.85, 2606:4700:3036::6815:5358, ... Connecting...