DialoGPT

Large-scale pretraining for dialogue

64 DialoGPT issues

When I run distributed training with more than one GPU, training gets stuck at the very beginning and hangs indefinitely. It is stuck in FP16_Optimizer#set (specifically at [this line](https://github.com/NVIDIA/apex/blob/3d01e4a0a188cc8df54bc6e44cf5eb40ff6b4cc5/apex/optimizers/fp16_optimizer.py#L122), where...
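A minimal sanity check, assuming the hang is in collective communication rather than in the training loop itself (the file name `check_dist.py` is hypothetical): launch it with `python -m torch.distributed.launch --nproc_per_node=2 check_dist.py` and `NCCL_DEBUG=INFO` exported to see which rank stalls.

```python
import argparse

import torch
import torch.distributed as dist

parser = argparse.ArgumentParser()
# torch.distributed.launch injects --local_rank into each worker process
parser.add_argument("--local_rank", type=int, default=0)
args = parser.parse_args()

torch.cuda.set_device(args.local_rank)
dist.init_process_group(backend="nccl", init_method="env://")

# If this barrier completes on every rank, NCCL and the process-group setup
# are healthy, and the hang is more likely inside the optimizer's gradient
# all-reduce; if it stalls here, the problem is in the distributed setup.
dist.barrier()
print(f"rank {dist.get_rank()}/{dist.get_world_size()} passed the barrier")
```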

How can we use our own data to train the model?

My dataset is a .txt file where each line is an entire dialogue, with the turns inside a dialogue separated by tabs. How can I convert it to your tour...
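A rough conversion sketch, assuming each line of the .txt file is one dialogue with tab-separated turns and that the goal is (context, response) pairs with turns joined by GPT-2's EOS token. The file names and the output layout are illustrative assumptions; the repo's prepro.py defines the exact format it actually expects.

```python
# Assumption: "dialogues.txt" has one dialogue per line, turns separated by tabs.
EOS = "<|endoftext|>"

def dialogue_to_pairs(line):
    turns = [t.strip() for t in line.rstrip("\n").split("\t") if t.strip()]
    pairs = []
    for i in range(1, len(turns)):
        # Context = all turns so far joined by EOS, response = the next turn.
        context = f" {EOS} ".join(turns[:i])
        pairs.append((context, turns[i]))
    return pairs

with open("dialogues.txt", encoding="utf-8") as fin, \
     open("pairs.tsv", "w", encoding="utf-8") as fout:
    for line in fin:
        for context, response in dialogue_to_pairs(line):
            fout.write(f"{context}\t{response}\n")
```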

Hey guys! Awesome work. Can you please clarify: is there a reason to train the model on data that contains more than just N-turn samples if I want to use the model in the N-turn...

I plan to train it from scratch on my past 10,000 emails (not in English). Do you think that would make sense with that amount of training data?

Great job! Thanks for your contributions to dialogue generation! Is there any way I can get the 27 GB Reddit dialogue data (147,116,725 dialogue instances) without running demo.py?

Hi! I ran the training script on 130 million training instances and got the following training speed: 1 V100 GPU (FP16 O2): ~14k tokens/sec, ~100 hours; 8 V100 GPUs, ...
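A back-of-the-envelope check of those numbers, under the assumption that the ~100 hours corresponds to a single pass over the 130 million instances on one V100:

```python
# Assumption: ~100 hours is one full pass over the data on a single V100.
tokens_per_sec = 14_000
hours = 100
instances = 130_000_000

total_tokens = tokens_per_sec * hours * 3600        # ~5.0e9 tokens processed
tokens_per_instance = total_tokens / instances      # ~39 tokens per instance
print(f"{total_tokens:.2e} tokens, ~{tokens_per_instance:.0f} tokens/instance")
```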

Hey guys! Great work! I really appreciate it! After reading the code, I noticed that the training data is from 12/2015 to 11/2017, while the test data is from 03/2018...

Can you guys share the hyperparameters of different model sizes i.e. small, medium, and large? https://github.com/microsoft/DialoGPT/blob/75a4197188a1addf22c5eaea23f16d3b598635d7/LSP_train.py#L46-L82

It is really great work. I wonder if you could share the hyperparameters used to pre-train DialoGPT, especially the hyperparameters for GPT-small.
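On the architecture side, DialoGPT's small/medium/large sizes follow the standard GPT-2 configurations; the sketch below (using Hugging Face's `GPT2Config` purely for illustration, not the repo's own config code) shows those dimensions. The optimizer settings the issues ask about (learning rate, batch size, warmup) appear to be set in LSP_train.py's argument defaults linked above.

```python
# Standard GPT-2 model dimensions that DialoGPT's three sizes build on.
from transformers import GPT2Config

configs = {
    "small":  GPT2Config(n_layer=12, n_embd=768,  n_head=12),  # ~117M params
    "medium": GPT2Config(n_layer=24, n_embd=1024, n_head=16),  # ~345M params
    "large":  GPT2Config(n_layer=36, n_embd=1280, n_head=20),  # ~762M params
}
for name, cfg in configs.items():
    print(f"{name}: layers={cfg.n_layer}, hidden={cfg.n_embd}, heads={cfg.n_head}")
```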