transfer-learning-conv-ai
🦄 State-of-the-Art Conversational AI with Transfer Learning
Hello. I am trying to run your model and have some confusion about your pre-trained model. It seems that ``train.py`` trains the model with the double-head model, but in the...
Hello! I tried different settings of your model, for example changing the token-level loss to a sentence-level loss, and used beam search as you mentioned. But the...
Hi, I am using Python 3.6, and I run `python train.py --model_checkpoint pretrained_transformers/gpt --dataset_path datasets/personachat_self_original.json`. Thanks.

```
INFO:/dev/ccn/generation/transfer-learning-conv-ai/utils.py:Tokenize and encode the dataset
Traceback (most recent call last):
  File "train.py", line 271, ...
```
Previously, each sequence was padded to the length of the longest sequence in the *dataset*. In this PR, each *batch* is padded to the length of the longest sequence in...
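A minimal sketch of what batch-level padding can look like with a custom `collate_fn` (the helper name `pad_batch` and the `pad_id` default are illustrative, not the PR's actual code):

```python
import torch

def pad_batch(sequences, pad_id=0):
    # Pad every sequence only to the longest length *within this batch*,
    # rather than to the longest length in the whole dataset.
    max_len = max(len(seq) for seq in sequences)
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    return torch.tensor(padded, dtype=torch.long)

# Usage: pass it as the collate_fn of a DataLoader over lists of token ids.
# loader = DataLoader(dataset, batch_size=8, collate_fn=pad_batch)
```

Padding per batch wastes far less compute on batches of short sequences, at the cost of tensor shapes that vary from batch to batch.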
Why is `num_candidates` set to `min(args.num_candidates, len(dataset[0]["utterances"][0]["candidates"]))`?
In train.py, starting from line number 81:

```python
for dataset_name, dataset in personachat.items():
    num_candidates = len(dataset[0]["utterances"][0]["candidates"])
    if args.num_candidates > 0 and dataset_name == 'train':
        num_candidates = min(args.num_candidates, num_candidates)
```

Please explain to me...
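For what it's worth, the `min` appears to cap the number of distractor candidates used for training at `args.num_candidates` while never exceeding what the dataset actually provides. A toy illustration (the values here are hypothetical):

```python
# Hypothetical values: the dataset ships 20 candidate replies per utterance,
# but double-head training typically only needs a few distractors.
dataset_candidates = 20   # len(dataset[0]["utterances"][0]["candidates"])
requested = 4             # args.num_candidates

# Respect the user's request, but never ask for more candidates than exist.
num_candidates = min(requested, dataset_candidates)
assert num_candidates == 4
```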
Hi team, I spoke with @thomwolf about possibly using Lightning as your backend! This would remove the need to do your own distributed computing and 16-bit stuff. Check out the simple...
Hi team, thank you very much for the great work and the clean code! I ran into a problem while running the code and was wondering if you could give me...
get_dataset.tokenize() on a single CPU is very slow, so this pull request upgrades it to use multiprocessing by implementing the multiprocessing target function worker_tokenize(args_list). Additionally, a multiprocessing debug logger...
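A rough sketch of that approach, assuming the corpus is split into per-worker chunks mapped over a `multiprocessing.Pool` (the chunking and tokenizer calls are illustrative, not necessarily the PR's exact implementation, and assume the tokenizer is picklable):

```python
import multiprocessing as mp

def worker_tokenize(args_list):
    # Target function run in each worker process: tokenize one chunk of strings.
    tokenizer, texts = args_list
    return [tokenizer.convert_tokens_to_ids(tokenizer.tokenize(t)) for t in texts]

def tokenize_parallel(tokenizer, texts, num_workers=4):
    # Split the corpus into one chunk per worker and tokenize the chunks in parallel.
    chunk_size = (len(texts) + num_workers - 1) // num_workers
    chunks = [(tokenizer, texts[i:i + chunk_size])
              for i in range(0, len(texts), chunk_size)]
    with mp.Pool(num_workers) as pool:  # call from under `if __name__ == "__main__":`
        results = pool.map(worker_tokenize, chunks)
    # Flatten the per-chunk results back into one list of token-id lists.
    return [ids for chunk in results for ids in chunk]
```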
1. When you call get_dataset_personalities(tokenizer, args.dataset_path, args.dataset_cache), it parses personachat_self_original.json, which contains the whole training set and takes a long time. I think it would be better to sample from a smaller file....
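One possible workaround along those lines, assuming the standard PersonaChat layout (a top-level "train" split whose entries each carry a "personality" list); the function below is a hypothetical sketch, not code from the repo:

```python
import json
import random

def sample_personalities(dataset_path, n=100, seed=42):
    # Load the full PersonaChat file once, then keep only a small random
    # sample of personalities so interactive startup stays fast.
    with open(dataset_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    personalities = [dialog["personality"] for dialog in data["train"]]
    random.seed(seed)
    return random.sample(personalities, min(n, len(personalities)))
```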
This [chat](https://convai.huggingface.co/persona/my-favorite-jello-is-the-blue-one-i-ve-long-red-hair-i-don-t-eat-asparagus-i-work-at-home-on-my-computer) doesn't look right to me.