CDial-GPT

A Large-scale Chinese Short-Text Conversation Dataset and Chinese pre-training dialog models

28 issues

With n_positions=513 in the GPT-2 model config, loading fails with: size mismatch for transformer.h.0.attn.bias: copying a param with shape torch.Size([1, 1, 512, 512]) from checkpoint, the shape in current model is torch.Size([1, 1, 513, 513]). After changing it to 512, if use_gpt2=True, it still reports: size mismatch for...

Iter (loss= nan) lr=0.0001875: 0%| | 5/60000 [00:09

python train.py --pretrained --model_checkpoint thu-coai/CDial-GPT_LCCC-large --data_path data/STC.json --scheduler linear. Hello, my GPU memory should be sufficient, so why do I still get this error? I have also set batch_size to 1. RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 10.73 GiB total capacity; 904.23 MiB already allocated;...

When I run python train.py, I get the error: RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`. I had installed the packages via "pip install -r requirement" and tried to change the...

Predicting directly with CDial-GPT2_LCCC-base. The prediction code had to be modified, otherwise it would not run:

```python
output = model(input_ids, token_type_ids=token_type_ids)
logits = output.logits
logits = logits[0, -1, :] / args.temperature
```

Regardless of the input, the result is always as follows: [12997, 7635, 12997, 7635, 12997, 12997, 12997, 12997, 7635, 12997, 7635, 12997,...

The paper says the maximum learning rate is 6.25e-5, and under the noam schedule it is even smaller. Do we really need such a small learning rate?
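For reference on why noam-scheduled rates stay small: the Noam schedule (from the original Transformer paper) warms up linearly and then decays with the inverse square root of the step, so the rate peaks at d_model^-0.5 · warmup^-0.5. A minimal sketch; the d_model and warmup values here are illustrative defaults, not necessarily the repo's settings:

```python
import math

def noam_lr(step, d_model=768, warmup=4000, factor=1.0):
    """Noam schedule: linear warmup for `warmup` steps,
    then inverse-square-root decay. Peaks at step == warmup
    with value factor * d_model**-0.5 * warmup**-0.5."""
    step = max(step, 1)  # avoid 0**-0.5 at step 0
    return factor * d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)
```

With these illustrative values the peak is roughly 5.7e-4, and every step before or after the warmup point yields a strictly smaller rate, which is why the effective learning rate under noam is below the stated maximum for most of training.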

The guidance mentions that we can download the STC dataset. Does anyone know where to download it?

Could you tell me how the PPL metric in the paper is computed? Where is the relevant code, and how do I run it?
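For reference, in most GPT-style evaluation scripts perplexity is simply the exponential of the mean per-token cross-entropy loss (in nats). A minimal sketch; the function name is illustrative, not the repo's API:

```python
import math

def perplexity(nll_sum, n_tokens):
    """Corpus-level perplexity: exp of the average per-token
    negative log-likelihood accumulated over the eval set."""
    return math.exp(nll_sum / n_tokens)
```

So if the averaged cross-entropy loss over the validation set is, say, 3.0 nats per token, the reported PPL would be exp(3.0) ≈ 20.1.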

I get this error when loading the pretrained model. What is causing it?