CDial-GPT

A Large-scale Chinese Short-Text Conversation Dataset and Chinese pre-training dialog models

28 issues

With n_positions=513 in the GPT-2 model config, loading fails with: size mismatch for transformer.h.0.attn.bias: copying a param with shape torch.Size([1, 1, 512, 512]) from checkpoint, the shape in current model is torch.Size([1, 1, 513, 513]). After changing it to 512, if use_gpt2=True, it still reports: size mismatch for...

Iter (loss= nan) lr=0.0001875: 0%| | 5/60000 [00:09

python train.py --pretrained --model_checkpoint thu-coai/CDial-GPT_LCCC-large --data_path data/STC.json --scheduler linear. Hello, my GPU memory should be sufficient, so why do I still get this error? I have also set batch_size to 1. RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 10.73 GiB total capacity; 904.23 MiB already allocated;...

When I run python train.py, I get the error: RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`. I had installed the packages via "pip install -r requirement" and tried to change the...

Predicting directly with CDial-GPT2_LCCC-base. The prediction code had to be modified, otherwise it would not run:

```python
output = model(input_ids, token_type_ids=token_type_ids)
logits = output.logits
logits = logits[0, -1, :] / args.temperature
```

Regardless of the input, the result is always as follows: [12997, 7635, 12997, 7635, 12997, 12997, 12997, 12997, 7635, 12997, 7635, 12997,...

The paper says the maximum learning rate is 6.25e-5, and under the noam schedule it is even smaller. Do we really need such a small learning rate?
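For reference on why noam-scheduled rates stay small: the Noam schedule (from the original Transformer paper) warms up linearly and then decays with the inverse square root of the step, so the rate peaks at d_model^-0.5 · warmup^-0.5. A minimal sketch; the d_model and warmup values here are illustrative defaults, not necessarily the repo's settings:

```python
import math

def noam_lr(step, d_model=768, warmup=4000, factor=1.0):
    """Noam schedule: linear warmup for `warmup` steps,
    then inverse-square-root decay. Peaks at step == warmup
    with value factor * d_model**-0.5 * warmup**-0.5."""
    step = max(step, 1)  # avoid 0**-0.5 at step 0
    return factor * d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)
```

With these illustrative values the peak is roughly 5.7e-4, and every step before or after the warmup point yields a strictly smaller rate, which is why the effective learning rate under noam is below the stated maximum for most of training.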

The guidance mentions that we can download the STC dataset. Does anyone know where to download it?

Could you tell me how the PPL metric in the paper is computed? Where is the relevant code, and how do I run it?
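For reference, in most GPT-style evaluation scripts perplexity is simply the exponential of the mean per-token cross-entropy loss (in nats). A minimal sketch; the function name is illustrative, not the repo's API:

```python
import math

def perplexity(nll_sum, n_tokens):
    """Corpus-level perplexity: exp of the average per-token
    negative log-likelihood accumulated over the eval set."""
    return math.exp(nll_sum / n_tokens)
```

So if the averaged cross-entropy loss over the validation set is, say, 3.0 nats per token, the reported PPL would be exp(3.0) ≈ 20.1.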

I get this error when loading the pretrained model. What is causing it?