Yang Tian

Results 3 issues of Yang Tian

On p. 67 of the PDF notes, the sentence "Next, let's set A to A = [A, [100, 101, 102]]" should read "Next, let's set A to A = [A, [100; 101; 102]]". Thank you for your contribution.
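For readers who want to see why the semicolons matter, here is a minimal sketch using NumPy as a stand-in for the Octave syntax in the notes. The 3x2 matrix `A` is a hypothetical example, not necessarily the one on p. 67. In Octave, `[100; 101; 102]` is a 3x1 column that can be appended to the right of a 3-row matrix, while `[100, 101, 102]` is a 1x3 row whose row count does not match.

```python
import numpy as np

# Hypothetical 3x2 matrix standing in for A from the notes.
A = np.array([[1, 2], [3, 4], [5, 6]])

# Octave's [100; 101; 102] is a 3x1 column vector: shape (3, 1) in NumPy.
col = np.array([[100], [101], [102]])

# Octave's A = [A, col] is horizontal concatenation, i.e. np.hstack here.
A_new = np.hstack([A, col])
print(A_new.shape)  # (3, 3)

# Octave's [100, 101, 102] is a 1x3 row vector: its single row does not
# match A's three rows, so horizontal concatenation fails.
row = np.array([[100, 101, 102]])
try:
    np.hstack([A, row])
    row_ok = True
except ValueError:
    row_ok = False
print(row_ok)  # False
```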

https://github.com/Kyubyong/transformer/blob/2234b4268355ccc073d8f0535fdd395289001151/model.py#L168 I understand that this line is meant to end inference, but `y_hat` has shape [N, ?], so `tf.reduce_sum(y_hat, 1)` has shape [N]. How can that be compared with `self.token2idx[""]`, which is a single int value?
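The shape mismatch in the question can be illustrated with a small sketch. It uses NumPy in place of TensorFlow and is not the repository's code. `PAD_ID = 0` and the toy `y_hat` values are assumptions made for illustration. Summing over axis 1 yields one value per sequence, comparing that with an int yields one boolean per sequence, and an explicit reduction such as `np.all` is needed to get a single stop flag.

```python
import numpy as np

PAD_ID = 0  # assumption: the padding token id is 0

# Toy batch of predicted token ids, N=2 sequences of length 3 (made-up values).
y_hat = np.array([[5, 7, 0],
                  [0, 0, 0]])

# Summing over axis 1 gives one number per sequence: shape (N,).
row_sums = y_hat.sum(axis=1)
print(row_sums.shape)  # (2,)

# Comparing with an int broadcasts elementwise: still shape (N,), not one bool.
# With non-negative ids and PAD_ID == 0, a sum of 0 means the row is all padding.
is_pad_row = row_sums == PAD_ID
print(is_pad_row.tolist())  # [False, True]

# A single stop condition needs an explicit reduction over the batch.
all_done = bool(np.all(is_pad_row))
print(all_done)  # False
```

Under these assumptions, decoding would stop only once every sequence in the batch consists entirely of padding.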

As the title says, I just want to train a GPT-2 model on my own dataset, rather than retraining or fine-tuning a base model. Looking forward to your reply!