
sg.sample_sequence returns context after pre-trained model

Open bytes-commerce opened this issue 4 years ago • 3 comments

First of all, thanks for providing this amazing repository that makes GPT-2 possible with TF2! Secondly, I was using the README to pre-train my model and then used sequence_generator.py to pass some context to the model.

However, the response is always identical (1:1) to the context, except that the capital letters are replaced with ??s. The question now is, what am I doing wrong? Have I maybe forgotten something? Is there an edge case leading to this that could be prevented?

Please let me know any additional information you might need! Thanks a lot!
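One thing that might narrow this down: capital letters turning into ?? looks like an encoding/vocabulary gap, where uppercase characters map to an unknown token. A quick round-trip check of the tokenizer would confirm it. Below is a minimal sketch; the helper is generic, and the SentencePiece usage and model path in the comments are assumptions about this repo's setup, not its exact API:

```python
def lost_chars(text, decoded):
    """Return characters from the original context that did not
    survive the encode -> decode round trip (e.g. mapped to '?')."""
    return sorted(set(text) - set(decoded))

# Hypothetical usage with a SentencePiece model (path is an assumption):
# import sentencepiece as spm
# sp = spm.SentencePieceProcessor()
# sp.load("bpe_model.model")
# decoded = sp.decode_ids(sp.encode_as_ids(context))
# print(lost_chars(context, decoded))  # uppercase letters here => vocab gap
```

If this returns the uppercase letters, the vocabulary was likely trained on lowercased text, so the fix would be on the tokenizer/training-data side rather than in sampling.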

bytes-commerce avatar Oct 06 '20 19:10 bytes-commerce

same problem

jzl0166 avatar Oct 13 '20 18:10 jzl0166

also getting weird output like this.

jspangl3r avatar Nov 21 '20 02:11 jspangl3r

First of all, thank you for sharing your code! It helped me a lot in getting started with GPT-2. I don't know if this is relevant, but I just debugged sample.py.

output only appends zeros: tf.Tensor([[ 3 13727 5825 0 0 0 0 0 ...]], shape=(1, 515), dtype=int32)

If my sequence length is 512, I get 512 zeros (plus the 3 non-zero ids from my context). My output is just the words I provided as context, because the rest is 0.
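For what it's worth, all-NaN logits would fully explain the stream of zeros: NaN poisons every comparison, so a greedy argmax falls back to index 0 and the sampler keeps appending token id 0 at every step. A minimal NumPy sketch (the vocab size is just an illustrative value):

```python
import numpy as np

# When every logit is NaN, argmax returns the first index (0),
# so generation appends token id 0 forever.
logits = np.full(50257, np.nan, dtype=np.float32)  # illustrative vocab size
next_token = int(np.argmax(logits))
print(next_token)  # 0
```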

edit 1: logits is always NaN in my case, resulting in token 0.

edit 2: self.embedding_weights is NaN. Maybe something's wrong with the initializer?
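In case it helps others hitting the same thing, a quick way to locate NaN weights is to scan the variables directly. The helper below is a generic sketch; the attribute name and the stddev=0.02 re-initialization in the comments are assumptions (0.02 is the value the original GPT-2 paper uses), not this repo's exact code:

```python
import numpy as np

def find_bad_weights(named_weights):
    """Given (name, array) pairs, return the names of any weight
    arrays that contain NaN or Inf values."""
    return [name for name, w in named_weights if not np.all(np.isfinite(w))]

# Hypothetical usage with a tf.keras model:
# pairs = [(v.name, v.numpy()) for v in model.trainable_variables]
# print(find_bad_weights(pairs))
#
# If the embedding table itself is NaN, re-initializing it with a small
# normal initializer (stddev=0.02, as in the original GPT-2) may help:
# model.embedding_weights.assign(
#     tf.random.normal(model.embedding_weights.shape, stddev=0.02))
```

If the embedding weights are NaN immediately after construction, the initializer is the culprit; if they only become NaN during training, an exploding gradient or too-high learning rate is the more likely cause.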

vedranbajic avatar Dec 11 '20 02:12 vedranbajic