gpt-2-colab
How should I prepare the data for a text generation task? Thank you very much.
First, I'm not sure whether the model includes an encoder during training.
Below, EOS means end-of-sentence. The encoder and decoder are the two parts of the Transformer network.
Without an encoder, at training time:
target: [E, F, G, H, EOS]
decoder input: [0, E, F, G, H]
Without an encoder, at test time:
decoder input: [0]
With an encoder, at training time:
encoder input: [A, B, C, D]
target: [E, F, G, H, EOS]
decoder input: [0, E, F, G, H]
With an encoder, at test time:
encoder input: [A, B, C, D]
decoder input: [0]
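The layouts above can be sketched in a few lines of Python. This is only an illustration of the shift-by-one convention described in the question, not the project's actual data pipeline: the helper names `make_training_example` and `make_inference_input` are hypothetical, `0` is assumed to be the start token, and `"EOS"` stands in for whatever end marker the tokenizer really uses.

```python
from typing import Dict, List, Optional

START = 0      # assumed start token, written as 0 in the layouts above
EOS = "EOS"    # placeholder end-of-sentence marker


def make_training_example(
    target_tokens: List[str],
    source_tokens: Optional[List[str]] = None,
) -> Dict[str, list]:
    """Build a teacher-forcing example (hypothetical helper).

    The decoder input is the target shifted right by one position, so that
    at every step the model sees the previous tokens and predicts the next.
    """
    example = {
        "decoder_input": [START] + list(target_tokens),
        "target": list(target_tokens) + [EOS],
    }
    # Only encoder-decoder models receive a separate source sequence.
    if source_tokens is not None:
        example["encoder_input"] = list(source_tokens)
    return example


def make_inference_input(
    source_tokens: Optional[List[str]] = None,
) -> Dict[str, list]:
    """At test time the decoder starts from the start token alone."""
    example = {"decoder_input": [START]}
    if source_tokens is not None:
        example["encoder_input"] = list(source_tokens)
    return example


without_encoder = make_training_example(["E", "F", "G", "H"])
with_encoder = make_training_example(["E", "F", "G", "H"], ["A", "B", "C", "D"])
print(without_encoder)
print(with_encoder)
```

In both cases the decoder input and the target have the same length, and `target[i]` is the token that should follow `decoder_input[i]`. In the without-encoder case the two lists are therefore the same sequence offset by one position.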
Is this correct?
I know this is beyond the scope of this project, but I hope you can help. Thank you.
I have the same problem. For translation, I know what the input and target are, but I am not sure what the input and target should be for a language model.
I would also love to know how to properly prepare data for the model. I am quite new to this, but it is very exciting to work with.
I am also wondering what to do when we have different types of data, for example lyrics and recipes. Do we just train the 345M model on all of the data together and run it, or should we keep the data separate, using only the lyrics-trained data when we want lyrics and only the recipe-trained data when we want recipes?
And if that is the case, how do we do that in a simple way?
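One possible approach to the mixed-data question, sketched here as an assumption rather than as the project's recommended method, is to fine-tune a single model on one combined text file in which every document starts with a category tag and ends with GPT-2's `<|endoftext|>` separator. At generation time, the prompt would then begin with the tag for the kind of text wanted. The tag format (`[lyrics]`, `[recipes]`) and the helper name `build_corpus` are hypothetical. The alternative of fine-tuning a separate checkpoint per category needs no special formatting.

```python
from typing import Dict, List

END_OF_TEXT = "<|endoftext|>"  # GPT-2's document separator token


def build_corpus(docs_by_category: Dict[str, List[str]]) -> str:
    """Join documents into one training text (hypothetical helper).

    Each document is prefixed with a made-up category tag such as
    "[lyrics]" and terminated with the end-of-text separator.
    """
    parts = []
    for category, docs in docs_by_category.items():
        for doc in docs:
            parts.append(f"[{category}]\n{doc.strip()}\n{END_OF_TEXT}")
    return "\n".join(parts)


corpus = build_corpus({
    "lyrics": ["la la la"],
    "recipes": ["Mix flour and water."],
})
print(corpus)
```

With a corpus built this way, a prompt of `[recipes]` followed by a newline would ask the fine-tuned model for a recipe. How reliably the model respects the tag depends on how much data each category has.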