language model generator question
In this file:
https://github.com/pytorch/examples/blob/master/word_language_model/generate.py
What does this input mean in the generation?
input = torch.randint(ntokens, (1, 1), dtype=torch.long).to(device)
As I understand it, in an RNN-based language model the last output of the RNN is fed back as the current input and the sequence is unrolled step by step. What is the meaning of this random input? Does it enforce that the last output is fed back as the current input during unrolling?
Thanks!
(I am building a sequence generator that needs to consume its last output as the current input, and I am wondering how to do it. Are you suggesting that just feeding in a random input would also work? Any hints would be helpful!)
This input tensor is used to sample from the dictionary, i.e. to randomly choose the first word of the input sequence. By the next time input is used, it has already been set to the output of the RNN:
output, hidden = model(input, hidden)  # one step of the RNN
word_weights = output.squeeze().div(args.temperature).exp().cpu()  # temperature-scaled scores -> sampling weights
word_idx = torch.multinomial(word_weights, 1)[0]  # sample the next word index
input.fill_(word_idx)  # feed the sampled word back in as the next input
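
For context, here is a minimal sketch of the full loop around those lines, following generate.py: the random tensor only seeds the very first step, and input.fill_(word_idx) is what feeds each sampled word back in. The names model, ntokens, temperature, and num_words stand in for the script's model and arguments; they are assumptions here, not quotes from the file.

import torch

device = torch.device("cpu")
hidden = model.init_hidden(1)  # batch size 1
# Random first word: a token id drawn uniformly from [0, ntokens)
input = torch.randint(ntokens, (1, 1), dtype=torch.long).to(device)

generated = []
with torch.no_grad():
    for _ in range(num_words):
        output, hidden = model(input, hidden)
        # Temperature-scaled scores -> unnormalized sampling weights
        word_weights = output.squeeze().div(temperature).exp().cpu()
        word_idx = torch.multinomial(word_weights, 1)[0]
        # Overwrite the input in place: the sampled word becomes
        # the next step's input
        input.fill_(word_idx)
        generated.append(word_idx.item())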
I hope that helps, Neta
I see. But this assumes a uniform distribution over the first word, which isn't what a language model is, right? Shouldn't the first input always be a start-of-sentence token?
For instance, it is very unlikely that any sentence would start with the word "unfortunate".
I'm perfectly okay with the answer "yeah, but who cares, it's easier this way", which is what I would have done too; technically a bit incorrect, but who cares. Is that the case?
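
If you did want to condition the first step on a sentence boundary instead of a uniform draw, here is a minimal sketch, assuming the example's Corpus/Dictionary (the repo's data.py appends an <eos> token to every sentence, so its id is in word2idx; corpus and device are assumed to be set up as in the script):

eos_idx = corpus.dictionary.word2idx['<eos>']
# Seed generation with the sentence-boundary token instead of a random word
input = torch.full((1, 1), eos_idx, dtype=torch.long).to(device)
# The generation loop itself is unchanged: the model's learned
# P(next word | <eos>) now replaces the uniform choice of the first word.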