debug_seq2seq
Sanity check
I came up with the following sanity check to ensure that the implementation, word embeddings, etc. are good.
I created a dataset of 100,000 lines consisting of the following 6 lines repeated over and over again (a small generation sketch follows the list):
hi . $$$
hi , joey . $$$
hello ? $$$
who are you ? $$$
what are you doing ? $$$
nothing much . you ? $$$
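For reproducibility, here is roughly how such a file can be generated (the file name is just a placeholder):
# Sketch: write the 6 sanity-check lines, cycled until 100,000 lines total.
lines = [
    "hi . $$$",
    "hi , joey . $$$",
    "hello ? $$$",
    "who are you ? $$$",
    "what are you doing ? $$$",
    "nothing much . you ? $$$",
]
with open("sanity_check_corpus.txt", "w") as f:  # placeholder file name
    for i in range(100000):
        f.write(lines[i % len(lines)] + "\n")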
I then ran your code with the following parameters and model:
from keras.optimizers import Adagrad
from seq2seq.models import Seq2seq

TOKEN_REPRESENTATION_SIZE = 32    # word2vec vector size
HIDDEN_LAYER_DIMENSION = 4096     # number of nodes in each LSTM layer

seq2seq = Seq2seq(
    batch_input_shape=(SAMPLES_BATCH_SIZE, INPUT_SEQUENCE_LENGTH, TOKEN_REPRESENTATION_SIZE),
    hidden_dim=HIDDEN_LAYER_DIMENSION,
    output_length=ANSWER_MAX_TOKEN_LENGTH,
    output_dim=token_dict_size,
    depth=2,
    dropout=0.25,
    peek=True
)

opt = Adagrad(clipvalue=50)
seq2seq.compile(loss='sparse_categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
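For completeness, the training call under these settings looks roughly like this (the array names are mine, not from the codebase):
# Sketch of the training call; x_train / y_train are illustrative names.
# x_train: floats of shape (num_samples, INPUT_SEQUENCE_LENGTH, TOKEN_REPRESENTATION_SIZE),
#          the word2vec vectors of the question tokens.
# y_train: integer token indices of the answer, with a trailing singleton
#          dimension, i.e. shape (num_samples, ANSWER_MAX_TOKEN_LENGTH, 1),
#          to match sparse_categorical_crossentropy.
seq2seq.fit(x_train, y_train, batch_size=SAMPLES_BATCH_SIZE, nb_epoch=10)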
After 10 passes over the data, my results look like this:
INFO:lib.nn_model.train:[hi. ] -> [$$$ doing who who $$$ $$$ $$$]
INFO:lib.nn_model.train:[hello ?] -> [$$$ doing who who $$$ $$$ $$$]
INFO:lib.nn_model.train:[who are you ?] -> [$$$ doing who who $$$ $$$ $$$]
INFO:lib.nn_model.train:[what are you doing ?] -> [$$$ doing who who $$$ $$$ $$$]
So basically, the sanity check fails: the model cannot learn the answers to even these 6 lines, despite seeing them repeated so many times. Does anyone know why this is happening? What could be the problem?
Hi, the problem is with the embeddings. I am working on an example chatbot for seq2seq; it will take some time.
I see. I look forward to your example. Meanwhile, I will try other embeddings to see if they improve the results. Thanks!
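For anyone debugging the same thing, a quick way to sanity-check the word vectors themselves is a nearest-neighbor probe (a rough sketch, assuming gensim was used to train word2vec; the model path is a placeholder):
from gensim.models import Word2Vec

# Sketch: load the trained word2vec model (path is a placeholder) and
# inspect nearest neighbors; nonsensical neighbors point to bad embeddings.
w2v = Word2Vec.load("path/to/word2vec.model")
print(w2v.most_similar("hello"))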
@farizrahman4u Hi, I ran into the same problem here. Any suggestions or updates on this matter?
Thanks.