16 comments by Yoon Kim

Cool stuff! I noticed on the README that you are using 100/150 hidden units for small/large models respectively. I actually use 300/650 hidden units, so this might explain the difference...

Ah, ok! A few other things that may matter:

- batch size
- parameter initialization

I think it should be a lot lower. I don't recall the numbers exactly but since the dataset is small and the model has a lot of capacity (even with...

It's because we do SGD with mini-batches, and each mini-batch has sentences of varying lengths. One could sort/group the batches based on sentence length, and then there would be no...
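A minimal sketch of the grouping idea mentioned above: bucket sentences by length before forming mini-batches, so every batch contains sentences of a single length and no padding is needed. The function name and batching scheme here are illustrative, not taken from the repo.

```python
from collections import defaultdict

def length_grouped_batches(sentences, batch_size):
    """Yield mini-batches where all sentences share the same length.

    Illustrative sketch: bucket tokenized sentences by length,
    then slice each bucket into batches of at most `batch_size`.
    """
    buckets = defaultdict(list)
    for sent in sentences:
        buckets[len(sent)].append(sent)
    for same_length in buckets.values():
        for i in range(0, len(same_length), batch_size):
            yield same_length[i:i + batch_size]

# Example with toy tokenized sentences of varying length
sents = [["a"], ["b", "c"], ["d", "e"], ["f"]]
batches = list(length_grouped_batches(sents, batch_size=2))
```

In practice one would also shuffle within each bucket per epoch so the model does not see sentences in a fixed order.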

Feel free to push if you've modified the code to get the GPU working, and I'll make sure to merge :)

That's correct, we go directly from the CNN output to the softmax, without any hidden layers.
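To make the "no hidden layers" point concrete, here is a hedged NumPy sketch (shapes and sizes are assumptions, not the repo's actual values): the max-pooled CNN feature vector feeds a softmax classifier directly, with no intermediate fully-connected layer.

```python
import numpy as np

rng = np.random.default_rng(0)

n_filters, n_classes = 300, 2            # assumed sizes for illustration
pooled = rng.standard_normal(n_filters)  # stand-in for max-pooled CNN features

# Softmax layer applied directly to the pooled features
W = rng.standard_normal((n_filters, n_classes)) * 0.01
b = np.zeros(n_classes)

logits = pooled @ W + b
probs = np.exp(logits - logits.max())    # numerically stable softmax
probs /= probs.sum()
y_pred = int(np.argmax(probs))
```

The only learned parameters after the convolution/pooling stage are `W` and `b`.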

You want to create (and compile) a theano function whose output is y_pred, given the input.

cool, feel free to send a pull request!

Hi, you can obtain all the datasets here: https://github.com/harvardnlp/sent-conv-torch. Phrases from word2vec were not taken into account.

There is randomness built into the models (due to initialization) so you shouldn't expect the nearest neighbors to be exactly the same. Your nearest neighbors seem to make sense (and...
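A small sketch of why runs differ, under assumed initialization details (the uniform range and function name here are illustrative): random initialization depends on the seed, so two runs with different seeds learn different vectors, while fixing the seed makes results reproducible.

```python
import numpy as np

def init_embeddings(vocab_size, dim, seed):
    """Illustrative random embedding initialization (range is an assumption)."""
    rng = np.random.default_rng(seed)
    return rng.uniform(-0.25, 0.25, size=(vocab_size, dim))

a = init_embeddings(100, 50, seed=0)  # run 1
b = init_embeddings(100, 50, seed=0)  # run 2, same seed: identical
c = init_embeddings(100, 50, seed=1)  # run 3, new seed: different
```

Since nearest neighbors are computed from the learned vectors, changing the seed shifts them slightly even when training is otherwise identical.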