SentimentVAE
Why is the output from the reconstruction always a single word?
I tried to run your code on a small sample. Here is one line of the file train.csv:
0,"if you enjoy service by someone who is as competent as he is personable , i would recommend corey kaplan highly . the time he has spent here has been very productive and working with him educational and enjoyable . i hope not to need him again though this is highly unlikely but knowing he is there if i do is very nice . by the way , i m not from el centro , ca . but scottsdale , az . "
which should be in the right format. I set display_every and print_every to 2; the other hyper-parameters remain the same. The output of the reconstruction is as follows:
Sentences generated from encodings
Sentence 0: deal
Sentence 1: very
Sentence 2: Service,
Sentence 3: Mint,
Sentence 4: I
Sentence 5: Your
Sentence 6: people
Sentence 7: eyes.
Sentence 8: that
Sentence 9: Chinatown?
Sentence 10: said.
Sentence 11: I
Sentence 12: that
Sentence 13: atmosphere
Sentence 14: I
Sentence 15: First
Sentence 16: possible,
Sentence 17: recap.
Sentence 18: I
Sentence 19: I
Sentence 20: account.
Sentence 21: you
Sentence 22: I
Sentence 23: filthy
Sentence 24: very
Sentence 25: Will
Sentence 26: I
Sentence 27: yellow
I wonder whether I did something wrong or whether it is supposed to be like this. Do you have any idea? Thanks in advance!
Here is the detailed information:
Config:
anneal_bias 6500
anneal_max 1.0
autoencoder True
batch_size 28
beam_size 16
conv_width 5,5,3
convolutional False
data_path /data/yelp
debug False
decoder_inputs True
decoding_noise 0.1
display_every 2
dropout_finish 13000
dropout_start 4000
encoder_birnn True
encoder_summary mean
gpu_id -1
group_length True
hidden_size 512
init_dropout 0.95
keep_fraction 1
label_emb_size 3
latent_size 64
learning_rate 0.001
length_penalty 100.0
load_file
max_epoch 10000
max_gen_length 50
max_grad_norm 5.0
max_length None
max_steps 9999999
mutinfo_weight 0.0
mutual_info True
num_layers 1
optimizer adam
print_every 2
save_every -1
save_file models/recent.dat
save_overwrite True
softmax_samples 1000
test_ll_samples 6
test_validation True
training True
use_labels False
val_ll_samples 3
validate_every 1
variational True
vocab_file vocab
word_dropout 0.5
word_emb_size 224
Loading vocabulary from pickle...
Vocabulary loaded, size: 3305
Loading train csv
Loading train data from pickle...
Training samples = 100
Loading validation csv
Loading validation data from pickle...
Validation samples = 100
Loading test csv
Loading test data from pickle...
Testing samples = 100
Thanks!
Hey, we never really managed to get this to work. If you would like to read the report on this, we can send it to you via email.
Actually, here it is; I just realized it is public: http://www.cs.nyu.edu/~akv245/inf/writeup.pdf
@vighneshbirodkar Thanks for the reply. But in the Experiments section of the report, it seems that you generated plausible results with both the VAE with KL-divergence annealing and the variational autoencoder with mutual information, and I would like to reproduce those results.
Indeed, I had read the report before I found this code, and I like your idea regardless of whether it works in practice.
The results there were from the Yelp Review dataset.
There is already code to pre-process the Yelp review dataset and train with it. Let us know if you have any trouble.
I just noticed that your dataset size is 100. You should really be training with the whole Yelp dataset.
@vighneshbirodkar Yes, I did use part of the Yelp dataset for training, and I did use the scripts to preprocess it. But I don't understand why the model can generate complete sentences with the whole dataset but not with part of it.
Here are the files I have tried to run the code with. Could you please give them a try to see whether you can generate complete sentences? I am afraid I may have done something wrong. Archive.zip
Because it just hasn't seen enough data to learn anything useful.
@vighneshbirodkar I cannot run the whole dataset, with more than four million lines, on my MacBook. This time I used the first 100,000 lines, but it still generates one word per sentence. I don't think this is a matter of dataset size: with a small dataset the model should overfit, yet it should still be able to generate a complete sentence. I wonder whether you got the results in your report with exactly this repo? If there is a newer version, could you please provide it to me? Thanks a lot!
The repo is exactly what we used in the report. Over-fitting won't work like that in this case, because generation is done using beam search. Let me see if I can re-run this with the whole dataset.
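To illustrate the beam-search point, here is a toy sketch (not the decoder from this repo): when a weakly trained model assigns non-trivial probability to the end-of-sequence token at every step, beam search prefers very short hypotheses unless the scoring explicitly rewards length, which is what a length-penalty term such as the `length_penalty` setting above is for.

```python
# Minimal beam-search sketch (not the repo's implementation) showing why decoding
# can collapse to very short outputs: once the end-of-sequence token scores well,
# short hypotheses dominate unless the scoring rewards length.
import math

EOS = "</s>"

def next_token_logprobs(prefix):
    # Toy stand-in for a trained decoder: it assigns a fairly high probability
    # to EOS at every step, which a poorly trained model often does.
    vocab = {EOS: 0.4, "the": 0.3, "service": 0.2, "great": 0.1}
    return {w: math.log(p) for w, p in vocab.items()}

def beam_search(beam_size=4, max_len=10, length_bonus=0.0):
    # Each hypothesis is (tokens, summed log-prob); finished ones end in EOS.
    beams = [([], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for word, lp in next_token_logprobs(tokens).items():
                new = (tokens + [word], score + lp)
                (finished if word == EOS else candidates).append(new)
        if not candidates:
            break
        candidates.sort(key=lambda h: h[1], reverse=True)
        beams = candidates[:beam_size]
    finished.extend(beams)
    # Rank with an optional per-token bonus, playing the role of a length penalty.
    finished.sort(key=lambda h: h[1] + length_bonus * len(h[0]), reverse=True)
    return finished[0][0]

print(beam_search(length_bonus=0.0))  # stops almost immediately
print(beam_search(length_bonus=1.5))  # produces a longer output once length is rewarded
```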
@vighneshbirodkar Could you please also try using, say, the first 100,000 lines of data, which should be enough to train a not-so-bad model? In addition, may I ask how many GPUs you used?
When I run the code, it fails with IOError: [Errno 2] No such file or directory: 'data/yelp/vocab.0.970.pk'.
Where can I find or generate this file?
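In case it helps anyone hitting the same error: the filename suggests a vocabulary pickle built from train.csv with a keep fraction of 0.97, which the repo's own pre-processing code should normally produce. Below is a rough, hypothetical sketch of what generating such a file could look like; the file layout and naming here are guesses, not the repo's actual format.

```python
# Hypothetical sketch of building a vocabulary pickle like 'vocab.0.970.pk'
# from train.csv (one "label,text" row per line). The repo's own
# pre-processing should be preferred; paths and pickle contents are guesses.
import csv
import pickle
from collections import Counter

KEEP_FRACTION = 0.970  # matches the 0.970 in the missing filename

counts = Counter()
with open("data/yelp/train.csv", newline="") as f:
    for label, text in csv.reader(f):
        counts.update(text.split())

# Keep the most frequent words until they cover KEEP_FRACTION of all tokens.
total = sum(counts.values())
vocab, covered = [], 0
for word, c in counts.most_common():
    if covered / total >= KEEP_FRACTION:
        break
    vocab.append(word)
    covered += c

with open("data/yelp/vocab.%.3f.pk" % KEEP_FRACTION, "wb") as f:
    pickle.dump(vocab, f)
print("Vocabulary size:", len(vocab))
```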
@vighneshbirodkar Do you have a paper associated with this repo?
@rylanchiu, did you run into the problem of NaN loss during training? How did you solve it?
@wangyong1122 It seems so, and I have already given up on this repo. I don't think I can reproduce the results with this code.
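For anyone else hitting the NaN loss mentioned above: this is a general hazard of VAE training rather than something confirmed to be specific to this repo. The usual first steps are to lower the learning rate (the config above uses 0.001), keep gradient clipping on (max_grad_norm is 5.0), and make sure the KL term cannot overflow. Here is a rough, generic sketch of a numerically safer Gaussian KL term, not taken from this repo:

```python
# Generic sketch (not this repo's code) of the KL term for a Gaussian VAE,
# with the log-variance clamped so exp() cannot overflow into inf/NaN.
import numpy as np

def gaussian_kl(mu, logvar, clamp=10.0):
    # KL( N(mu, exp(logvar)) || N(0, 1) ), summed over the latent dimension.
    logvar = np.clip(logvar, -clamp, clamp)
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

# Without clamping, a log-variance this large would give exp() = inf and
# poison the loss; with clamping the result stays finite.
mu = np.zeros((2, 64))
logvar = np.full((2, 64), 1000.0)
print(gaussian_kl(mu, logvar))
```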