Wenhu Chen

Results: 8 issues by Wenhu Chen

I found that the splitting step is done after the shuffle, according to https://github.com/karpathy/neuraltalk2/blob/master/prepro.py (line 159), which means we get a different val/test split every time. Even after I removed...
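One way to make the val/test assignment stable is to seed the RNG before shuffling, so the permutation is identical on every run. A minimal sketch with a hypothetical `assign_splits` helper (the seed value and split sizes are assumptions, not neuraltalk2's actual defaults):

```python
import random

def assign_splits(image_ids, num_val=5000, num_test=5000, seed=123):
    """Shuffle with a fixed seed so the val/test split is the same on every run."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # seeded shuffle: same permutation every time
    return {i: 'val' if k < num_val
               else 'test' if k < num_val + num_test
               else 'train'
            for k, i in enumerate(ids)}
```

Because the permutation depends only on the seed, rerunning the preprocessing script reproduces the same split.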

Hi, I tried to reload the whole model by adding saver.restore(sess, "model/*") after sess.run(tf.global_variables_initializer()), and I noticed that every time it restores the model, the performance is much lower. Could...
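Two things are worth checking with this pattern. First, TF1's `saver.restore` expects a checkpoint prefix (e.g. the result of `tf.train.latest_checkpoint`), not a glob like `"model/*"`. Second, a common cause of degraded performance is ordering: any initializer that runs after the restore silently overwrites the restored weights. A toy illustration of the ordering pitfall (`ToyModel` and the checkpoint dict are hypothetical stand-ins, not the TensorFlow API):

```python
class ToyModel:
    """Minimal stand-in for a model with a single weight."""
    def __init__(self):
        self.w = None

    def initialize(self):
        self.w = 0.0              # fresh (untrained) init

    def restore(self, checkpoint):
        self.w = checkpoint['w']  # load trained weights

checkpoint = {'w': 3.14}          # pretend this was saved after training

# Correct order: initialize first, then restore (restore wins).
model = ToyModel()
model.initialize()
model.restore(checkpoint)
good_w = model.w                  # trained value survives

# Buggy order: re-initializing after restore wipes the trained weights.
model = ToyModel()
model.restore(checkpoint)
model.initialize()
bad_w = model.w                   # trained weights lost
```

The same logic applies to the TF1 session: run the initializer before `saver.restore`, never after.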

Do you have any plan to implement the large-vocabulary approach in Blocks? I would appreciate it if you could share it.

```
def init_extra_parameters(model, state):
    # May want to add skip_init later
    model.large_W_0_enc_approx_embdr = eval(state['weight_init_fn'])(
        state['large_vocab_source'], state['rank_n_approx'], -1,
        state['weight_scale'], model.rng)
    model.large_W_0_dec_approx_embdr = eval(state['weight_init_fn'])(
        state['large_vocab_target'], state['rank_n_approx'], -1,
        state['weight_scale'], model.rng)
    model.large_W2_dec_deep_softmax = eval(state['weight_init_fn'])(
        state['rank_n_approx'], state['large_vocab_target'], -1, ...
```
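The `eval`-based dispatch above resolves `state['weight_init_fn']` to a function by name, then calls it with (rows, cols, sparsity, scale, rng). A minimal self-contained sketch of that mechanism, with a hypothetical `sample_weights` initializer standing in for the real one:

```python
import random

def sample_weights(nrows, ncols, _sparsity, scale, rng):
    """Hypothetical initializer matching the dispatch signature above:
    returns an nrows x ncols matrix with entries uniform in [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(ncols)]
            for _ in range(nrows)]

state = {
    'weight_init_fn': 'sample_weights',  # function is looked up by name
    'large_vocab_source': 4,
    'rank_n_approx': 3,
    'weight_scale': 0.01,
}

rng = random.Random(0)
W = eval(state['weight_init_fn'])(state['large_vocab_source'],
                                  state['rank_n_approx'], -1,
                                  state['weight_scale'], rng)
```

This is only a sketch of the config-driven dispatch; the sizes and initializer here are assumptions, not values from the original code.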

I found that the test/val split is different from the one given in karpathy/neuraltalk2. According to their script https://github.com/karpathy/neuraltalk2/blob/master/coco/coco_preprocess.ipynb, they take the first 5000 images as val and images 5000-10000 as test...

Hi, thanks for sharing your code. I'm curious whether this is the official code for the EMNLP best paper; I managed to get a BLEU of ~52, which is lower than...

Hi there, have you tested the model on lower-end GPUs? I tried the code and it doesn't seem to work on an A6000, let alone an A10, etc. Is...

Hi there, thanks for sharing Gemma. It seems that I can't reproduce the 24% 4-shot accuracy on MATH; I'm only getting 20%. Has anyone managed to reproduce it?...
