Beam Search finalize() bug: parents not updated in the "body"

Open AaronYALai opened this issue 6 years ago • 5 comments

https://github.com/guillaumegenthial/im2latex/blob/8e25d7ec2097e2c6515bbb5b41e8f16b79339967/model/components/beam_search_decoder_cell.py#L247

The "body" function passed to tf.while_loop extracts the final decoding results one time step at a time.

But the loop state "parents" is never updated inside the body function!

def body(time, outputs_ta, parents): 
    ... (no update of parents) ...
    return (time + 1), outputs_ta, parents

The return statement should instead be:

return (time + 1), outputs_ta, input_t.parents

since the parents for the next step are stored in "input_t", which is extracted for the current time step.

AaronYALai avatar May 08 '18 05:05 AaronYALai

@AaronYALai Maybe you are right. I modified the beam search decoding according to your comment, and performance improved by 5% on my own dataset.

interxuxing avatar May 11 '18 06:05 interxuxing

@AaronYALai @interxuxing I think the next parents should be gather_helper(input_t.parents, parents), since the parent_idx at each timestep is the traceback to the previous timestep's parent.
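For anyone comparing the three candidate updates mentioned in this thread (leaving parents unchanged, returning input_t.parents, or gathering input_t.parents with the current parents), here is a toy NumPy sketch. It is not the repository's TensorFlow code: the function names forward_hypotheses and backtrack, the mode strings, and the example arrays are all made up for illustration, and the batch dimension is dropped. It only shows how the three pointer updates behave on one small hand-built beam.

```python
import numpy as np

def forward_hypotheses(ids, parents):
    """Reference answer: rebuild each beam slot's tokens going forward in time."""
    beams = [[tok] for tok in ids[0].tolist()]
    for t in range(1, len(ids)):
        # Slot k at step t extends the slot parents[t][k] of step t - 1.
        beams = [beams[p] + [tok]
                 for p, tok in zip(parents[t].tolist(), ids[t].tolist())]
    return beams

def backtrack(ids, parents, update):
    """Walk the time axis backwards, as a reversed finalize-style loop would.

    update is a hypothetical switch for this sketch:
      "none"     -> keep the pointers unchanged
      "raw"      -> use the step's parents as-is
      "gathered" -> index the step's parents with the current pointers
    """
    num_steps, beam_size = ids.shape
    ptr = np.arange(beam_size)   # slot followed by each final hypothesis
    out = np.zeros_like(ids)
    for t in reversed(range(num_steps)):
        out[t] = ids[t][ptr]     # token emitted by the slot being followed
        if update == "raw":
            ptr = parents[t]
        elif update == "gathered":
            ptr = parents[t][ptr]
    return out.T.tolist()        # one row per final hypothesis

# Toy beam of size 2 over 4 steps (values invented for this example).
ids = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
parents = np.array([[0, 1], [0, 1], [1, 0], [1, 0]])

truth = forward_hypotheses(ids, parents)
for mode in ("none", "raw", "gathered"):
    result = backtrack(ids, parents, mode)
    print(mode, result, result == truth)
```

On this particular toy input only the "gathered" mode reproduces the forward-built hypotheses. The "raw" mode agrees with it for the first two backward steps and diverges after that, because the pointers have already been permuted by then. This is a single hand-picked example, so it says nothing about the accuracy numbers reported above.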

JunweiLiang avatar Oct 30 '19 13:10 JunweiLiang

@JunweiLiang Hello, have you also been working on formula-image-to-LaTeX recently? I ran this code and found that the EM metric is only 51%, short of the 76% reported in the paper. Is this code different from the method described in the paper? I did not see the Row encoder method being used here.

kim-yhow avatar Nov 07 '19 01:11 kim-yhow

@kim-yhow How many iterations did you run to get your EM of 51%, and with how much decay? I ran the default configuration, 6+13 (decay), and my EM is only a little over 36.57%.

pageedward avatar Jun 18 '21 07:06 pageedward

@kim-yhow How many iterations did you run to get your EM of 51%, and with how much decay? I ran the default configuration, 6+13 (decay), and my EM is only a little over 36.57%.

pageedward avatar Jun 21 '21 07:06 pageedward