RobustVideoMatting GRU fix

Open Jerry-Master opened this issue 2 years ago • 2 comments

Looking at the formulas in your article, I see that the GRU they describe does not match the code you provide. I am not asking you to merge this fork, since it would break compatibility, but I am leaving it here in case you want to discuss the performance of this fixed ConvGRU implementation. It seems you are reusing the hidden state as if it were the forward activation. That is a valid approach, but I find it more reasonable to separate the hidden state from the forward activation.

Aug 29 '23 12:08 Jerry-Master

Unlike an LSTM, a GRU by design does not have a separate hidden state and forward output; the two are one and the same. See this diagram.

The (1 - z) term is placed opposite to the paper's notation, but the two forms are equivalent. So I believe my original implementation is correct.

Aug 29 '23 18:08 PeterL1n
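
The equivalence claimed above can be checked with a small sketch. This is not the repository's code: it is a scalar GRU step written for illustration, and the function names and the parameter dictionary keys (`wz_x`, `wz_h`, `bz`, and so on) are hypothetical. It assumes a standard sigmoid update gate. Because sigmoid(-a) = 1 - sigmoid(a), negating the update gate's weights and bias swaps z and (1 - z), so a step that puts (1 - z) on the old state and a step that puts z on the old state compute the same thing under that reparametrization.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def gates(x, h, p):
    """Compute update gate z and candidate c for one scalar GRU step."""
    z = sigmoid(p["wz_x"] * x + p["wz_h"] * h + p["bz"])
    r = sigmoid(p["wr_x"] * x + p["wr_h"] * h + p["br"])
    c = math.tanh(p["wc_x"] * x + p["wc_h"] * (r * h) + p["bc"])
    return z, c


def step_z_on_candidate(x, h, p):
    """Convention A: h_t = (1 - z) * h_{t-1} + z * c."""
    z, c = gates(x, h, p)
    return (1.0 - z) * h + z * c


def step_z_on_state(x, h, p):
    """Convention B: h_t = z * h_{t-1} + (1 - z) * c."""
    z, c = gates(x, h, p)
    return z * h + (1.0 - z) * c


def flip_update_gate(p):
    """Negate the update-gate parameters so that z becomes 1 - z."""
    q = dict(p)
    for key in ("wz_x", "wz_h", "bz"):
        q[key] = -p[key]
    return q
```

Running convention A with parameters `p` and convention B with `flip_update_gate(p)` over the same inputs gives the same hidden states up to floating-point rounding, so a trained network can learn either form equally well.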

What I mean is that the article says o_t is the output of the layer and h is the hidden state. So it would make sense to pass the output to the next layer and the hidden state to the next time step. I was wondering whether you have tried this, or have some intuition about which option performs better, since computationally they are very similar.

Aug 29 '23 19:08 Jerry-Master
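
To make the two options in this exchange concrete, here is a minimal scalar sketch. It is neither the repository's ConvGRU nor the code in the fork. It is one reading of the discussion, and every name in it (`gru_cell`, `run_stack`, `pass_candidate`, the parameter keys) is hypothetical. With `pass_candidate=False` the hidden state h_t goes both to the next layer and to the next time step, which is the shared behaviour described in the reply above. With `pass_candidate=True` the tanh activation o_t goes to the next layer and h_t goes only to the next time step, which is the separation proposed in the issue.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def gru_cell(x, h_prev, p):
    """One scalar GRU step; returns (o_t, h_t)."""
    z = sigmoid(p["wz_x"] * x + p["wz_h"] * h_prev + p["bz"])
    r = sigmoid(p["wr_x"] * x + p["wr_h"] * h_prev + p["br"])
    o = math.tanh(p["wc_x"] * x + p["wc_h"] * (r * h_prev) + p["bc"])
    h = (1.0 - z) * h_prev + z * o
    return o, h


def run_stack(xs, layer_params, pass_candidate):
    """Run stacked GRU cells over a sequence of scalar inputs.

    pass_candidate=False: each layer feeds h_t to the next layer.
    pass_candidate=True:  each layer feeds o_t to the next layer.
    In both cases h_t is what is carried to the next time step.
    """
    states = [0.0] * len(layer_params)
    outputs = []
    for x in xs:
        signal = x
        for i, p in enumerate(layer_params):
            o, states[i] = gru_cell(signal, states[i], p)
            signal = o if pass_candidate else states[i]
        outputs.append(signal)
    return outputs
```

Both variants evaluate the same gates once per step, which is why their cost is nearly identical; they differ only in which value is handed to the next layer. Which one trains better is an empirical question that this sketch does not answer.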
