
implement bidirectional lstm

Open wcshds opened this issue 2 years ago • 2 comments

I need a bidirectional LSTM for a CRNN model.
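
For anyone skimming the thread: a bidirectional LSTM runs one recurrent pass over the sequence in normal time order, a second pass in reverse, and concatenates the two hidden states at each step, so every output sees both past and future context. The toy sketch below shows only that wiring; it is not burn code, and a simple stand-in replaces the actual gated LSTM cell.

```rust
// Illustration only, not burn's API or this PR's implementation.

/// Stand-in for an LSTM cell: folds the previous hidden state and the current
/// input into a new hidden state (a real cell has input/forget/output gates).
fn toy_step(hidden: &[f32], input: &[f32]) -> Vec<f32> {
    let drive: f32 = input.iter().sum();
    hidden.iter().map(|h| (h + drive).tanh()).collect()
}

/// Run the cell over the whole sequence in one direction, collecting the
/// hidden state at every time step.
fn run_direction(
    inputs: &[Vec<f32>],
    step: fn(&[f32], &[f32]) -> Vec<f32>,
    d_hidden: usize,
) -> Vec<Vec<f32>> {
    let mut hidden = vec![0.0; d_hidden];
    let mut outputs = Vec::with_capacity(inputs.len());
    for input in inputs {
        hidden = step(&hidden, input);
        outputs.push(hidden.clone());
    }
    outputs
}

/// Bidirectional pass: forward over the sequence, backward over the reversed
/// sequence, then concatenate the two hidden states at each time step, giving
/// 2 * d_hidden features per step.
fn bidirectional(inputs: &[Vec<f32>], d_hidden: usize) -> Vec<Vec<f32>> {
    let forward = run_direction(inputs, toy_step, d_hidden);
    let reversed: Vec<Vec<f32>> = inputs.iter().rev().cloned().collect();
    let mut backward = run_direction(&reversed, toy_step, d_hidden);
    backward.reverse(); // realign the reverse pass with the original time order
    forward
        .into_iter()
        .zip(backward)
        .map(|(f, b)| f.into_iter().chain(b).collect())
        .collect()
}

fn main() {
    // A sequence of 4 steps with 3 input features each, hidden size 2.
    let inputs = vec![vec![0.1, 0.2, 0.3]; 4];
    let outputs = bidirectional(&inputs, 2);
    assert_eq!(outputs.len(), 4);
    assert_eq!(outputs[0].len(), 4); // 2 * d_hidden
}
```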

Checklist

  • [x] Confirmed that the run-checks all script has been executed.
  • [x] Made sure the book is up to date with changes in this PR.

wcshds avatar Dec 01 '23 19:12 wcshds

Codecov Report

Attention: Patch coverage is 99.45504% with 2 lines in your changes missing coverage. Please review.

Project coverage is 86.51%. Comparing base (2f294c5) to head (56f6d7c). Report is 3 commits behind head on main.

Files                                 Patch %   Lines
crates/burn-core/src/nn/rnn/lstm.rs   99.45%    2 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1035      +/-   ##
==========================================
+ Coverage   86.41%   86.51%   +0.09%     
==========================================
  Files         696      696              
  Lines       81131    81499     +368     
==========================================
+ Hits        70112    70508     +396     
+ Misses      11019    10991      -28     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Dec 20 '23 04:12 codecov[bot]

I think the current implementation of the bidirectional LSTM works during inference, but it cannot update the model's parameters during backpropagation due to #1098. Using Tensor::cat instead of Tensor::slice_assign would solve this, but it seems to have some negative performance impact, so I think it's best to wait for the Tensor::slice_assign bug to be resolved for now.
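
For context, the two ways of assembling the combined [forward, reverse] output discussed above look roughly like the sketch below. Only Tensor::cat and Tensor::slice_assign come from the comment; the shapes, helper names, and exact signatures are assumptions and may not match the PR or the current burn version.

```rust
// Sketch only: shapes and signatures are assumptions, not taken from the PR.
use burn::tensor::{backend::Backend, Tensor};

/// Concatenate the forward and reverse direction outputs, each assumed to be
/// [batch, seq_length, d_hidden], into [batch, seq_length, 2 * d_hidden].
/// This path keeps autodiff working, at the cost of an extra allocation
/// compared to writing into a preallocated buffer.
fn combine_with_cat<B: Backend>(forward: Tensor<B, 3>, reverse: Tensor<B, 3>) -> Tensor<B, 3> {
    Tensor::cat(vec![forward, reverse], 2)
}

/// Alternative: write both directions into a preallocated, zero-initialized
/// output of shape [batch, seq_length, 2 * d_hidden] with slice_assign.
/// Cheaper in principle, but at the time of the comment above gradients did
/// not flow through slice_assign (see #1098).
fn combine_with_slice_assign<B: Backend>(
    output: Tensor<B, 3>,
    forward: Tensor<B, 3>,
    reverse: Tensor<B, 3>,
    d_hidden: usize,
) -> Tensor<B, 3> {
    let [batch, seq_length, _] = forward.dims();
    output
        .slice_assign([0..batch, 0..seq_length, 0..d_hidden], forward)
        .slice_assign([0..batch, 0..seq_length, d_hidden..2 * d_hidden], reverse)
}
```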

wcshds avatar Jan 12 '24 12:01 wcshds

Closing this PR and linking it to this ticket: https://github.com/tracel-ai/burn/issues/1537, so someone else can pick it up.

antimora avatar Mar 26 '24 23:03 antimora

@antimora Could you reopen this pull request? The issue regarding Autodiff has been resolved in #1575, and I think it's time to proceed with implementing the Bidirectional LSTM.

wcshds avatar Apr 16 '24 12:04 wcshds

The wgpu test failed, but I don't know why...

wcshds avatar Apr 16 '24 14:04 wcshds

What was the error in wgpu?

agelas avatar Apr 16 '24 19:04 agelas

Hm, ok, well on ubuntu-22.04 everything seems fine when I run the tests locally. Given that Ubuntu and Windows work, maybe it hints at compatibility issues or differences in how Metal handles resource management compared to Vulkan (i.e. Linux) or DirectX (Windows)?

agelas avatar Apr 16 '24 22:04 agelas

The outputs of both Lstm and BiLstm are now aligned with PyTorch.

The script I used to generate tests for Bidirectional LSTM can be found here.
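
For readers without the link: presumably the script exports reference outputs from PyTorch's nn.LSTM (with bidirectional=True) so the Rust tests can compare burn's output element-wise within a small tolerance. A minimal sketch of that kind of comparison follows; names, numbers, and tolerance are placeholders, not taken from the actual tests.

```rust
// Placeholder sketch of an element-wise comparison against PyTorch reference
// values; everything here is illustrative, not from the PR's test suite.

/// Assert that two flattened outputs agree element-wise within `tolerance`.
fn assert_close(actual: &[f32], expected: &[f32], tolerance: f32) {
    assert_eq!(actual.len(), expected.len(), "shape mismatch");
    for (i, (a, e)) in actual.iter().zip(expected).enumerate() {
        assert!(
            (a - e).abs() <= tolerance,
            "mismatch at index {i}: got {a}, expected {e}"
        );
    }
}

fn main() {
    // `expected` would be copied from a PyTorch run with the same weights and
    // inputs; `actual` would be the flattened burn BiLstm output.
    let expected = [0.1234_f32, -0.5678, 0.9012];
    let actual = [0.1233_f32, -0.5679, 0.9011];
    assert_close(&actual, &expected, 1e-3);
}
```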

The tests for the wgpu backend still failed, possibly due to some data race issue in that backend?

wcshds avatar Apr 18 '24 04:04 wcshds

@nathanielsimard @louisfd I changed the batch size to 2, and the test passed. This seems to be a bug with the Wgpu backend, so I've opened an issue #1656.

wcshds avatar Apr 18 '24 10:04 wcshds

We will merge it once we have an approval from @laggui.

antimora avatar Apr 26 '24 16:04 antimora