Implement bidirectional LSTM
I need a bidirectional LSTM in a CRNN model.
Checklist
- [x] Confirmed that the `run-checks all` script has been executed.
- [x] Made sure the book is up to date with changes in this PR.
Codecov Report
Attention: Patch coverage is 99.45504%, with 2 lines in your changes missing coverage. Please review.
Project coverage is 86.51%. Comparing base (2f294c5) to head (56f6d7c). Report is 3 commits behind head on main.
| Files | Patch % | Lines |
|---|---|---|
| crates/burn-core/src/nn/rnn/lstm.rs | 99.45% | 2 Missing :warning: |
Additional details and impacted files
```
@@            Coverage Diff             @@
##             main    #1035      +/-   ##
==========================================
+ Coverage   86.41%   86.51%   +0.09%
==========================================
  Files         696      696
  Lines       81131    81499     +368
==========================================
+ Hits        70112    70508     +396
+ Misses      11019    10991      -28
```
I think the current implementation of the bidirectional LSTM works during inference, but it cannot update the model's parameters during backpropagation due to #1098. Using `Tensor::cat` instead of `Tensor::slice_assign` would solve this problem, but it seems to carry a performance cost, so I think it's best to wait for the `Tensor::slice_assign` bug to be resolved for now.
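The structure being discussed, independent of the `Tensor::cat` vs. `Tensor::slice_assign` autodiff issue, is this: run one pass left to right, one pass right to left over the reversed sequence, re-reverse the backward outputs so timesteps line up, then merge the two hidden states per timestep by concatenation along the feature dimension. A minimal plain-Python sketch of that merge logic (a toy recurrence stands in for the real LSTM cell; this is not burn's actual code):

```python
import math

def rnn_step(x, h, w_x=0.5, w_h=0.3):
    # Stand-in for an LSTM cell update (a toy tanh recurrence, no gates).
    return math.tanh(w_x * x + w_h * h)

def bidirectional(xs):
    # Forward direction: left to right.
    h = 0.0
    fwd = []
    for x in xs:
        h = rnn_step(x, h)
        fwd.append(h)
    # Backward direction: right to left over the reversed input,
    # then re-reverse so index t of bwd aligns with index t of fwd.
    h = 0.0
    bwd = []
    for x in reversed(xs):
        h = rnn_step(x, h)
        bwd.append(h)
    bwd.reverse()
    # Merge by concatenating along the feature dimension (the
    # Tensor::cat approach): one [fwd_t, bwd_t] pair per timestep.
    return [[f, b] for f, b in zip(fwd, bwd)]

out = bidirectional([1.0, -1.0, 0.5])
```

The alternative the comment mentions would preallocate a `[seq, 2 * hidden]` buffer and write each direction's states into its half with `Tensor::slice_assign`, which avoids the concatenation but was the operation hitting the autodiff bug.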
Closing this PR and linking it to this ticket so someone else can pick it up: https://github.com/tracel-ai/burn/issues/1537.
@antimora Could you reopen this pull request? The issue regarding Autodiff has been resolved in #1575, and I think it's time to proceed with implementing the Bidirectional LSTM.
The wgpu test failed, but I don't know why...
What was the error in wgpu?
Hm, ok. On ubuntu-22.04 everything seems fine when I run the tests locally. Given that Ubuntu and Windows work, maybe it hints at compatibility issues or differences in how Metal handles resource management compared to Vulkan (i.e. Linux) or DirectX (Windows)?
The outputs of both Lstm and BiLstm are now aligned with PyTorch.
The script I used to generate tests for Bidirectional LSTM can be found here.
The tests for the wgpu backend still failed, possibly due to some data race issues in the wgpu backend?
@nathanielsimard @louisfd I changed the batch size to 2, and the test passed. This seems to be a bug with the Wgpu backend, so I've opened an issue #1656.
We will merge it once we have approval from @laggui.