ludwig
ludwig copied to clipboard
fix: Copy modules and alias weights data when using tied weights
The current tied weights implementation aliases torch modules between tied input features. Best practices seem to be deep copying the modules and then aliasing the underlying weights data (Parameter.data
) between the copies.
Changes include:
- Tests updated to check for shared data pointers rather than shared module objects
- Refactored the tying code in
explain.utils
to be more accessible and work with input feature creation
Unit Test Results
6 files ±0 6 suites ±0 1h 21m 35s :stopwatch: + 9m 54s 33 tests ±0 29 :heavy_check_mark: ±0 4 :zzz: ±0 0 :x: ±0 99 runs ±0 87 :heavy_check_mark: ±0 12 :zzz: ±0 0 :x: ±0
Results for commit a3fdbdd2. ± Comparison against base commit f829345c.
:recycle: This comment has been updated with latest results.
The tests are all passing, but locally I found an IndexError
that occurs in embedding layers when using tied weights. Looking into that and will update here with a fix.
After some more testing, it looks like the IndexError
is likely an issue with sample_ratio
rather than tied weights. Setting sample_ratio: 0.1
in an agnews
config seems to cause the issue regardless of whether to two text features are tied.