ludwig fix: Copy modules and alias weights data when using tied weights

fix: Copy modules and alias weights data when using tied weights

Open jeffkinnison opened this issue 1 year ago • 3 comments

The current tied weights implementation aliases torch modules between tied input features. Best practices seem to be deep copying the modules and then aliasing the underlying weights data (Parameter.data) between the copies.

Changes include:

Tests updated to check for shared data pointers rather than shared module objects
Refactored the tying code in explain.utils to be more accessible and work with input feature creation

May 18 '23 21:05 jeffkinnison

Unit Test Results

  6 files ±0   6 suites ±0 1h 21m 35s :stopwatch: + 9m 54s 33 tests ±0 29 :heavy_check_mark: ±0   4 :zzz: ±0 0 :x: ±0 99 runs ±0 87 :heavy_check_mark: ±0 12 :zzz: ±0 0 :x: ±0

Results for commit a3fdbdd2. ± Comparison against base commit f829345c.

:recycle: This comment has been updated with latest results.

May 18 '23 23:05 github-actions[bot]

The tests are all passing, but locally I found an IndexError that occurs in embedding layers when using tied weights. Looking into that and will update here with a fix.

May 19 '23 15:05 jeffkinnison

After some more testing, it looks like the IndexError is likely an issue with sample_ratio rather than tied weights. Setting sample_ratio: 0.1 in an agnews config seems to cause the issue regardless of whether to two text features are tied.

May 19 '23 16:05 jeffkinnison

ludwig ludwig copied to clipboard

fix: Copy modules and alias weights data when using tied weights

Unit Test Results

ludwig
ludwig copied to clipboard