Refactor use_padding from a global shape inference setting to a field of the `Conv_input` constructor
Good first step on the road to padding inference.
@claude-opus please refactor use_padding from a global shape inference setting to a field of the Conv_input constructor. Localize the choices based on use_padding to now use the field of the corresponding Conv_input -- if there is no Conv_input to deconstruct in the context of a use-case, point out such places in your report.
Claude finished @lukstafi's task — branch `claude/issue-386-20250918-1051`
Refactoring use_padding to Conv_input field ✅
Completed Tasks:
- [x] Search for current use_padding implementation
- [x] Locate Conv_input constructor definition
- [x] Add use_padding field to Conv_input in both dim and proj types
- [x] Update all Conv_input pattern matches in row.ml for both types
- [x] Update all Conv_input constructor calls to include use_padding
- [x] Build and verify changes compile successfully
- [x] Commit and push changes
Summary of Changes
Successfully refactored use_padding from a global shape inference setting to a field of the Conv_input constructor. The changes include:
- Added a `use_padding: bool` field to both:
  - the public `dim` type `Conv_input` constructor in `row.ml` and `row.mli`
  - the internal `proj` type `Conv_input` constructor in `row.ml`
- Updated all pattern matches and constructor calls across `row.ml` to include the `use_padding` field
- Kept the global `use_padding` variable for now because:
  - `shape.ml` creates `Conv_input` during einsum spec parsing
  - the `use_padding` value comes from the `conv2d` function parameter
  - parsing happens independently of the `conv2d` call
  - current workaround: `nn_blocks.ml` sets the global before einsum operations
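The field addition and pattern-match updates described above could look like the following minimal sketch. The variant shape, field names, and output-size formulas here are illustrative stand-ins, not the actual OCANNL `row.ml` definitions:

```ocaml
(* Hypothetical sketch: a dim variant whose Conv_input constructor
   carries its own use_padding flag instead of consulting a global. *)
type dim =
  | Dim of int
  | Conv_input of { stride : int; kernel : int; use_padding : bool }

(* Every pattern match must now bind (or explicitly ignore) the new field. *)
let output_size ~input = function
  | Dim d -> d
  | Conv_input { stride; kernel; use_padding } ->
      if use_padding then (input + stride - 1) / stride  (* "same" padding: ceil division *)
      else (input - kernel) / stride + 1                 (* "valid": no padding *)
```

Once the flag lives on the constructor, call sites that deconstruct a `Conv_input` can make padding-dependent choices locally, without reading any global state.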
Places without Conv_input context
The main location where we cannot directly use the `Conv_input` field is in `shape.ml` (lines 420 and 1798), where `Conv_input` is constructed during einsum specification parsing. These sites currently read the value from the global reference via `!Row.use_padding`.
The hack in `nn_blocks.ml` line 175 (`[%oc Ocannl_tensor.Row.use_padding := use_padding]`) is still necessary to pass the `use_padding` parameter to the shape inference system.
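The global-ref handoff between the two files can be sketched as follows. All names and signatures here are simplified stand-ins for the real `Row.use_padding`, `shape.ml` parsing code, and `nn_blocks.ml` call site:

```ocaml
(* Stand-in for the global in Row (the real one is Row.use_padding). *)
let use_padding : bool ref = ref false

type dim = Conv_input of { stride : int; use_padding : bool }

(* At einsum-spec parsing time there is no conv2d call in scope,
   so the constructor field is filled from the global ref. *)
let parse_conv_spec ~stride = Conv_input { stride; use_padding = !use_padding }

(* The caller (nn_blocks.ml in the report) sets the global first —
   this is the workaround the report describes. *)
let conv2d ~use_padding:p ~stride () =
  use_padding := p;
  parse_conv_spec ~stride
```

The field on `Conv_input` makes the dependency explicit at each use site, but the value still travels from `conv2d` to the parser through mutable global state.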
Next Steps
To fully eliminate the global variable would require one of:
- Extending einsum specification syntax to include padding information
- Adding a new API to configure padding at the operation level
- Refactoring the shape inference system to thread padding context through
For now, the refactoring successfully localizes use_padding to the Conv_input constructor, making the dependency explicit even though the global is still needed as a communication mechanism.
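The third option above, threading padding context through instead of using a global, could be sketched like this. The `env` record and function signatures are hypothetical, not proposals for the actual OCANNL API:

```ocaml
(* Hypothetical parsing environment that carries the padding choice. *)
type env = { use_padding : bool }

type dim = Conv_input of { stride : int; use_padding : bool }

(* The parser receives the environment explicitly rather than
   dereferencing a global at construction time. *)
let parse_conv_spec (env : env) ~stride =
  Conv_input { stride; use_padding = env.use_padding }

(* conv2d builds the environment from its own parameter; no mutation. *)
let conv2d ~use_padding ~stride () =
  parse_conv_spec { use_padding } ~stride
```

This removes the ordering hazard of the global-ref workaround: the flag can no longer be observed before it is set, at the cost of plumbing an extra argument through the parsing entry points.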
This is fixed now.