composer icon indicating copy to clipboard operation
composer copied to clipboard

Change functional surgery method return values to Layers Modified

Open nik-mosaic opened this issue 2 years ago • 6 comments

Closes https://mosaicml.atlassian.net/browse/CO-988 and #1441 . For consistency sake, and to ensure that the user knows the model is being mutated, all applicable functional model surgery methods return # of modified layers (as opposed to None or the model).

nik-mosaic avatar Sep 20 '22 09:09 nik-mosaic

Can we instead return how many layers are changed? Would be very useful for agent bc then it can skip surgeries if they are no-ops

mvpatel2000 avatar Sep 20 '22 14:09 mvpatel2000

This is an option. However, there are lots of algorithms which aren't surgery based, for which there is no clear number to return. e.g.: compute_ema. One priority should be consistency for users using the functional interface. Returning non-none will reduce this.

We currently raise NoEffectWarnings for no-op model surgery algorithms; the agent could catch those warnings.

nik-mosaic avatar Sep 20 '22 17:09 nik-mosaic

Agreed with @nik-mosaic , having the methods return an int would lead to odd bugs when users try:

model = apply_x(model)

whereas None would clearly indicate that this is an in-place operation.

hanlint avatar Sep 20 '22 18:09 hanlint

IMO it's ok to return "something" for apply methods if it makes sense (e.g., number of layers replaced as @mvpatel2000 suggested) but not model if apply method is modifying the model inplace. Any user of apply_* method will look at least look at the API to see what it does and will be able to see return type/value. Therefore, consistency across the board for the apply APIs is, I think, too restrictive.

dskhudia avatar Sep 20 '22 19:09 dskhudia

Agreed with @nik-mosaic , having the methods return an int would lead to odd bugs when users try:

model = apply_x(model)

whereas None would clearly indicate that this is an in-place operation.

I think returning an int would pretty clearly indicate what's going on here, and the docs can be clear this is in-place. It's a lot cleaner to just measure how many layers are changed vs. trying to catch warnings and interpret them.

Basically +1 to Daya's points

mvpatel2000 avatar Sep 21 '22 18:09 mvpatel2000

Per offline discussion, functional model surgery methods will return # of modified layers where applicable and None elsewhere.

nik-mosaic avatar Sep 21 '22 23:09 nik-mosaic

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@nik-mosaic do we close this PR?

hanlint avatar Jan 27 '23 16:01 hanlint

@nik-mosaic do we close this PR?

In a December design discussion, we backtracked on our decision to return the number of layers modified. We decided that for user-friendliness, functional algorithms should return None instead. This PR is now ready to merge.

nik-mosaic avatar Jan 30 '23 21:01 nik-mosaic