
RFC-0016: Masked reductions and normalizations

Open cpuhrsch opened this issue 4 years ago • 8 comments

This RFC discusses the semantics and implementation details of masked reduction and normalization operators.

Rendered

cpuhrsch avatar Aug 26 '21 19:08 cpuhrsch

cc @IvanYashchuk @pearu @ngimel @mruberry @ezyang @jbschlosser

cpuhrsch avatar Aug 26 '21 19:08 cpuhrsch

Feels like it should be prototypable with __torch_function__ (or maybe __torch_dispatch__?)

ezyang avatar Aug 27 '21 03:08 ezyang

@ezyang - does this mean you'd prefer to see this released and packaged out of tree first before considering inclusion in the core?

cpuhrsch avatar Aug 27 '21 17:08 cpuhrsch

does this mean you'd prefer to see this released and packaged out of tree first before considering inclusion in the core?

Not necessarily; I'm referring to this part of the spec:

Indeed the best way to describe the behavior is to implement it. Please note that this is only meant to describe semantics and is not an actual implementation.

It wouldn't be a long step from there to an executable specification that people can play around with.

ezyang avatar Aug 30 '21 03:08 ezyang

Since the nan* reductions, like nansum, are existing masked reductions, we should be sure the semantics are equivalent. This proposal just allows the mask to be specified directly rather than by value. Supporting more general value-based masking might be interesting in the future, too.

cc @heitorschueroff

mruberry avatar Aug 30 '21 03:08 mruberry

@ezyang - agreed, I'm wondering whether or when we should create an out-of-tree Python-only prototype for a MaskedTensor.

@mruberry - you can always get a value-based mask (say, one that excludes the value 4) with e.g. masked_sum(input, input != 4), or masked_sum(input, input == input) for nan.

cpuhrsch avatar Aug 30 '21 13:08 cpuhrsch

masked_sum(input, ~(input != input)) for nan.

Nit: masked_sum(input, input == input) would work for the nan case as well.
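
To make the equivalence concrete, here is a minimal sketch in plain Python. It is not the RFC's implementation: lists stand in for tensors so that it runs without PyTorch, and the `masked_sum(values, mask)` signature is an assumption modelled on the calls quoted in this thread. It shows that `input == input` and `~(input != input)` select the same elements, because NaN is the only floating-point value that compares unequal to itself.

```python
import math


def masked_sum(values, mask):
    """Sum the entries of `values` whose mask entry is True.

    Hypothetical stand-in for the masked_sum discussed in this thread;
    the real operator would act on tensors, not Python lists.
    """
    if len(values) != len(mask):
        raise ValueError("values and mask must have the same length")
    return sum(v for v, keep in zip(values, mask) if keep)


data = [1.0, 4.0, math.nan, 2.0]

# Value-based mask: drop every entry equal to 4. The NaN entry is kept,
# because NaN != 4 evaluates to True.
not_four = [v != 4 for v in data]

# NaN is the only float that is not equal to itself, so `v == v` is False
# exactly at the NaN entries. `not (v != v)` builds the same mask.
not_nan = [v == v for v in data]
not_nan_alt = [not (v != v) for v in data]

# Masks compose elementwise: exclude both the value 4 and NaN.
not_four_not_nan = [a and b for a, b in zip(not_four, not_nan)]

print(not_nan == not_nan_alt)                # True
print(masked_sum(data, not_nan))             # 7.0
print(masked_sum(data, not_four_not_nan))    # 3.0
```

With only the `not_four` mask the result is NaN, since the NaN entry still takes part in the sum. That is why a nansum-style reduction needs the `input == input` mask.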

pearu avatar Aug 30 '21 14:08 pearu

agreed, I'm wondering whether or when we should create an out-of-tree Python-only prototype for a MaskedTensor.

If it's just one person, probably sticking it in a colab is good enough. If multiple people want to work on the semantics, put it on GitHub somewhere.

ezyang avatar Aug 30 '21 16:08 ezyang