Add `min` and `max` rev specializations?
From this Stack Overflow post it looks like we can phrase $\max(a,b)$ as
$$ \max(a,b) = \frac{a + b + |a-b|}{2} $$
and similarly for min
$$ \min(a,b) = \frac{a + b - |a-b|}{2} $$
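As a quick sanity check (not part of the proposal), the identities are easy to verify numerically; `smooth_max`/`smooth_min` below are just illustrative names:

```python
import math
import random


def smooth_max(a, b):
    # max(a, b) rewritten as (a + b + |a - b|) / 2
    return (a + b + abs(a - b)) / 2


def smooth_min(a, b):
    # min(a, b) rewritten as (a + b - |a - b|) / 2
    return (a + b - abs(a - b)) / 2


for _ in range(1000):
    a, b = random.uniform(-10.0, 10.0), random.uniform(-10.0, 10.0)
    assert math.isclose(smooth_max(a, b), max(a, b), rel_tol=1e-9, abs_tol=1e-12)
    assert math.isclose(smooth_min(a, b), min(a, b), rel_tol=1e-9, abs_tol=1e-12)
```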
So except for the case $a=b$, both functions are differentiable. For max we have
$$ \frac{\partial \max}{\partial a} = \frac{1}{2} + \frac{\operatorname{sign}(a-b)}{2} $$
so if $a>b$ the gradient is 1 and if $a<b$ the gradient is 0 (taking $\operatorname{sign}(0)=0$ at $a=b$ gives 1/2 to each argument). For min we have
$$ \frac{\partial \min}{\partial a} = \frac{1}{2} - \frac{\operatorname{sign}(a-b)}{2} $$
where if $a>b$ the gradient is 0 and if $a<b$ the gradient is 1 (again 1/2 at $a=b$ under the same convention).
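To make the proposed rules concrete, here is a minimal sketch of the reverse-mode (vector-Jacobian product) rules, written as plain Python rather than against the library's actual rev types; the names `max_vjp`/`min_vjp` are made up for illustration:

```python
import math


def sign(x):
    # sign(0) = 0, which produces the 1/2-1/2 split at a == b
    return math.copysign(1.0, x) if x != 0 else 0.0


def max_vjp(a, b, grad_out):
    """Reverse rule for max(a, b): returns (d/da, d/db) scaled by grad_out."""
    s = sign(a - b)
    return grad_out * (0.5 + 0.5 * s), grad_out * (0.5 - 0.5 * s)


def min_vjp(a, b, grad_out):
    """Reverse rule for min(a, b): returns (d/da, d/db) scaled by grad_out."""
    s = sign(a - b)
    return grad_out * (0.5 - 0.5 * s), grad_out * (0.5 + 0.5 * s)


print(max_vjp(3.0, 1.0, 1.0))  # (1.0, 0.0): gradient flows to the larger argument
print(max_vjp(2.0, 2.0, 1.0))  # (0.5, 0.5): a tie splits the gradient evenly
print(min_vjp(3.0, 1.0, 1.0))  # (0.0, 1.0): gradient flows to the smaller argument
```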
From this it seems like PyTorch does something similar to the above. I think this would be nice to implement. We could also add vectorized versions, e.g. `min(vector)` and `max(vector)`, along the same lines.
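For the vector reductions, one option (again just a sketch, with the same tie convention as the scalar case) is to route the incoming gradient to the entries that attain the extremum:

```python
def min_reduce_vjp(v, grad_out):
    """Reverse rule for min over a list: gradient goes to the minimizing entries.

    Ties split grad_out evenly across all entries equal to the minimum,
    mirroring the 1/2-1/2 convention used for the binary case above.
    """
    m = min(v)
    hits = [i for i, x in enumerate(v) if x == m]
    grad = [0.0] * len(v)
    for i in hits:
        grad[i] = grad_out / len(hits)
    return grad


print(min_reduce_vjp([3.0, 1.0, 2.0], 1.0))  # [0.0, 1.0, 0.0]
print(min_reduce_vjp([1.0, 1.0, 2.0], 1.0))  # [0.5, 0.5, 0.0]
```

The `max(vector)` rule would be identical with `max` in place of `min`.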