cgt
Elementwise op with arbitrary arity
It could be useful to unify and generalize the elementwise unary and binary ops into a single op that supports an arbitrary number of arguments. This would allow for some nice graph simplifications. In particular, a composition of elementwise operations (for example, a common implementation of ReLU) could be performed with a single CUDA kernel launch. A general elementwise arithmetic op could store a symbolic expression, which could even be simplified with a library like SymPy.
Agreed. As in Theano, we could have a composite elementwise operation, plus an optimization that fuses elementwise operations into it.
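To make the idea concrete, here is a rough pure-Python sketch of an n-ary elementwise node and a fusion step. All names here (`Input`, `Elwise`, `compile_scalar`, `run_fused`) are hypothetical and not part of cgt's API, and the single list comprehension only stands in for what would be one generated CUDA kernel. ReLU is written as `0.5 * (x + abs(x))`, which is one possible composition, chosen here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union


@dataclass(frozen=True)
class Input:
    """Leaf node: refers to the i-th array argument (hypothetical name)."""
    index: int


@dataclass(frozen=True)
class Elwise:
    """Elementwise op of arbitrary arity (hypothetical name).

    `fn` is a scalar function taking len(args) scalars.
    """
    fn: Callable[..., float]
    args: Tuple["Node", ...]


Node = Union[Input, Elwise]


def compile_scalar(node):
    """Fuse a tree of Elwise nodes into one scalar function of the leaf inputs."""
    if isinstance(node, Input):
        return lambda *xs: xs[node.index]
    arg_fns = [compile_scalar(a) for a in node.args]
    return lambda *xs: node.fn(*(f(*xs) for f in arg_fns))


def run_fused(node, *arrays):
    """Evaluate the fused expression in one pass over the inputs.

    The single loop is a stand-in for a single kernel launch.
    """
    scalar = compile_scalar(node)
    return [scalar(*xs) for xs in zip(*arrays)]


# ReLU as a composition of three elementwise ops: 0.5 * (x + abs(x)).
x = Input(0)
relu = Elwise(lambda s: 0.5 * s,
              (Elwise(lambda a, b: a + b, (x, Elwise(abs, (x,)))),))
relu_out = run_fused(relu, [-2.0, 0.0, 3.0])  # [0.0, 0.0, 3.0]

# A ternary op (fused multiply-add) uses the same node type.
fma = Elwise(lambda a, b, c: a * b + c, (Input(0), Input(1), Input(2)))
fma_out = run_fused(fma, [1, 2], [3, 4], [5, 6])  # [8, 14]
```

In a real implementation the fused node would presumably hold a symbolic expression rather than nested Python closures, so that it could be simplified (for example with SymPy, as suggested above) and then emitted as kernel source.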