blaze_cuda
Partial evaluation: Adapt more expressions
Most of the work is done. To support partial evaluation properly, cudaAssign() now needs an overload for every expression. Each overload should follow the same implementation pattern as in DMatDMatAddExpr.h:
- External to the original expression templates
- Implement the same functionalities as their CPU counterparts
- Follow the same enable condition as their CPU counterparts
- Call cudaAssign() instead of assign()
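The pattern above can be illustrated with a minimal, CPU-only sketch. It does not use the real Blaze or blaze_cuda API: Mat, AddExpr, IsOperand, UseAssign and exampleSum are hypothetical stand-ins, host memory replaces device memory so the example runs anywhere, and the enable condition is a placeholder for whatever condition the CPU assign() overload uses. It only shows the shape of the overloads: free functions outside the expression class, guarded by an enable condition, recursing through cudaAssign() and cudaAddAssign() rather than assign().

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a dense CUDA matrix (host memory in this sketch).
struct Mat {
    std::size_t rows, cols;
    std::vector<float> v;
    Mat(std::size_t r, std::size_t c, float init = 0.0f)
        : rows(r), cols(c), v(r * c, init) {}
};

// Toy counterpart of DMatDMatAddExpr: holds references to both operands.
template <typename L, typename R>
struct AddExpr { const L& lhs; const R& rhs; };

// Toy trait: which types may appear as operands of operator+.
template <typename T> struct IsOperand : std::false_type {};
template <> struct IsOperand<Mat> : std::true_type {};
template <typename L, typename R>
struct IsOperand<AddExpr<L, R>> : std::true_type {};

// Placeholder for the enable condition of the CPU counterpart (assumption:
// always enabled here; the real condition depends on the operand types).
template <typename L, typename R> struct UseAssign : std::true_type {};

template <typename L, typename R,
          typename = std::enable_if_t<IsOperand<L>::value && IsOperand<R>::value>>
AddExpr<L, R> operator+(const L& a, const R& b) { return {a, b}; }

// Leaf overloads: a real version would launch a kernel or call a library routine.
inline void cudaAssign(Mat& out, const Mat& rhs) { out = rhs; }
inline void cudaAddAssign(Mat& out, const Mat& rhs) {
    for (std::size_t i = 0; i < out.v.size(); ++i) out.v[i] += rhs.v[i];
}

// Overloads for the expression, defined outside the expression class.
template <typename L, typename R>
std::enable_if_t<UseAssign<L, R>::value> cudaAddAssign(Mat& out, const AddExpr<L, R>& e) {
    cudaAddAssign(out, e.lhs);  // recurse with the CUDA entry point, not addAssign()
    cudaAddAssign(out, e.rhs);
}

template <typename L, typename R>
std::enable_if_t<UseAssign<L, R>::value> cudaAssign(Mat& out, const AddExpr<L, R>& e) {
    cudaAssign(out, e.lhs);     // evaluate the left operand into the target
    cudaAddAssign(out, e.rhs);  // then accumulate the right operand
}

// Usage: evaluates (a + b) + c into out without materializing a + b separately.
inline Mat exampleSum() {
    Mat a(2, 2, 1.0f), b(2, 2, 2.0f), c(2, 2, 3.0f), out(2, 2);
    cudaAssign(out, a + b + c);
    return out;
}
```

In this sketch the nested expression is consumed inside a single statement, so the references stored in AddExpr never outlive their operands. The sketch also ignores aliasing between the target and the operands, which a real overload would have to handle.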
The overloads will be implemented with cuBLAS wherever possible.
Here's a list of the expressions that are already implemented:
DMatDMatMapExpr
DMatMapExpr
DMatDMatSubExpr
DMatDMatAddExpr
DMatDMatMultExpr
DMatSerialExpr
DMatTransExpr
Here's a list of expressions being worked on:
DMatDVecMultExpr: requires a bit of work on the cuBLAS part
DVecDVecInnerExpr: currently supports only plain vectors; Blaze needs modifications before it works seamlessly with views and CUDA-compatible expressions
And for starters, here's a list of expressions to implement:
DMatDeclLowExpr
DMatScalarDivExpr
TVecMatMultExpr
DVecTransExpr
DVecDVecSubExpr
DVecDVecOuterExpr
DMatDMatKronExpr
DMatDeclSymExpr
DMatDMatEqualExpr
DMatDMatSchurExpr
DVecSoftmaxExpr
TDMatTDMatMultExpr
DMatDeclDiagExpr
DVecDVecMapExpr
DMatNormExpr
DMatMeanExpr
DMatDeclHermExpr
DMatTDMatMapExpr
DMatTDMatMultExpr
DMatTDMatSchurExpr
DVecSerialExpr
DVecDVecAddExpr
DVecDVecCrossExpr
DVecScalarDivExpr
DVecDVecKronExpr
DMatSoftmaxExpr
DVecVarExpr
TDMatDVecMultExpr
DVecEvalExpr
TDVecTDMatMultExpr
DVecNormExpr
DVecReduceExpr
DVecExpandExpr
DVecDVecMultExpr
DVecMapExpr
DVecDVecDivExpr
DMatTDMatAddExpr
DMatTDMatSubExpr
TDMatDMatMultExpr
DMatDetExpr
DMatScalarMultExpr
DMatDeclUppExpr
DVecScalarMultExpr
DMatEvalExpr
DMatInvExpr
DMatReduceExpr
DVecDVecEqualExpr
DMatStdDevExpr
DMatVarExpr
DVecMeanExpr
DVecStdDevExpr
TDVecDMatMultExpr