
Partial evaluation: Adapt more expressions

Open JPenuchot opened this issue 6 years ago • 1 comments

Most of the work is done now. To support partial evaluation properly, cudaAssign() still needs overloads for every expression, following the same implementation pattern as in DMatDMatAddExpr.h:

  • External to the original expression templates
  • Implement the same functionality as their CPU counterparts
  • Follow the same enable condition as their CPU counterparts
  • Call cudaAssign() instead of assign()

cuBLAS will be used as much as possible to implement them.
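
The pattern in the list above can be illustrated with a small, CPU-only sketch. It does not use Blaze or CUDA: `Vec`, `AddExpr`, `IsOperand` and `RequiresEvaluation` are hypothetical stand-ins invented for this example, and the loop stands in for what would be a kernel launch or cuBLAS call. The only names taken from the issue are `cudaAssign()` and the idea that overloads live outside the expression class, are selected by an enable condition, and evaluate nested operands by calling `cudaAssign()` again rather than `assign()`.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a dense vector type (not a Blaze type).
struct Vec {
    std::vector<float> data;
    std::size_t size() const { return data.size(); }
};

// Expression node for lhs + rhs. It holds references only and contains
// no assignment logic: the overloads below are external to it.
template <typename L, typename R>
struct AddExpr {
    const L& lhs;
    const R& rhs;
};

// Hypothetical trait: which types may appear as operands of operator+.
template <typename T> struct IsOperand : std::false_type {};
template <> struct IsOperand<Vec> : std::true_type {};
template <typename L, typename R>
struct IsOperand<AddExpr<L, R>> : std::true_type {};

// Hypothetical trait playing the role of the "enable condition":
// everything except a plain Vec must be evaluated before use.
template <typename T> struct RequiresEvaluation : std::true_type {};
template <> struct RequiresEvaluation<Vec> : std::false_type {};

template <typename L, typename R,
          std::enable_if_t<IsOperand<L>::value && IsOperand<R>::value, int> = 0>
AddExpr<L, R> operator+(const L& l, const R& r) { return {l, r}; }

// Base case: plain copy (stand-in for a device-side copy).
inline void cudaAssign(Vec& dst, const Vec& src) { dst.data = src.data; }

// Overload selected when both operands are plain vectors.
// The loop is a placeholder for the actual device computation.
template <typename L, typename R,
          std::enable_if_t<!RequiresEvaluation<L>::value &&
                           !RequiresEvaluation<R>::value, int> = 0>
void cudaAssign(Vec& dst, const AddExpr<L, R>& e) {
    assert(e.lhs.size() == e.rhs.size());
    dst.data.resize(e.lhs.size());
    for (std::size_t i = 0; i < dst.data.size(); ++i)
        dst.data[i] = e.lhs.data[i] + e.rhs.data[i];
}

// Overload selected when at least one operand is itself an expression:
// evaluate the operands into temporaries through cudaAssign() (not
// assign()), then add the two plain results.
template <typename L, typename R,
          std::enable_if_t<RequiresEvaluation<L>::value ||
                           RequiresEvaluation<R>::value, int> = 0>
void cudaAssign(Vec& dst, const AddExpr<L, R>& e) {
    Vec a, b;
    cudaAssign(a, e.lhs);
    cudaAssign(b, e.rhs);
    cudaAssign(dst, AddExpr<Vec, Vec>{a, b});
}
```

With these definitions, a call such as `cudaAssign(out, a + b + c)` goes through the second overload, which evaluates `a + b` into a temporary before adding `c`. The real overloads would mirror whatever enable conditions the CPU `assign()` counterparts use in Blaze; the trait here is only a placeholder for that.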

JPenuchot, Jul 23 '19 15:07

Here's a list of the expressions that are already implemented:

DMatDMatMapExpr
DMatMapExpr
DMatDMatSubExpr
DMatDMatAddExpr
DMatDMatMultExpr
DMatSerialExpr
DMatTransExpr

Here's a list of expressions being worked on:

DMatDVecMultExpr:
  Requires a bit of work on the cuBLAS part
DVecDVecInnerExpr:
  Only works with plain vectors for now; requires modifications to Blaze to work
  seamlessly with views and CUDA-compatible expressions

And for starters, here's a list of expressions to implement:

DMatDeclLowExpr
DMatScalarDivExpr
TVecMatMultExpr
DVecTransExpr
DVecDVecSubExpr
DVecDVecOuterExpr
DMatDMatKronExpr
DMatDeclSymExpr
DMatDMatEqualExpr
DMatDMatSchurExpr
DVecSoftmaxExpr
TDMatTDMatMultExpr
DMatDeclDiagExpr
DVecDVecMapExpr
DMatNormExpr
DMatMeanExpr
DMatDeclHermExpr
DMatTDMatMapExpr
DMatTDMatMultExpr
DMatTDMatSchurExpr
DVecSerialExpr
DVecDVecAddExpr
DVecDVecCrossExpr
DVecScalarDivExpr
DVecDVecKronExpr
DMatSoftmaxExpr
DVecVarExpr
TDMatDVecMultExpr
DVecEvalExpr
TDVecTDMatMultExpr
DVecNormExpr
DVecReduceExpr
DVecExpandExpr
DVecDVecMultExpr
DVecMapExpr
DVecDVecDivExpr
DMatTDMatAddExpr
DMatTDMatSubExpr
TDMatDMatMultExpr
DMatDetExpr
DMatScalarMultExpr
DMatDeclUppExpr
DVecScalarMultExpr
DMatEvalExpr
DMatInvExpr
DMatReduceExpr
DVecDVecEqualExpr
DMatStdDevExpr
DMatVarExpr
DVecMeanExpr
DVecStdDevExpr
TDVecDMatMultExpr

JPenuchot, Aug 14 '19 18:08