Cydral
> Sorry I took so long to come back to this. Been a busy few weeks :|
>
> Anyway, I just pulled this and tried to run the unit...
I'm closing this PR because I've merged all the changes into the other PR containing the new definitions for multm_prev.
We can change the name without any problem. I am already dealing with compilation issues, likely due to static uses of mult_prev_ (in the template part), and we will decide...
On the other hand, I was thinking of applying the same naming convention to the transformation applied to softmax, and thus having a dedicated layer named softmaxm. So we...
It would be simpler for some people to use, but we would lose the flexibility to build the attention mechanism in a specific way when needed (even though it currently follows fairly standard...
Indeed, I can add a high-level declaration in the layer definition file, similar to what was done for the inception layer, like:

```
template <...>
using attention_ = (...)
```
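To give an idea of the shape such a declaration could take, here is a minimal sketch in the spirit of the inception block aliases. It is not the PR's actual code: the multm_prev1/multm_prev2 and softmaxm layer names are taken from the discussion above and assumed to follow dlib's usual tagged-layer interface, and the proj helper, the tag numbers and the d parameter are purely illustrative.

```
// Hedged sketch only, not the PR's actual declaration.
// Assumes multm_prev1/multm_prev2 (from this PR) and softmaxm (discussed
// above) follow dlib's usual tagged-layer interface; the proj helper, tag
// numbers and the d parameter are illustrative.
#include <dlib/dnn.h>

namespace dlib
{
    // 1x1 convolution used as a linear projection of the block input.
    template <long d, typename SUBNET>
    using proj = con<d, 1, 1, 1, 1, SUBNET>;

    // Single-head self-attention: softmaxm(Q*K) * V, with the K transpose
    // handling left to the multm_prev layer in this sketch.
    template <long d, typename SUBNET>
    using attention_ =
        multm_prev1<            // attention weights * V
        softmaxm<               // row-wise softmax over the score matrix
        multm_prev2<            // Q * K
        proj<d, skip3<          // Q projection of the block input
        tag1<proj<d, skip3<     // V projection, tagged for the first matmul
        tag2<proj<d,            // K projection, tagged for the second matmul
        tag3<SUBNET>>>>>>>>>>>;
}
```

The point is only that the composition could be exposed as a single alias, the way inception4 wraps its branch blocks, while the lower-level layers would remain available to anyone who wants a non-standard wiring.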
@davis, no, you can do the merging. I think the conflicts with master come from the fact that I created several branches from my own dlib fork to be...
@Davis, could you please review this PR?
@Davis, I think I'm on the right track now (it was quite difficult to find a way to share an enum between all the classes and make it accessible from both the CPU and CUDA code...
> @davis, I think I'm on the right track now (it was quite difficult to find a way to share an enum between all the classes and make it accessible from both the CPU and CUDA...
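For readers following along, here is a minimal sketch of the shared-enum arrangement described in the two comments above. The header and file names, the enum name operation_mode and its values, and the cpu_softmax signature are assumptions for illustration, not the PR's actual code; the point is only that a plain enum in one header can be included from both the CPU translation unit and the .cu file so the two implementations agree on the same values.

```
// Hedged sketch of the shared-enum idea, not the PR's actual code.
// "operation_mode", its values, and the file names are assumptions here.

// shared_mode.h -- one plain header visible to both CPU and CUDA code.
#pragma once
namespace dlib
{
    enum class operation_mode
    {
        CHANNEL_WISE = 0,   // existing per-channel softmax behaviour
        PLANE_WISE   = 1    // per-row softmax over each matrix/plane (softmaxm)
    };
}

// cpu_softmax.cpp -- the CPU path branches on the shared enum.
#include "shared_mode.h"
#include <cstddef>

void cpu_softmax(float* data, std::size_t n, dlib::operation_mode mode)
{
    if (mode == dlib::operation_mode::PLANE_WISE)
    {
        // ... row-wise softmax over each plane ...
    }
    else
    {
        // ... existing channel-wise softmax ...
    }
    (void)data; (void)n;    // bodies elided in this sketch
}

// cuda_softmax.cu -- nvcc compiles the same header, so device-side kernels
// can dispatch on identical enum values without a second definition.
```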