mace ArgMax operation (GPU Implementation)

Open vanmaxim opened this issue 5 years ago • 4 comments

Jul 23 '19 15:07 vanmaxim

@nolanliou @llhe Hi. Thanks for your amazing work! I need some help. How do I create an output tensor of int type? And how do I create an output tensor with three dimensions (as in the CPU implementation)?

Jul 23 '19 15:07 vanmaxim

@vanmaxim Thanks for your contributions!

Could you explain why you need the argmax operation on GPU? Is there some shortcoming in the CPU implementation?
OpenCL is not good at control-flow statements such as if-else, and the same is true of CUDA, so the GPU version may take more time than the CPU one.
We don't support a 3-D memory layout for the OpenCL Image struct yet; refer to: https://mace.readthedocs.io/en/latest/development/memory_layout.html. Mapping tensor dimensions to Images seems to be hard.
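
For reference, a minimal CPU-side sketch of argmax over the innermost axis, showing the data-dependent comparison that the control-flow remark above refers to. `ArgMaxLastAxis` and its parameters are hypothetical names for illustration only; this is not MACE's CPU or GPU implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of MACE): argmax over the innermost axis of a
// row-major buffer holding `outer` rows of `inner` elements each.
std::vector<int32_t> ArgMaxLastAxis(const std::vector<float>& input,
                                    std::size_t outer, std::size_t inner) {
  assert(inner > 0 && input.size() == outer * inner);
  std::vector<int32_t> result(outer);
  for (std::size_t o = 0; o < outer; ++o) {
    const float* row = input.data() + o * inner;
    std::size_t best = 0;
    for (std::size_t i = 1; i < inner; ++i) {
      // Data-dependent branch: whether `best` changes depends on the values,
      // which is the kind of if-else that tends to diverge on a GPU.
      if (row[i] > row[best]) best = i;
    }
    // On ties the first maximum wins, because the comparison is strict.
    result[o] = static_cast<int32_t>(best);
  }
  return result;
}
```

Note that the result is naturally an integer index per row, which is why the output type question comes up.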

Jul 24 '19 07:07 yejw5

@yejw5 How can I make an output tensor of int type? P.S. I'll prepare performance benchmarks.

Jul 25 '19 16:07 vanmaxim

@vanmaxim Did you mean the model's output in MaceTensor::data()? Currently, only the float data type is supported, but you can reinterpret_cast the data pointer to an int type.
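
A minimal sketch of the reinterpret_cast suggestion, assuming the float-typed output buffer really contains 32-bit integer bit patterns written by the op. It uses a plain `const float*` instead of the MACE API so that it stays self-contained; `ViewAsInt32` and `ReadAsInt32` are hypothetical helper names. The `std::memcpy` variant is an alternative swapped in here because it avoids pointer-aliasing concerns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == sizeof(int32_t),
              "this sketch assumes 32-bit float and 32-bit int");

// Hypothetical helper: reinterprets a float-typed pointer as int32, as the
// reply suggests. Only meaningful if the buffer was filled with int32 values.
const int32_t* ViewAsInt32(const float* data) {
  return reinterpret_cast<const int32_t*>(data);
}

// Hypothetical helper: copies the raw bytes into an int32 vector instead of
// aliasing the original storage.
std::vector<int32_t> ReadAsInt32(const float* data, std::size_t count) {
  assert(data != nullptr || count == 0);
  std::vector<int32_t> out(count);
  if (count > 0) {
    std::memcpy(out.data(), data, count * sizeof(int32_t));
  }
  return out;
}
```

If the buffer instead holds indices stored as float values (for example 3.0f), a bit-level reinterpretation would give garbage, and a per-element `static_cast<int32_t>` would be the conversion to use.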

Jul 26 '19 01:07 yejw5
