
ArgMax operation (GPU Implementation)

Open vanmaxim opened this issue 5 years ago • 4 comments

vanmaxim avatar Jul 23 '19 15:07 vanmaxim

@nolanliou @llhe Hi. Thanks for your amazing work! I need some help. I don't understand how to make an output tensor with int type. Also, how can I make an output tensor with three dimensions (as in the CPU implementation)?

vanmaxim avatar Jul 23 '19 15:07 vanmaxim

@vanmaxim Thanks for your contributions!

  • Could you explain why you need the ArgMax operation on the GPU? Is there a shortcoming in the CPU implementation?
  • OpenCL is not good at control-flow statements such as if-else, and the same is true of CUDA, so the GPU version may take more time than the CPU one.
  • We don't support a 3-D memory layout for the OpenCL Image struct yet; refer to: https://mace.readthedocs.io/en/latest/development/memory_layout.html. Mapping tensor dimensions to Images seems to be hard.
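
For reference, the reduction being discussed can be sketched in plain C++. This is a minimal illustration, not MACE's CPU or GPU kernel: the function name `ArgMaxLastAxis` and the flat row-major input are assumptions made for the example. It takes the argmax over the last (channel) axis, so an N×H×W×C input yields N×H×W indices, which is where the three-dimensional int output comes from. The per-element comparison in the inner loop is the kind of branch the second point refers to.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not a MACE API): argmax over the last axis
// of a flat, row-major buffer. `channels` is the size of the last axis.
// Returns one int32 index per group of `channels` values; ties go to the
// lowest index.
std::vector<int32_t> ArgMaxLastAxis(const std::vector<float>& input,
                                    std::size_t channels) {
  std::vector<int32_t> indices;
  if (channels == 0) return indices;
  const std::size_t outer = input.size() / channels;
  indices.reserve(outer);
  for (std::size_t o = 0; o < outer; ++o) {
    const float* row = input.data() + o * channels;
    std::size_t best = 0;
    for (std::size_t c = 1; c < channels; ++c) {
      if (row[c] > row[best]) best = c;  // data-dependent branch
    }
    indices.push_back(static_cast<int32_t>(best));
  }
  return indices;
}
```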

yejw5 avatar Jul 24 '19 07:07 yejw5

@yejw5 How can I make an output tensor with int type? P.S. I'll prepare performance benchmarks.

vanmaxim avatar Jul 25 '19 16:07 vanmaxim

@vanmaxim Do you mean the model's output in MaceTensor::data()? Currently, only the float data type is supported, but you can reinterpret_cast the data pointer to an int pointer.
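
A minimal sketch of that suggestion, under stated assumptions: the float buffer below stands in for the memory behind `MaceTensor::data()`, and the ArgMax kernel is assumed to have written raw 32-bit integer indices into it (if the kernel stored the indices as float values instead, a plain `static_cast<int>` per element would be the right conversion). The helper names are hypothetical and not part of MACE. `std::memcpy` is used in place of a direct `reinterpret_cast` dereference because it reads the same bits without running into C++ aliasing rules.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == sizeof(int32_t),
              "sketch assumes 32-bit float storage");

// Hypothetical stand-in for a kernel that stores int32 indices in float memory.
void WriteInt32IntoFloatBuffer(const std::vector<int32_t>& indices,
                               float* data) {
  std::memcpy(data, indices.data(), indices.size() * sizeof(int32_t));
}

// Hypothetical helper: reads `count` elements of a float buffer as int32 bits.
std::vector<int32_t> ReadAsInt32(const float* data, std::size_t count) {
  assert(data != nullptr || count == 0);
  std::vector<int32_t> out(count);
  if (count > 0) {
    std::memcpy(out.data(), data, count * sizeof(int32_t));
  }
  return out;
}
```

With a real model, `data` would be the raw pointer obtained from the output tensor and `count` the product of its shape dimensions.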

yejw5 avatar Jul 26 '19 01:07 yejw5