mace
ArgMax operation (GPU Implementation)
@nolanliou @llhe Hi. Thanks for your amazing work!
I need some help. How do I make an output tensor with int type? Also, how do I make an output tensor with three dimensions (as in the CPU implementation)?
@vanmaxim Thanks for your contributions!
- Could you explain why you need the argmax operation on GPU? Is there some shortcoming in the CPU implementation?
- OpenCL is not good at control-flow statements such as if-else, and the same is true of CUDA, so the GPU version may take more time than the CPU one.
- We don't yet support a 3-D memory layout for the OpenCL Image struct; refer to: https://mace.readthedocs.io/en/latest/development/memory_layout.html. It seems hard to map tensor dimensions to Images.
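For reference, here is a minimal sketch of the operation under discussion: argmax over the innermost axis of a flat float buffer. It is not MACE's CPU kernel; `ArgMaxLastAxis` and its parameters are hypothetical names used only for illustration. The per-element comparison in the inner loop is the kind of branch referred to in the second point above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not MACE code): treats `input` as a
// row-major [outer, inner] array and returns, for each row, the index of
// its largest element. Ties resolve to the lowest index.
std::vector<int32_t> ArgMaxLastAxis(const std::vector<float>& input,
                                    std::size_t inner) {
  std::vector<int32_t> out;
  if (inner == 0) return out;  // nothing to reduce over
  const std::size_t outer = input.size() / inner;
  out.reserve(outer);
  for (std::size_t o = 0; o < outer; ++o) {
    const float* row = input.data() + o * inner;
    std::size_t best = 0;
    for (std::size_t i = 1; i < inner; ++i) {
      // Data-dependent branch: cheap on CPU, potentially divergent on GPU.
      if (row[i] > row[best]) best = i;
    }
    out.push_back(static_cast<int32_t>(best));
  }
  return out;
}
```

Calling `ArgMaxLastAxis({1, 3, 2, 9, 0, 4}, 3)` reduces two rows of three values each and yields one index per row.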
@yejw5 How can I make an output tensor with int type?
PS. I'll prepare performance benchmarks.
@vanmaxim Did you mean the model's output in MaceTensor::data()? Currently, only the float data type is supported. But you can reinterpret_cast the data pointer as int type.
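A minimal sketch of reading integers out of a float output buffer, assuming the op wrote 32-bit integer bit patterns into it. It takes a plain `const float*` instead of using the MACE API directly, and both helper names are hypothetical. It uses `std::memcpy` instead of dereferencing a `reinterpret_cast` pointer, which gives the same bit-level reinterpretation without running into C++ strict-aliasing rules. If the op stored the indices as ordinary float values instead, a per-element `static_cast` is what you would want.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: reinterprets the raw bits of `count` floats as int32.
// Only meaningful if the producer wrote int32 bit patterns into the buffer.
std::vector<int32_t> ReadInt32Bits(const float* data, std::size_t count) {
  static_assert(sizeof(float) == sizeof(int32_t), "float must be 32 bits");
  assert(data != nullptr || count == 0);
  std::vector<int32_t> out(count);
  if (count > 0) {
    std::memcpy(out.data(), data, count * sizeof(int32_t));
  }
  return out;
}

// Hypothetical alternative: the producer stored indices as float values
// (e.g. 5.0f), so convert each value numerically instead of copying bits.
std::vector<int32_t> ConvertFloatValues(const float* data, std::size_t count) {
  assert(data != nullptr || count == 0);
  std::vector<int32_t> out(count);
  for (std::size_t i = 0; i < count; ++i) {
    out[i] = static_cast<int32_t>(data[i]);
  }
  return out;
}
```

Which of the two applies depends on how the argmax kernel writes its result, which this thread does not settle.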