What if the custom layer's input is five-dimensional,
like [1, 1, 3, 64, 64]?
How do I write the shader when the parameter is declared as
texture2d_array<half, access::read> inTexture [[texture(0)]],
Does it use texture2d, or is there a texture5d?
All CoreML tensors are 5-dimensional, but I'm not really sure what happens when the first two dimensions (batch size and sequence length) are not actually 1.
Even if the first two dimensions are actually 1x1, I don't know how to write a GPU version of the custom layer. Suppose the input dimensions are [1, 1, 3, 64, 64]: how do I write the GPU version, given that the texture is 2-dimensional?
And what if the input dimensions are [1, 1, 128, 1024, 1024], where the third dimension is 128, not 3?
The texture is actually a texture array (texture2d_array), which has multiple texture slices. Each slice holds 4 channels.
So can I write code like this if the input texture is [1, 1, 128, 1024, 1024] and the output is [1, 1, 128, 1024, 1024]?
`kernel void slice(
texture2d_array<half, access::read> inTexture [[texture(0)]],
texture2d_array<half, access::read> inTexture1 [[texture(1)]],
texture2d_array<half, access::read> inTexture2 [[texture(2)]],
texture2d_array<half, access::read> inTexture3 [[texture(3)]],
texture2d_array<half, access::read> inTexture4 [[texture(4)]],
texture2d_array<half, access::write> outTexture [[texture(5)]],
texture2d_array<half, access::write> outTexture1 [[texture(6)]],
texture2d_array<half, access::write> outTexture2 [[texture(7)]],
texture2d_array<half, access::write> outTexture3 [[texture(8)]],
texture2d_array<half, access::write> outTexture4 [[texture(9)]],
ushort3 gid [[thread_position_in_grid]])
{`
No, you just need the one texture2d_array<half, access::read> inTexture [[texture(0)]]. Notice that its type is texture2d_array, not texture2d. When you read a pixel, you also specify the slice to read from: `const auto pixel = inTexture.read(gid.xy, gid.z);` where gid.z is the slice index.
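To make that concrete, here is a minimal sketch of a kernel with a single input and a single output texture array. It only copies pixels, and the kernel name copy_slices is made up for illustration. It also assumes the output has the same shape as the input and that the thread grid covers width x height x number of slices.

```metal
#include <metal_stdlib>
using namespace metal;

// Hypothetical pass-through kernel: copies every pixel of every slice.
// Assumes input and output have the same width, height, and slice count.
kernel void copy_slices(
    texture2d_array<half, access::read>  inTexture  [[texture(0)]],
    texture2d_array<half, access::write> outTexture [[texture(1)]],
    ushort3 gid [[thread_position_in_grid]])
{
    // The dispatched grid may be rounded up past the texture bounds,
    // so skip threads that fall outside.
    if (gid.x >= outTexture.get_width() ||
        gid.y >= outTexture.get_height() ||
        gid.z >= outTexture.get_array_size()) {
        return;
    }

    // gid.xy selects the pixel, gid.z selects the slice (4 channels each).
    const half4 pixel = inTexture.read(gid.xy, gid.z);
    outTexture.write(pixel, gid.xy, gid.z);
}
```

Each thread handles one pixel of one slice, which means 4 channels at a time. A real layer would replace the copy with its own computation on pixel.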
Thanks for your help
If the input is [1, 1, 128, 1024, 1024], then in texture2d_array<half, access::read> inTexture [[texture(0)]],
is inTexture of size [128, 1024, 1024], with x going up to 1024, y up to 1024, and z up to 128?
Almost. Z would go from 0 to 31: there are 128 / 4 = 32 slices, since you need to divide the number of channels by 4 (because each texture slice contains 4 channels).
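As a rough sketch of that arithmetic, the helpers below map a channel index to a slice and a component within the slice. The names slice_count and read_channel are hypothetical. The sketch assumes channels are packed in order, so that slice k holds channels 4k to 4k+3, and that the slice count is rounded up when the channel count is not a multiple of 4 (so a 3-channel input would occupy a single slice).

```metal
#include <metal_stdlib>
using namespace metal;

// Number of slices needed for a given channel count, rounded up.
// 128 channels -> 32 slices; 3 channels -> 1 slice.
inline ushort slice_count(ushort channels) {
    return ushort((channels + 3) / 4);
}

// Reads the value of one channel c at pixel position pos.
// Assumption: channel c is stored in slice c / 4, component c % 4.
inline half read_channel(texture2d_array<half, access::read> tex,
                         ushort2 pos, ushort c) {
    const half4 pixel = tex.read(pos, ushort(c / 4));
    return pixel[c % 4];
}
```

For example, under these assumptions channel 5 of the 128-channel input would be the second component (index 1) of slice 1.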