ORN
the meaning of nOrientation and nRotation
According to my understanding, nOrientation is the dimension of each point (unit) in the feature map or kernel, and nRotation is the number of rotated copies of the kernel. Points in the feature map or kernel are not always scalars; they can be vectors, or N-dimensional points as your paper says.
So ORConv2d(1,10,arf_config=(1,8), kernel_size=3) means the input has 1 channel whose points are scalars, and the conv kernel has 1 input channel, 10 output channels, and 8 rotated copies, with scalar points as well. ORConv2d(10,20,arf_config=8, kernel_size=3) means the input has 10 channels whose points are 8-dim vectors, and the conv kernel has 10 input channels, 20 output channels, and 8 rotated copies, with 8-dim vector points as well.
In a word, nOrientation is the dimension of the points in the input, while nRotation is the dimension of the points in the output.
Is that right?
Yes. For an ORConv operation, the size of the inputs is [nBatch x nInputChannel x nOrientation x H x W], the size of the ARFs is [nOutputChannel x nInputChannel x nOrientation x kH x kW], and the size of the outputs is [nBatch x nOutputChannel x nRotation x H x W].
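To make the shapes above concrete, here is a small plain-Python sketch that just does the bookkeeping for the two layers from the question. It does not call the ORN library. The helper name `orconv_shapes`, the batch size of 4, and the 28 x 28 spatial size are made up for illustration, and reading `arf_config=8` as nOrientation = nRotation = 8 is an assumption taken from the question.

```python
def orconv_shapes(n_batch, n_in, n_out, n_orientation, n_rotation, h, w, k):
    """Return (input, ARF, output) shapes as listed in the reply above.

    Hypothetical helper: it assumes the spatial size H x W is preserved,
    as the output shape in the reply suggests.
    """
    inputs = (n_batch, n_in, n_orientation, h, w)
    arfs = (n_out, n_in, n_orientation, k, k)
    outputs = (n_batch, n_out, n_rotation, h, w)
    return inputs, arfs, outputs


# ORConv2d(1, 10, arf_config=(1, 8), kernel_size=3):
# scalar input points (nOrientation=1), 8 rotated copies (nRotation=8).
first = orconv_shapes(4, 1, 10, 1, 8, 28, 28, 3)

# ORConv2d(10, 20, arf_config=8, kernel_size=3):
# assumed here to mean nOrientation=8 and nRotation=8.
second = orconv_shapes(4, 10, 20, 8, 8, 28, 28, 3)

for layer_name, shapes in (("first", first), ("second", second)):
    for tensor_name, shape in zip(("input", "ARF", "output"), shapes):
        print(layer_name, tensor_name, shape)
```

Under these assumptions the output of the first layer has the same shape as the input of the second, which is why the nRotation of one layer has to match the nOrientation of the next.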
Okay, thanks. By the way, something in the paper confuses me.
- Orientation spin. How is an N-dim point (vector) spun? What is the rotation axis? (What is the definition of α in F'θ,pq(α) in the paper?)
- The paper says F'θ,pq is a sample of the function F'θ,pq(α), and that F'θ,pq(α) is a periodic function. But what do the other N-1 F'θ,pq(x) mean? They don't seem to be a P_dst rotated from some P_src.
@ljhandlwt ,
In ORN, feature maps and filters (ARFs) are vector fields that explicitly encode orientation information. Coordinate Rotation and Orientation Spin are steps in rotating a vector field. Here is a simple illustration:
For more details, please check Sec 3.1 in the paper.
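As a rough sketch of the two steps, consider the special case of N = 4 orientation channels and a 90 degree rotation, where no interpolation is needed. This is plain Python for illustration, not code from the ORN repository. The function names, the counter-clockwise convention, and the direction of the channel shift are all assumptions. Other angles would need interpolation, which is omitted here.

```python
def rotate_coords_90(plane):
    """Coordinate rotation: turn a square 2D grid 90 degrees counter-clockwise."""
    n = len(plane)
    return [[plane[c][n - 1 - r] for c in range(n)] for r in range(n)]


def spin_orientations(field, steps):
    """Orientation spin: cyclically shift the orientation channels by `steps`.

    Channel i of the input becomes channel (i + steps) % N of the output.
    """
    n = len(field)
    return [field[(i - steps) % n] for i in range(n)]


def rotate_field_90(field):
    """Rotate an [N=4, k, k] vector field by 90 degrees (assumed convention)."""
    moved = [rotate_coords_90(plane) for plane in field]  # move the points
    return spin_orientations(moved, 1)                    # turn the arrows


# A 4-orientation field on a 3x3 grid with a single arrow:
# orientation channel 0 at the top-right cell.
field = [[[0.0] * 3 for _ in range(3)] for _ in range(4)]
field[0][0][2] = 1.0

rotated = rotate_field_90(field)

# Four quarter turns should bring the field back to where it started.
full_turn = field
for _ in range(4):
    full_turn = rotate_field_90(full_turn)
```

In this toy case the arrow moves from the top-right cell to the top-left cell (coordinate rotation) and from orientation channel 0 to channel 1 (orientation spin). Doing only the first step would move the arrow without turning it.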
@ZhouYanzhao ,
According to Sec 3.1 and Fig. 2, an ARF has a shape of [W,W,N].
So an arrow in the ARF of Fig. 2, or in your picture, represents a number. Is that right?
@ljhandlwt ARFs are viewed as N-directional points on a grid (vector fields). For each arrow, the length represents its value (a number), and the angle indicates the corresponding orientation channel.
@ZhouYanzhao Oh, I think I finally understand. An "N-directional point" is the activation value at (x,y), isn't it? The activation value is a scalar for canonical conv filters, but here it is an N-dimensional vector. Am I right?