the meaning of nOrientation and nRotation

Open lychahaha opened this issue 6 years ago • 6 comments

According to my understanding, nOrientation is the dimension of each point (unit) in the feature map or kernel, and nRotation is the number of rotated copies of the kernel. Points in the feature map or kernel are not always scalars; they can be vectors, or n-dim points as your paper says.

So, ORConv2d(1,10,arf_config=(1,8), kernel_size=3) means the input has 1 channel whose points are scalars, and the conv kernel has 1 input channel, 10 output channels, and 8 rotated copies, with points that are also scalars. ORConv2d(10,20,arf_config=8, kernel_size=3) means the input has 10 channels whose points are 8-dim vectors, and the conv kernel has 10 input channels, 20 output channels, and 8 rotated copies, with points that are also 8-dim vectors.

In a word, nOrientation is the dimension of points in the input, while nRotation is the dimension of points in the output.

Is that right?

lychahaha avatar Apr 21 '18 07:04 lychahaha

Yes. For an ORConv operation, the size of inputs is [nBatch x nInputChannel x nOrientation x H x W], the size of ARFs is [nOutputChannel x nInputChannel x nOrientation x kH x kW], and the size of outputs is [nBatch x nOutputChannel x nRotation x H x W].

ZhouYanzhao avatar Apr 21 '18 12:04 ZhouYanzhao
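
The shapes listed in the reply above can be written out as a small bookkeeping sketch. This is illustrative only and does not call the ORN library: `orconv_shapes` is a hypothetical helper, the batch size and 32 x 32 spatial size are made-up values, a square kernel is assumed, and the output keeping the input's H x W assumes "same" padding. Reading `arf_config=8` as nOrientation = nRotation = 8 follows the interpretation in the question.

```python
from typing import Tuple

Shape = Tuple[int, ...]


def orconv_shapes(n_batch: int, n_in: int, n_out: int,
                  n_orientation: int, n_rotation: int,
                  h: int, w: int, k: int) -> Tuple[Shape, Shape, Shape]:
    """Return (input, ARF, output) shapes for one ORConv operation.

    Hypothetical helper: it only restates the shapes from the thread.
    """
    inputs = (n_batch, n_in, n_orientation, h, w)
    arfs = (n_out, n_in, n_orientation, k, k)  # square kernel assumed
    # Assumption: padding keeps the spatial size H x W unchanged.
    outputs = (n_batch, n_out, n_rotation, h, w)
    return inputs, arfs, outputs


# First layer from the question: arf_config=(1, 8), scalar input points.
print(orconv_shapes(4, 1, 10, 1, 8, 32, 32, 3))

# Second layer: arf_config=8, read here as nOrientation = nRotation = 8.
print(orconv_shapes(4, 10, 20, 8, 8, 32, 32, 3))
```

Note that the nRotation axis of the first layer's output (8) lines up with the nOrientation axis of the second layer's input, which is the point made in the question.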

Okay, thanks. By the way, something in the paper confuses me.

  1. Orientation spin. How is an n-dim point (vector) spun? What is the rotation axis? (What is the definition of α in F'θpq(α) in the paper?)
  2. The paper says F'θ,pq is a sample of the function F'θ,pq(α), and F'θ,pq(α) is a periodic function. But what do the remaining N-1 values F'θ,pq(x) mean? They don't seem to be some P_dst rotated from some P_src.

lychahaha avatar Apr 21 '18 13:04 lychahaha

@ljhandlwt, in ORN, feature maps and filters (ARFs) are vector fields that explicitly encode orientation information. Coordinate Rotation and Orientation Spin are the two steps in rotating a vector field. Here is a simple illustration: [image: how to rotate ARFs]. For more details, please check Sec 3.1 in the paper.

ZhouYanzhao avatar Apr 24 '18 02:04 ZhouYanzhao
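
The two steps named in the reply above can be sketched for the simplest case, N = 4 orientation channels and rotations by multiples of 90 degrees. This is a minimal NumPy illustration under stated assumptions, not the ORN implementation: `rotate_arf_90` is a hypothetical name, the rotation direction and the sign of the channel shift are conventions chosen here, and other values of N or other angles would need interpolation that this sketch leaves out.

```python
import numpy as np


def rotate_arf_90(arf: np.ndarray, steps: int) -> np.ndarray:
    """Rotate an ARF of shape [N=4, k, k] by steps * 90 degrees.

    Hypothetical helper; conventions (direction, shift sign) are assumed.
    """
    if arf.ndim != 3 or arf.shape[0] != 4 or arf.shape[1] != arf.shape[2]:
        raise ValueError("expected an ARF of shape [4, k, k]")
    # Coordinate rotation: move each point to its rotated grid position.
    moved = np.rot90(arf, k=steps, axes=(1, 2))
    # Orientation spin: cyclically shift the orientation channels so each
    # "arrow" also turns by the same angle.
    return np.roll(moved, shift=steps, axis=0)


# A made-up 3x3 ARF with 4 orientation channels.
arf = np.arange(4 * 3 * 3, dtype=np.float32).reshape(4, 3, 3)

# Stack the rotated copies: shape [nRotation=4, nOrientation=4, 3, 3].
rotated_copies = np.stack([rotate_arf_90(arf, s) for s in range(4)])
print(rotated_copies.shape)
```

In this restricted case a full turn (four steps) returns the original filter, and two single steps equal one double step, which is what one would expect from a rotation of a vector field.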

@ZhouYanzhao, according to Sec 3.1 and Fig 2, an ARF has a shape of [W,W,N]. So an arrow in the ARF of Fig 2, or in your picture, represents a single number. Is that right?

lychahaha avatar Apr 24 '18 03:04 lychahaha

@ljhandlwt ARFs are viewed as N-directional points on a grid (vector fields). For each arrow, the length represents its value (a number), and the angle indicates the corresponding orientation channel.

ZhouYanzhao avatar Apr 25 '18 07:04 ZhouYanzhao
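
The arrow picture in the reply above can be made concrete with a short sketch. It assumes, as an illustration only, that the N orientation channels are evenly spaced so that channel k corresponds to the angle 2πk/N; `point_to_arrows` and the sample values are hypothetical.

```python
import math
from typing import List, Tuple


def point_to_arrows(values: List[float]) -> List[Tuple[float, float]]:
    """Map one N-directional point to (angle_in_radians, value) pairs.

    Assumption: channel k points at angle 2*pi*k/N.
    """
    n = len(values)
    return [(2.0 * math.pi * k / n, v) for k, v in enumerate(values)]


# One grid location of an ARF with N = 4 orientation channels (made-up values).
arrows = point_to_arrows([0.5, 0.0, -1.0, 2.0])
for angle, value in arrows:
    print(f"angle = {math.degrees(angle):.0f} deg, value = {value}")
```

Each grid location therefore holds N numbers, one per arrow, rather than a single scalar.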

@ZhouYanzhao Oh, I think I finally understand what you mean. "N-directional points" refers to the activation value at (x,y), doesn't it? The activation value is a scalar for canonical conv filters, but here it is an N-dimensional vector. Am I right?

askerlee avatar Jun 17 '19 12:06 askerlee