The statement about the anchor box width on line 33 needs to be revised.

Open ClancyCC opened this issue 2 years ago • 2 comments

In short, the width should be $hs\sqrt{r}$, not $ws\sqrt{r}$.

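I suggest looking at the code. The multibox_prior function defines $w=s\sqrt{r}\frac{h}{w}$ and $h=s/\sqrt{r}$, and when show_bboxes is actually called, the boxes are multiplied by bbox_scale=torch.tensor((w, h, w, h)). In other words, the width of the generated anchor box is $s\sqrt{r}\frac{h}{w}w=hs\sqrt{r}$ and its height is $hs/\sqrt{r}$, so the actual width-to-height ratio of the anchor box is the quotient of the two, namely $r$.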
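The arithmetic above can be checked with a short sketch. It uses only the two formulas quoted from multibox_prior and the multiplication by the image width and height; the helper name `anchor_pixel_size`, the size `s=0.75`, and the sample image dimensions are hypothetical, chosen only for illustration.

```python
import math

def anchor_pixel_size(img_h, img_w, s, r):
    # Hypothetical helper: normalized width and height as quoted in this issue.
    w_norm = s * math.sqrt(r) * img_h / img_w
    h_norm = s / math.sqrt(r)
    # Multiplying by the image width and height (as bbox_scale does) gives pixels.
    return w_norm * img_w, h_norm * img_h

# Arbitrary image sizes, to show the result does not depend on the image width.
for img_h, img_w in [(400, 600), (300, 1000)]:
    for r in (1, 2, 0.5):
        width, height = anchor_pixel_size(img_h, img_w, s=0.75, r=r)
        # The pixel width equals h * s * sqrt(r), so width / height equals r.
        assert math.isclose(width, img_h * 0.75 * math.sqrt(r))
        assert math.isclose(width / height, r)
        print(img_h, img_w, r, round(width, 2), round(height, 2))
```

The image width cancels out of the pixel width, which is why the image height $h$, not the width $w$, appears in $hs\sqrt{r}$.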

I hope this helps everyone understand! This section is quite difficult, and a mistake here can seriously hinder readers' understanding.

ClancyCC • Aug 12 '22 07:08

Job d2l-zh/PR-1190/1 is complete. Check the results at http://preview.d2l.ai/d2l-zh/PR-1190/

d2l-bot • Aug 12 '22 07:08

Job d2l-zh/PR-1190/2 is complete. Check the results at http://preview.d2l.ai/d2l-zh/PR-1190/

d2l-bot • Aug 12 '22 07:08

Thanks. As a side note:

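# img and boxes are assumed to come from the section's example:
# boxes is the multibox_prior output reshaped to (h, w, 5, 4).
h, w = img.shape[:2]

for i in range(5):
    # Normalized corner coordinates of the i-th anchor box centered at pixel (250, 250)
    t = boxes[250, 250, i, :]
    upleft_x, upleft_y, loright_x, loright_y = t[0], t[1], t[2], t[3]
    # Scale back to pixels before comparing width and height
    width = (loright_x - upleft_x) * w
    height = (loright_y - upleft_y) * h
    print(width, height, width / height)

# Ratios used to generate the boxes, listed for reference
ratios = [1, 2, 0.5]
print(h, w)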

output

tensor(420.75) tensor(420.75) tensor(1.)
tensor(280.50) tensor(280.50) tensor(1.)
tensor(140.25) tensor(140.25) tensor(1.00)
tensor(595.03) tensor(297.52) tensor(2.)
tensor(297.52) tensor(595.03) tensor(0.50)
561 728
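
As a rough cross-check, the printed widths can be compared with both candidate formulas. This sketch assumes the anchor sizes were 0.75, 0.5, and 0.25 (the thread lists only the ratios, so the sizes are an assumption), and the variable names are illustrative.

```python
import math

h, w = 561, 728  # image height and width printed above

# (size, ratio, printed width); the sizes are assumed, not stated in the thread
rows = [
    (0.75, 1, 420.75),
    (0.5, 1, 280.50),
    (0.25, 1, 140.25),
    (0.75, 2, 595.03),
    (0.75, 0.5, 297.52),
]

for s, r, printed in rows:
    by_height = h * s * math.sqrt(r)  # proposed correction: h * s * sqrt(r)
    by_width = w * s * math.sqrt(r)   # original statement: w * s * sqrt(r)
    print(f"printed={printed:.2f}  h*s*sqrt(r)={by_height:.2f}  w*s*sqrt(r)={by_width:.2f}")
```

Under that assumption, each printed width agrees with $hs\sqrt{r}$ to two decimal places, while $ws\sqrt{r}$ is larger by the factor $w/h$.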

astonzhang • Dec 04 '22 06:12