tkDNN
Problem in reorg layer
With this network configuration, the output of the reorg layer under both cuDNN and TensorRT differs from the output exported from darknet running on the CPU:
== OUTPUT 28 CHECK RESULTS ==
CUDNN vs correct
| [ 53 ]: -0.0713108 0.714457
| [ 54 ]: -0.0889071 1.41847
| [ 55 ]: -0.0889071 1.41847
| [ 56 ]: -0.0889071 1.41847
| [ 57 ]: -0.0889071 1.41847
| [ 58 ]: -0.0889071 1.41847
| [ 59 ]: -0.0889071 1.41847
| [ 60 ]: -0.0889071 1.41847
| [ 61 ]: -0.0889071 1.41847
| [ 62 ]: -0.0889071 1.41847
| [ 63 ]: -0.0889071 1.41847
| [ 64 ]: -0.0889071 1.41847
| [ 65 ]: -0.0889071 1.41847
| [ 66 ]: -0.0889071 1.41847
| [ 67 ]: -0.0889071 1.41847
| [ 68 ]: -0.0889071 1.41847
| [ 69 ]: -0.0889071 1.41847
| [ 70 ]: -0.0889071 1.41847
| [ 71 ]: -0.0889071 1.41847
| [ 72 ]: -0.0889071 1.41847
| [ 73 ]: -0.0889071 1.41847
| [ 74 ]: -0.0889071 1.41847
| [ 75 ]: -0.0889071 1.41847
| [ 76 ]: -0.0889071 1.41847
| [ 77 ]: -0.0889071 1.41847
| [ 78 ]: -0.0889071 1.41847
| [ 79 ]: -0.0889071 1.41847
| [ 80 ]: -0.0889071 1.41847
| [ 81 ]: -0.0889071 1.41847
| [ 82 ]: -0.0889071 1.41847
| [ 83 ]: -0.0889071 1.41847
| [ 84 ]: -0.0889071 1.41847
| [ 85 ]: -0.0889071 1.41847
| [ 86 ]: -0.0889071 1.41847
| [ 87 ]: -0.0889071 1.41847
| [ 88 ]: -0.0889071 1.41847
| [ 89 ]: -0.0889071 1.41847
| [ 90 ]: -0.0889071 1.41847
| [ 91 ]: -0.0889071 1.41847
| [ 92 ]: -0.0889071 1.41847
| [ 93 ]: -0.0889071 1.41847
| [ 94 ]: -0.0889071 1.41847
| [ 95 ]: -0.0889071 1.41847
| [ 96 ]: -0.0889071 1.41847
| [ 97 ]: -0.0889071 1.41847
| [ 98 ]: -0.0889071 1.41847
| [ 99 ]: -0.0889071 1.41847
| [ 100 ]: -0.0889071 1.41847
| [ 101 ]: -0.0889071 1.41847
| [ 102 ]: -0.0889071 1.41847
| [ 103 ]: -0.27399 -0.0770894
| [ 260 ]: -0.18731 -0.12787
| [ 264 ]: -0.0292956 0.0225434
| [ 268 ]: -0.0292956 0.0897532
| [ 273 ]: -0.0292956 0.0482006
| [ 274 ]: -0.0292956 0.136003
| [ 275 ]: -0.0292956 0.102949
| [ 280 ]: -0.0292956 0.0694339
| [ 282 ]: -0.0292956 0.0455105
| [ 283 ]: -0.0292956 0.209425
| [ 292 ]: -0.0292956 0.135097
| [ 300 ]: -0.0292956 0.0492267
| [ 302 ]: -0.0292956 0.0348723
| [ 304 ]: -0.0292956 0.0715765
| [ 307 ]: -0.0292956 0.0477364
| [ 308 ]: -0.0292956 0.0468393
| [ 311 ]: -0.145768 0.551812
| [ 364 ]: -0.032352 0.33807
| [ 365 ]: -0.17146 -0.0114406
| [ 366 ]: -0.37652 -0.131791
| [ 368 ]: -0.182342 -0.0839855
| [ 369 ]: -0.252877 -0.14398
| [ 370 ]: -0.314679 -0.389765
| [ 371 ]: -0.513241 -0.449211
| [ 372 ]: -0.565379 -0.211569
| [ 373 ]: -0.181062 -0.368287
| [ 374 ]: -0.300863 -0.0671517
| [ 376 ]: -0.398152 -0.0442645
| [ 377 ]: -0.416887 -0.143037
| [ 378 ]: -0.38611 -0.1623
| [ 379 ]: -0.347269 -0.0211162
| [ 381 ]: -0.484706 -0.14062
| [ 384 ]: -0.492647 -0.392213
| [ 385 ]: -0.161193 -0.0696261
| [ 386 ]: -0.51637 -0.0814965
| [ 387 ]: -0.491412 -0.106202
| [ 388 ]: -0.335896 -0.194181
| [ 390 ]: -0.289178 -0.181675
| [ 393 ]: -0.256528 -0.0414582
| [ 394 ]: -0.7003 -0.200466
| [ 395 ]: -0.517625 -0.145905
| [ 396 ]: -0.320515 1.22699
| [ 397 ]: -0.41259 -0.301313
| [ 398 ]: -0.362108 -0.106931
| [ 399 ]: -0.218155 -0.16281
| [ 400 ]: -0.20313 -0.152428
| [ 402 ]: -0.414642 -0.0900799
| [ 403 ]: -0.607981 -0.475988
| [ 404 ]: -0.548641 -0.11561
| Wrongs: 57030 ~0.05
I have checked the code and noticed that it is exactly the code from the darknet project for the reorg layer on GPU. Could it be a bug in the darknet code, or is there something I am missing?
It should be noted that I have tested the provided network in darknet on both CPU and GPU many times, and the final results are similar. However, I have not compared every element of the reorg layer's output between CPU and GPU the way tkDNN does.
P.S. I used my fork of this repo, which is mentioned in #47.
Are you sure the error is in the reorg? If you end the network right before the reorg, is the result correct? And right after? Are the input and output dimensions of the reorg the same as in darknet?
I used the debug folder and checked the output of all the layers. Everything is correct until the reorg layer.
This output matches darknet's, so the dimensions must be correct:
====================== NETWORK MODEL ======================
N. Layer type input (H*W,CH) output (H*W,CH)
0 Conv2d 416 x 416, 1 -> 416 x 416, 16
1 ActivationLeaky 416 x 416, 16 -> 416 x 416, 16
2 Conv2d 416 x 416, 16 -> 208 x 208, 32
3 ActivationLeaky 208 x 208, 32 -> 208 x 208, 32
4 Conv2d 208 x 208, 32 -> 208 x 208, 16
5 ActivationLeaky 208 x 208, 16 -> 208 x 208, 16
6 Conv2d 208 x 208, 16 -> 208 x 208, 32
7 ActivationLeaky 208 x 208, 32 -> 208 x 208, 32
8 Conv2d 208 x 208, 32 -> 104 x 104, 64
9 ActivationLeaky 104 x 104, 64 -> 104 x 104, 64
10 Conv2d 104 x 104, 64 -> 104 x 104, 32
11 ActivationLeaky 104 x 104, 32 -> 104 x 104, 32
12 Conv2d 104 x 104, 32 -> 104 x 104, 64
13 ActivationLeaky 104 x 104, 64 -> 104 x 104, 64
14 Conv2d 104 x 104, 64 -> 104 x 104, 32
15 ActivationLeaky 104 x 104, 32 -> 104 x 104, 32
16 Conv2d 104 x 104, 32 -> 104 x 104, 64
17 ActivationLeaky 104 x 104, 64 -> 104 x 104, 64
18 Conv2d 104 x 104, 64 -> 104 x 104, 32
19 ActivationLeaky 104 x 104, 32 -> 104 x 104, 32
20 Conv2d 104 x 104, 32 -> 104 x 104, 64
21 ActivationLeaky 104 x 104, 64 -> 104 x 104, 64
22 Conv2d 104 x 104, 64 -> 104 x 104, 32
23 ActivationLeaky 104 x 104, 32 -> 104 x 104, 32
24 Route 104 x 104, 64 -> 104 x 104, 64
25 Conv2d 104 x 104, 64 -> 104 x 104, 32
26 ActivationLeaky 104 x 104, 32 -> 104 x 104, 32
27 Conv2d 104 x 104, 32 -> 104 x 104, 64
28 ActivationLeaky 104 x 104, 64 -> 104 x 104, 64
29 Conv2d 104 x 104, 64 -> 104 x 104, 32
30 ActivationLeaky 104 x 104, 32 -> 104 x 104, 32
31 Route 104 x 104, 64 -> 104 x 104, 64
32 Conv2d 104 x 104, 64 -> 104 x 104, 32
33 ActivationLeaky 104 x 104, 32 -> 104 x 104, 32
34 Route 104 x 104, 64 -> 104 x 104, 64
35 Conv2d 104 x 104, 64 -> 104 x 104, 32
36 ActivationLeaky 104 x 104, 32 -> 104 x 104, 32
37 Route 104 x 104, 64 -> 104 x 104, 64
38 Conv2d 104 x 104, 64 -> 52 x 52, 128
39 ActivationLeaky 52 x 52, 128 -> 52 x 52, 128
40 Conv2d 52 x 52, 128 -> 52 x 52, 64
41 ActivationLeaky 52 x 52, 64 -> 52 x 52, 64
42 Conv2d 52 x 52, 64 -> 52 x 52, 128
43 ActivationLeaky 52 x 52, 128 -> 52 x 52, 128
44 Conv2d 52 x 52, 128 -> 52 x 52, 64
45 ActivationLeaky 52 x 52, 64 -> 52 x 52, 64
46 Conv2d 52 x 52, 64 -> 52 x 52, 128
47 ActivationLeaky 52 x 52, 128 -> 52 x 52, 128
48 Route 104 x 104, 64 -> 104 x 104, 64
49 Conv2d 104 x 104, 64 -> 104 x 104, 16
50 ActivationLeaky 104 x 104, 16 -> 104 x 104, 16
51 Reorg 104 x 104, 16 -> 52 x 52, 64
52 Route 52 x 52, 192 -> 52 x 52, 192
53 Conv2d 52 x 52, 192 -> 52 x 52, 128
54 ActivationLeaky 52 x 52, 128 -> 52 x 52, 128
55 Conv2d 52 x 52, 128 -> 52 x 52, 256
56 ActivationLeaky 52 x 52, 256 -> 52 x 52, 256
57 Conv2d 52 x 52, 256 -> 52 x 52, 18
58 Yolo 52 x 52, 18 -> 52 x 52, 18
59 Route 208 x 208, 32 -> 208 x 208, 32
60 Pooling 208 x 208, 32 -> 104 x 104, 32
61 Conv2d 104 x 104, 32 -> 104 x 104, 64
62 ActivationLeaky 104 x 104, 64 -> 104 x 104, 64
63 Pooling 104 x 104, 64 -> 52 x 52, 64
64 Conv2d 52 x 52, 64 -> 52 x 52, 128
65 ActivationLeaky 52 x 52, 128 -> 52 x 52, 128
66 Pooling 52 x 52, 128 -> 26 x 26, 128
67 Conv2d 26 x 26, 128 -> 26 x 26, 256
68 ActivationLeaky 26 x 26, 256 -> 26 x 26, 256
69 Pooling 26 x 26, 256 -> 13 x 13, 256
70 Conv2d 13 x 13, 256 -> 13 x 13, 512
71 ActivationLeaky 13 x 13, 512 -> 13 x 13, 512
72 Pooling 13 x 13, 512 -> 13 x 13, 512
73 Conv2d 13 x 13, 512 -> 13 x 13, 18
74 Yolo 13 x 13, 18 -> 13 x 13, 18
===========================================================
I found the problem. The input arguments of reorg_kernel() are the output dimensions of the result. Therefore the arguments of reorgForward() are incorrect and should be changed like this:
reorgForward(srcData, dstData, output_dim.n, output_dim.c, output_dim.h, output_dim.w, stride);
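To see why, here is a minimal single-batch CPU sketch of the space-to-depth mapping (my own reconstruction of the darknet-style indexing for illustration, not the actual tkDNN kernel). Every loop bound is an output dimension, which is exactly why the wrapper must be called with output_dim:

// Minimal single-batch CPU sketch of the space-to-depth reorg, NCHW layout.
// Reconstruction for illustration; not the actual tkDNN/darknet kernel.
void reorgSketch(const float* src, float* dst,
                 int out_c, int out_h, int out_w, int stride) {
    int in_c = out_c / (stride * stride);   // e.g. 64 / 4 = 16
    int in_h = out_h * stride;              // e.g. 52 * 2 = 104
    int in_w = out_w * stride;              // e.g. 52 * 2 = 104
    for (int k = 0; k < out_c; ++k) {
        int c2 = k % in_c;                  // source channel
        int offset = k / in_c;              // which of the stride*stride cells
        for (int j = 0; j < out_h; ++j) {
            for (int i = 0; i < out_w; ++i) {
                int h2 = j * stride + offset / stride;
                int w2 = i * stride + offset % stride;
                dst[i + out_w * (j + out_h * k)] =
                    src[w2 + in_w * (h2 + in_h * c2)];
            }
        }
    }
}

With the tensor from layer 51 this maps (16, 104, 104) to (64, 52, 52), matching the network model above.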
Similarly, ReorgRT::configure() needs to be changed:
void configure(const Dims* inputDims, int nbInputs, const Dims* outputDims, int nbOutputs, int maxBatchSize) override {
    c = outputDims[0].d[0];
    h = outputDims[0].d[1];
    w = outputDims[0].d[2];
}
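For context, a stride-s reorg ties the two sides together: c_out = c_in * s^2, h_out = h_in / s, w_out = w_in / s (layer 51 above: 16 * 4 = 64 channels, 104 / 2 = 52). Assuming ReorgRT keeps a stride member, a hypothetical equivalent would be to derive the same values from the input side:

// Hypothetical equivalent, assuming a `stride` member on ReorgRT:
// derive the kernel dimensions from the input side instead.
void configure(const Dims* inputDims, int nbInputs, const Dims* outputDims, int nbOutputs, int maxBatchSize) override {
    c = inputDims[0].d[0] * stride * stride;  // 16 * 4  -> 64
    h = inputDims[0].d[1] / stride;           // 104 / 2 -> 52
    w = inputDims[0].d[2] / stride;           // 104 / 2 -> 52
}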
Pull request #47 now contains this fix.
Reorg is a layer from the old YOLO versions; at that time I was referring to the original darknet implementation, not Alexey's. The original reorg layer uses the input dimensions: https://github.com/pjreddie/darknet/blob/master/src/reorg_layer.c
In Alexey's implementation this layer is called reorg_old_layer: https://github.com/AlexeyAB/darknet/blob/master/src/reorg_old_layer.c
I have to check whether this modification also works for the old YOLOs, and then I will approve your changes. Anyway, thank you; I was missing this layer switch.
You are right, I hadn't noticed this. Actually, my configuration contains a reorg3d layer; I added reorg3d just as an alias for the reorg layer, which is why the problem occurred. My fix will definitely affect older YOLO versions. Maybe add a separate reorg3d layer like Alexey?
Or we can add an option to the reorg, since the code is pretty much the same.
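For example (a rough sketch only; the flag name and the Layer interface details are assumptions, not tkDNN's actual API):

// Rough sketch of the suggested option: one reorg layer with a flag that
// selects which dimensions are handed to the kernel. All names assumed.
class Reorg : public Layer {
public:
    Reorg(Network* net, int stride, bool as_reorg3d = false)
        : Layer(net), stride(stride), as_reorg3d(as_reorg3d) {}

    dnnType* infer(dataDim_t& dim, dnnType* srcData) override {
        if (as_reorg3d)   // reorg3d behaviour: kernel gets the output dims
            reorgForward(srcData, dstData, output_dim.n, output_dim.c,
                         output_dim.h, output_dim.w, stride);
        else              // old-YOLO behaviour: kernel gets the input dims
            reorgForward(srcData, dstData, input_dim.n, input_dim.c,
                         input_dim.h, input_dim.w, stride);
        dim = output_dim;
        return dstData;
    }

    int stride;
    bool as_reorg3d;
};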
OK, I made the fix.