darknet-nnpack
darknet-nnpack compiles but does not run as expected: no results on Tiny YOLO
Running the YOLO example gives these results:
./darknet detector test cfg/coco.data cfg/yolo.cfg yolo.weights data/person.jpg
layer filters size input output
0 conv 32 3 x 3 / 1 608 x 608 x 3 -> 608 x 608 x 32
1 max 2 x 2 / 2 608 x 608 x 32 -> 304 x 304 x 32
2 conv 64 3 x 3 / 1 304 x 304 x 32 -> 304 x 304 x 64
3 max 2 x 2 / 2 304 x 304 x 64 -> 152 x 152 x 64
4 conv 128 3 x 3 / 1 152 x 152 x 64 -> 152 x 152 x 128
5 conv 64 1 x 1 / 1 152 x 152 x 128 -> 152 x 152 x 64
6 conv 128 3 x 3 / 1 152 x 152 x 64 -> 152 x 152 x 128
7 max 2 x 2 / 2 152 x 152 x 128 -> 76 x 76 x 128
8 conv 256 3 x 3 / 1 76 x 76 x 128 -> 76 x 76 x 256
9 conv 128 1 x 1 / 1 76 x 76 x 256 -> 76 x 76 x 128
10 conv 256 3 x 3 / 1 76 x 76 x 128 -> 76 x 76 x 256
11 max 2 x 2 / 2 76 x 76 x 256 -> 38 x 38 x 256
12 conv 512 3 x 3 / 1 38 x 38 x 256 -> 38 x 38 x 512
13 conv 256 1 x 1 / 1 38 x 38 x 512 -> 38 x 38 x 256
14 conv 512 3 x 3 / 1 38 x 38 x 256 -> 38 x 38 x 512
15 conv 256 1 x 1 / 1 38 x 38 x 512 -> 38 x 38 x 256
16 conv 512 3 x 3 / 1 38 x 38 x 256 -> 38 x 38 x 512
17 max 2 x 2 / 2 38 x 38 x 512 -> 19 x 19 x 512
18 conv 1024 3 x 3 / 1 19 x 19 x 512 -> 19 x 19 x1024
19 conv 512 1 x 1 / 1 19 x 19 x1024 -> 19 x 19 x 512
20 conv 1024 3 x 3 / 1 19 x 19 x 512 -> 19 x 19 x1024
21 conv 512 1 x 1 / 1 19 x 19 x1024 -> 19 x 19 x 512
22 conv 1024 3 x 3 / 1 19 x 19 x 512 -> 19 x 19 x1024
23 conv 1024 3 x 3 / 1 19 x 19 x1024 -> 19 x 19 x1024
24 conv 1024 3 x 3 / 1 19 x 19 x1024 -> 19 x 19 x1024
25 route 16
26 conv 64 1 x 1 / 1 38 x 38 x 512 -> 38 x 38 x 64
27 reorg / 2 38 x 38 x 64 -> 19 x 19 x 256
28 route 27 24
29 conv 1024 3 x 3 / 1 19 x 19 x1280 -> 19 x 19 x1024
30 conv 425 1 x 1 / 1 19 x 19 x1024 -> 19 x 19 x 425
31 detection
Loading weights from yolo.weights...Done!
data/person.jpg: Predicted in 15409 ms.
person: 86%
horse: 82%
dog: 86%
The log says about 15.4 seconds, but I'm running this on an RPi 3, where it should technically run in 8.2 seconds; in practice the run takes 154 seconds.
The bigger problem comes when I run Tiny YOLO on NNPACK: it outputs no detections at all, no matter which image I use, and it is also slow.
./darknet detector test cfg/voc.data cfg/tiny-yolo.cfg tiny-yolo.weights data/person.jpg
layer filters size input output
0 conv 16 3 x 3 / 1 416 x 416 x 3 -> 416 x 416 x 16
1 max 2 x 2 / 2 416 x 416 x 16 -> 208 x 208 x 16
2 conv 32 3 x 3 / 1 208 x 208 x 16 -> 208 x 208 x 32
3 max 2 x 2 / 2 208 x 208 x 32 -> 104 x 104 x 32
4 conv 64 3 x 3 / 1 104 x 104 x 32 -> 104 x 104 x 64
5 max 2 x 2 / 2 104 x 104 x 64 -> 52 x 52 x 64
6 conv 128 3 x 3 / 1 52 x 52 x 64 -> 52 x 52 x 128
7 max 2 x 2 / 2 52 x 52 x 128 -> 26 x 26 x 128
8 conv 256 3 x 3 / 1 26 x 26 x 128 -> 26 x 26 x 256
9 max 2 x 2 / 2 26 x 26 x 256 -> 13 x 13 x 256
10 conv 512 3 x 3 / 1 13 x 13 x 256 -> 13 x 13 x 512
11 max 2 x 2 / 1 13 x 13 x 512 -> 13 x 13 x 512
12 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024
13 conv 1024 3 x 3 / 1 13 x 13 x1024 -> 13 x 13 x1024
14 conv 425 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 425
15 detection
Loading weights from tiny-yolo.weights...Done!
data/person.jpg: Predicted in 1452 ms.
But running Tiny YOLO produces no detections.
I only tested Tiny YOLO with the COCO dataset. Tiny YOLO with a dataset that has a small number of classes cannot detect objects on the Raspberry Pi 3; I couldn't find the reason yet. But here is a tip: I used 4 classes (person, dog, cat, car) with a reduced Tiny YOLO. First, I reduced the number of feature maps by 2/3. Second, I used 1x1 convolutions from the 3rd layer to the last layer. My model takes 600 ms, and recall is about 78% on the VOC 2007 test dataset.
@thomaspark-pkj Thank you, that sounds like a good idea. Would it be possible for you to share the cfg and weights for the 4 classes (person, dog, cat, car) reduced tiny yolo?
@ashwinnair14 I can only share the cfg file. The trained weights file is not mine but my company's.
[net]
batch=1
subdivisions=1
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
max_batches = 120000
policy=steps
steps=-1,100,80000,100000
scales=.1,10,.1,.1

[convolutional]
batch_normalize=1
filters=12
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=24
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=12
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=48
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=24
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=96
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=48
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=192
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=96
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=384
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=1

[convolutional]
batch_normalize=1
filters=192
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=768
size=3
stride=1
pad=1
activation=leaky

###########

[convolutional]
batch_normalize=1
size=1
stride=1
pad=1
filters=192
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=768
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=36
activation=linear

[region]
anchors = 0.738768,0.874946, 2.42204,2.65704, 4.30971,7.04493, 10.246,4.59428, 12.6868,11.8741
bias_match=1
classes=4
coords=4
num=4
softmax=1
jitter=.2
rescore=1
object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1
absolute=1
thresh = .6
random=1
I used the COCO dataset.
- Convert COCO to PASCAL format (https://gist.github.com/chicham/6ed3842d0d2014987186).
- Modify voc_label.py to use the 4 classes.
- Run voc_label.py to generate the label files.
- Train the model.
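For context on the voc_label.py step above: the core of that script is converting PASCAL-style corner coordinates in pixels into YOLO's normalized center/size format. Here is a minimal sketch of that arithmetic in C (the stock script does the same math in Python; the function and type names here are my own, not darknet's):

```c
#include <stdio.h>

/* Normalized YOLO box: center (x, y) and size (w, h), all in [0, 1]. */
typedef struct { double x, y, w, h; } yolo_box;

/* Converts a VOC box (pixel corner coordinates) to YOLO label format,
 * mirroring the convert() function in darknet's scripts/voc_label.py. */
yolo_box voc_to_yolo(double xmin, double xmax, double ymin, double ymax,
                     double img_w, double img_h)
{
    yolo_box b;
    b.x = (xmin + xmax) / 2.0 / img_w;  /* normalized center x */
    b.y = (ymin + ymax) / 2.0 / img_h;  /* normalized center y */
    b.w = (xmax - xmin) / img_w;        /* normalized width    */
    b.h = (ymax - ymin) / img_h;        /* normalized height   */
    return b;
}
```

Each line of a generated label file is then `class_id x y w h`, where `class_id` indexes into the reduced 4-class list (person, dog, cat, car) instead of the full 20 VOC classes.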
I'm sorry I cannot give modified files. Good luck :)
@thomaspark-pkj I totally understand that :+1: Will try your suggestions thanks.
I had a similar problem. I think the problem is in darknet itself. Try the fork https://github.com/AlexeyAB/darknet
@ashwinnair14 Were you able to get Tiny YOLO working using NNPACK?
@sivagnanamn Nope. I trained my own Tiny YOLO networks on a PC and ported them to the Raspberry Pi with darknet-nnpack, and it does not give results. I guess I should train my network on the NNPACK version of darknet and try again. Let me know if you have any success. I did manage to make a smaller network for 4 classes that works reasonably well without NNPACK.
@digitalbrain79 I'm new to this topic. What tool can I use to train a custom network?
I use the original darknet on a server (https://github.com/pjreddie/darknet). All tests there (both the network downloaded from the official site and a new network trained on a single class) are successful. When I install the original darknet on the Raspberry Pi, the network downloaded from the official site works, but the new one trained on a single class does not produce any results. When I instead install the fork https://github.com/AlexeyAB/darknet on the Raspberry Pi, everything works, but that framework is not very fast.
@Rus-L Did you use https://github.com/AlexeyAB/darknet ? What commit are you using? I downloaded the last one and I have no detection, neither with a standard model (yolo-voc)..
@EnricoBeltramo I used the latest version of https://github.com/AlexeyAB/darknet. I checked as follows: ./darknet detector test cfg/voc.data tiny-yolo-voc.cfg tiny-yolo-voc.weights data/person.jpg and got a positive result. I also checked my own version of the network and likewise got a successful test.
Weight files have a version, specified by the 1st DWORD (counting from 0), of either 1 or 2. In a version 1 weight file, the trained-images count is stored as a 32-bit int; in a version 2 weight file, it is stored as a 64-bit int.
See src/parser.c: sizeof(size_t) varies by architecture. When compiling on 32-bit ARM it evaluates to 32 bits / 4 bytes, when it should evaluate to 64 bits / 8 bytes.
After fixing this bug, I have gotten successful detection predictions on a Pi 3 from tiny-yolo using both versions of weight files. ~~However, I cannot seem to get predictions from classifier networks.~~ Fixed by migrating changes from examples/detector.c to examples/classifier.c
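The fix described above can be sketched as follows. This is a paraphrase of the header-reading logic in src/parser.c (load_weights_upto), not the exact darknet code; the point is that the "images seen" counter must be read as a fixed-width uint64_t so the code behaves identically on 32-bit and 64-bit platforms:

```c
#include <stdio.h>
#include <stdint.h>

/* Reads the darknet weight-file header; returns 0 on success, -1 on error.
 * Header layout: three 32-bit ints (major, minor, revision), then the
 * "images seen" counter, whose width depends on the file version. */
int read_weight_header(FILE *fp, int *major, int *minor, int *revision,
                       uint64_t *seen)
{
    if (fread(major, sizeof(int), 1, fp) != 1) return -1;
    if (fread(minor, sizeof(int), 1, fp) != 1) return -1;
    if (fread(revision, sizeof(int), 1, fp) != 1) return -1;
    if ((*major * 10 + *minor) >= 2) {
        /* Version 2+: counter is stored as 64 bits. Reading it with
         * sizeof(size_t) pulls only 4 bytes on 32-bit ARM, leaving the
         * file position 4 bytes short and corrupting every weight read
         * afterwards. A fixed-width read avoids that. */
        if (fread(seen, sizeof(uint64_t), 1, fp) != 1) return -1;
    } else {
        /* Older versions: counter is stored as a 32-bit int. */
        uint32_t iseen = 0;
        if (fread(&iseen, sizeof(uint32_t), 1, fp) != 1) return -1;
        *seen = iseen;
    }
    return 0;
}
```

With this change, the same binary weight file parses to the same offsets on an x86-64 training machine and a 32-bit Raspberry Pi.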
@shizukachan and @Rus-L, can you tell me how to fix this issue? I am using pjreddie's newest code.
I made the same change as in AlexeyAB's code (changed sizeof(size_t) into sizeof(int)) and re-compiled, but nothing helps. Running the YOLO detector works, but running the Tiny YOLO detector gives no results.
Thanks!
Try sizeof(uint64_t)
@shizukachan Thank you, but it does not work. I replaced every sizeof(size_t) with sizeof(uint64_t) in src/parser.c.
Only change this line's size_t to uint64_t: https://github.com/pjreddie/darknet/blob/master/src/parser.c#L1114
@shizukachan and don't forget to re-make the project
Hello,
@jinyu121 did you get it to work? I am facing the same problem (tiny-yolo): the RasPi 3B runs it but doesn't give me any results. @shizukachan Sorry for the simple question, but how can I re-make the project?
@phil100vol Nope. I gave up~ /sad
Waiting for your solution~
@shizukachan your comment helps me a lot. thx!
@phil100vol I successfully ran YOLOv2 on a Raspberry Pi 3:
- just rm or mv the existing darknet-nnpack
- git clone https://github.com/thomaspark-pkj/darknet-nnpack.git
- cd darknet-nnpack/src
- vi parser.c
- apply the change from shizukachan/darknet-nnpack@fe963ca
- rebuild darknet-nnpack (run make clean, then make)