tensorflow-yolo-ios
Training / Converting Model
First off, thank you for putting this repo together!
In preparation for training my own model on YOLO and converting it to TF, I wanted to prove out the pattern by taking the TinyYOLO VOC weights/config and recreating the frozen memmapped graph that you provide in this repo. After going through the Darkflow docs as well as TensorFlow for Mobile Poets, I came up with the following process for converting the graph. The graph loads and runs on my iPhone; however, it seems to randomly identify non-existent objects (typically 20-30 per second), and the app eventually runs out of memory and crashes. I'll provide the steps I took for converting the graph below. Could you please share how you created your graph or provide some feedback?
BTW, the first big red flag I see is when I freeze the TinyYOLO VOC weights using Darkflow my graph is 61MB, but yours is 180MB?!
Here's my process:
Grab TinyYOLO VOC weights and config (I've used PJ's config as well as yours, there are a few differences):
wget https://raw.githubusercontent.com/pjreddie/darknet/master/cfg/tiny-yolo-voc.cfg
wget https://pjreddie.com/media/files/tiny-yolo-voc.weights
Next I validate that the weights/configs work before freezing:
flow --model tiny-yolo-voc.cfg --load tiny-yolo-voc.weights --gpu 0.9 --json
Next I freeze the graph and validate it still identifies the objects (everything works great):
flow --model tiny-yolo-voc.cfg --load tiny-yolo-voc.weights --savepb --gpu 0.9
flow --pbLoad built_graph/tiny-yolo-voc.pb --metaLoad built_graph/tiny-yolo-voc.meta --imgdir sample_img/ --json --gpu 0.9
Then I optimize the graph for inference and test again (still works):
bazel-bin/tensorflow/python/tools/optimize_for_inference \
--input=/yoloz/tiny/built_graph/tiny-yolo-voc.pb \
--output=/yoloz/tiny/built_graph/optimized_graph.pb \
--input_names=input \
--output_names=output \
--frozen_graph=True
Then I quantize and round the graph and validate (still works, but some accuracy drops as expected):
bazel-bin/tensorflow/tools/quantization/quantize_graph \
--input=/yoloz/tiny/built_graph/optimized_graph.pb \
--output=/yoloz/tiny/built_graph/rounded_graph.pb \
--output_node_names=output \
--mode=weights_rounded
Finally I enable memory mapping in the graph:
bazel-bin/tensorflow/contrib/util/convert_graphdef_memmapped_format \
--in_graph=/yoloz/tiny/built_graph/rounded_graph.pb \
--out_graph=/yoloz/tiny/built_graph/mmapped_graph.pb
I've tried cutting out steps such as quantizing and memory mapping, but I still run into the same issue where it just randomly identifies non-existent objects. Here's an example from the iOS logs of it just seeing tons of non-existent objects:
48 boxes
cow 0.26528638082959 (-716.969760456415, -15.0664010845897, 2253.67373242581, 124.151942685188)
person 60.9147208802824 (406.176437143808, 247.405798095825, 109.771308640561, 79.8200096378618)
sheep 39.6371332656176 (701.706673330065, 173.24449355272, 162.208712026626, 0.201273832795577)
aeroplane 14.7475192095906 (422.813935954874, -672.132180748015, 0.435470956695015, 1851.95653674813)
bottle 34.4575969911148 (-3691.35341221254, 230.732243466384, 7407.03830472379, 118.209991155339)
bottle 15.252358611848 (297.912561621646, 253.622715234777, 103.666117939307, 156.487766640486)
person 2.02568703175324 (-1223.67370049141, 403.496686780806, 2378.22481550453, 17.7921825504054)
person 46.802289779278 (1029.42971074394, 59.8810085550473, 0.0603049292979604, 6.32803476229502)
person 5.02492657361699 (-242.108751411802, -179.995665610159, 768.943885578703, 172.626105837104)
bottle 8.12876115292863 (1.31502027136079, 112.241214824463, 83.795759190395, 137.321676756878)
bottle 17.4926418818899 (-2195.20377769767, 44.7168849437375, 5036.41495845117, 109.233492394215)
bottle 10.0492816390524 (-348.11391411164, -823.872093321802, 1358.45449526415, 976.078075378695)
horse 11.7777611150507 (-5199.91062956631, 209.564810465238, 10445.2793049104, 193.588701516892)
sofa 2.90329754699323 (-580.55927439087, -1201.57429169997, 605.32719492025, 2319.642035421)
sheep 2.31771116209555 (335.281302477022, -265.169598244711, 303.744002610247, 615.620747373669)
bus 10.6498479187201 (-24099.0875846469, -55.1337276379371, 48149.893631693, 531.781720515113)
person 2.18346198169684 (245.163570981687, -659.499791728503, 452.623426099545, 547.663504813206)
tvmonitor 0.347630301506097 (250.864021547363, 230.171159402745, 178.092564045225, 107.932911465985)
bottle 9.95446473408515 (578.688593486094, 225.090434384003, 29.7744054641031, 5.09407076904225)
cow 9.39130423078825 (751.656599451878, 494.282862500461, 1.09127926019101, 0.0376114695453585)
cow 16.8039449513635 (-636.942771411139, -4146.12791663897, 105.520065759871, 8539.38077769234)
cow 15.3530701945051 (10.8899058785902, -2004.09817131406, 28.3976730276271, 4464.49136457081)
cow 6.18686360683682 (-1872.86129217465, 317.97697050232, 4741.10130713052, 5.93505212236634)
cow 41.8923066953403 (-5090.64552303113, -433.732989685124, 11922.7837276564, 842.91523490483)
person 7.01425882730575 (196.017857693368, 10.762791441239, 0.29155214652136, 133.058044741693)
cow 21.7848502025759 (-215.860215658783, -249.177102701316, 267.580321846654, 392.007555619633)
tvmonitor 5.30925632140537 (528.546262539762, 472.202107788961, 27.6917343533151, 152.522967437084)
tvmonitor 5.94156905219603 (565.858886985312, -920.189522495108, 318.533761354198, 2161.87640869381)
bus 8.98683072242511 (-6645.9226178394, 139.156692264233, 14659.6244466323, 374.677935877221)
cow 8.49134918033451 (432.454714344806, -2111.24702880055, 1.80568608554533, 4125.33471449469)
cow 23.8006362877428 (473.681844977678, -9128.2631528842, 239.033443917267, 18220.9025116129)
cow 32.4949977552751 (138.42400573552, -18035.3928847779, 387.916661080797, 36465.6768464101)
person 8.77725236878837 (502.371345131116, 210.051040886286, 42.5160061576193, 20.3697629750902)
diningtable 4.05980503916737 (-2256.01118752883, 343.210714264253, 5658.08545452834, 55.0627622079917)
tvmonitor 10.4713313523678 (-1818.03733710679, 194.896162589751, 4932.72937689972, 177.472580818184)
tvmonitor 12.2691757717801 (-18778.2914532181, 128.871683567611, 37271.2059504048, 29.0849889945601)
pottedplant 14.22723577094 (-28173.1173244905, -208.326447582406, 56501.6760140867, 546.753047487035)
tvmonitor 13.8996401643171 (-11994.0013712802, 370.284081045691, 24092.6758117422, 19.6677094960988)
bottle 24.2488610989835 (-195.202285100031, 385.567696758806, 1132.05653811156, 12.4024115628028)
person 17.197114360953 (-451.579067209645, -418.735144345297, 0.0300114537946303, 1134.27591314155)
Any help you can provide is greatly appreciated! Thanks! 👍
@brianantonelli Hmm, that seems weird. My only question is which YOLO version you are using. The smaller one is tiny-yolo v1, I believe, and it seems you are using the second version.
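A quick way to tell the two versions apart is the length of the network's output tensor. The sketch below is only arithmetic, and its defaults are assumptions taken from the commonly distributed configs (tiny-yolo v1: 7 x 7 grid, 2 boxes per cell; tiny-yolo-voc: 13 x 13 grid at 416 x 416 input, 5 anchors), so check them against your own .cfg:

```python
def yolo_v1_output_size(side=7, boxes=2, classes=20):
    # v1 (assumed layout): per grid cell, class probabilities plus
    # (confidence + 4 coordinates) for each box.
    return side * side * (classes + boxes * 5)


def yolo_v2_output_size(side=13, anchors=5, classes=20):
    # v2 region layer (assumed layout): per cell and anchor,
    # 4 coordinates + objectness + class probabilities.
    return side * side * anchors * (5 + classes)


if __name__ == "__main__":
    print("tiny-yolo v1 (VOC):", yolo_v1_output_size())
    print("tiny-yolo-voc (v2):", yolo_v2_output_size())
```

If the app decodes the output as a v1 vector but the graph produces a v2 feature map, the decoded boxes and scores would be meaningless, which would be consistent with the log above.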
@brianantonelli Did you resolve this problem? I am facing the same issue.
@KleinYuan Could you give me a step-by-step solution for making the .pb file used in this repo? I'm using darkflow as you suggested. I have spent a couple of days on it but still have not had any success. Thanks in advance!
@macro-dadt Sure, I am happy to do that. Before I do, could you elaborate on which specific step you get stuck at? Darkflow should be very straightforward; the only trick is to make sure you use the correct version of tiny-yolo (v1).
Thank you for your quick reply. This is what I did:
- In cfg/v1 I created my own cfg named "yolo-tiny2c" (copied from yolo-tiny4c, changing only classes=4 -> classes=2 and output=686 -> output=588).
- Created a VOC dataset with labelImg.
- Trained with my own dataset:
flow --model cfg/v1/yolo-tiny2c.cfg --load bin/tiny-yolo-voc.weights --train --annotation train/v1/Annotations --dataset train/v1/Images
- Saved the graph and weights to a protobuf file:
flow --model cfg/v1/yolo-tiny2c.cfg --load bin/tiny-yolo-voc.weights --savepb
- Tested with some images by loading the .pb and .meta files. So far everything works great.
- Next, I tried to freeze the graph:
bazel-bin/tensorflow/python/tools/freeze_graph \
--input_graph=/Users/macro/yolo-tiny2c.pb \
--input_checkpoint=/Users/macro/yolo-tiny2c.pb-1000 \
--output_node_names=output \
--input_binary \
--output_graph=graph.pb
I got this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 51: invalid start byte
I use TensorFlow 1.3.0 and Python 3.6. Did I do something wrong? I'm very new to this area, so please help! Thanks a lot in advance.
@macro-dadt This error usually occurs when you use a different version of TensorFlow to freeze the graph. Have you tried different versions? Also, the TensorFlow under bazel-bin/tensorflow may not be the same version as your default Python's TensorFlow, which ran the previous steps. It looks like a darkflow issue. However, if I were you, I would just modify the function in darkflow that saves the model so that it retrieves the tensors you need. It's just TensorFlow, right? You can do it easily, buddy. Example: https://github.com/KleinYuan/cnn/blob/master/tools/graph_freezer.py#L20
I used the commands listed by @brianantonelli to convert my yolo v1 model (.cfg and .weights) into a frozen graph and its meta file
flow --model tiny-yolo-voc.cfg --load tiny-yolo-voc.weights --savepb --gpu 0.9
I have 2 questions:
- How do I process the output, which is a 588-element vector (2 classes, 7 x 7 grid, 2 boxes per cell)?
- How can I port this frozen graph for use with OpenCV? I get the following error:
cv2.error: OpenCV(3.4.2) /io/opencv/modules/dnn/src/dnn.cpp:401: error: (-2:Unspecified error) Can't create layer "truediv" of type "RealDiv" in function 'getLayerInstance'
If anybody has faced the above 2 problems, or knows a way to solve it, I would really appreciate the help.
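For the first question, here is a minimal sketch of how a 588-element YOLO v1 output could be decoded. The vector layout (all class probabilities, then all box confidences, then all box coordinates) and the squaring of width and height are assumptions based on the usual YOLO v1 convention, and `decode_yolo_v1` is a hypothetical helper name, so verify both against the post-processing code of the darkflow version you trained with:

```python
import numpy as np


def decode_yolo_v1(net_out, side=7, boxes=2, classes=2,
                   threshold=0.2, square_wh=True):
    """Split a flat YOLO v1 output vector into scored, normalized boxes."""
    cells = side * side
    net_out = np.asarray(net_out, dtype=np.float32).ravel()
    if net_out.size != cells * (classes + boxes * 5):
        raise ValueError("unexpected output length: %d" % net_out.size)

    # Assumed layout: [class probs | box confidences | box coordinates].
    prob_end = cells * classes
    conf_end = prob_end + cells * boxes
    probs = net_out[:prob_end].reshape(cells, classes)
    confs = net_out[prob_end:conf_end].reshape(cells, boxes)
    coords = net_out[conf_end:].reshape(cells, boxes, 4)

    detections = []
    for cell in range(cells):
        row, col = divmod(cell, side)
        for b in range(boxes):
            # Class score = class probability * box confidence.
            scores = probs[cell] * confs[cell, b]
            best = int(np.argmax(scores))
            if scores[best] < threshold:
                continue
            x, y, w, h = (float(v) for v in coords[cell, b])
            if square_wh:
                # Assumption: the network predicts sqrt(width), sqrt(height).
                w, h = w * w, h * h
            detections.append({
                "class_id": best,
                "score": float(scores[best]),
                "x": (x + col) / side,  # box center, fraction of image width
                "y": (y + row) / side,  # box center, fraction of image height
                "w": w,
                "h": h,
            })
    return detections
```

The returned centers and sizes are fractions of the image, so they still need to be scaled by the image width and height, and overlapping boxes still need non-maximum suppression.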