py-faster-rcnn on new dataset

Open leefionglee opened this issue 9 years ago • 107 comments

Hi all: I would like to train py-faster-rcnn on my own dataset, but what exactly is the data format, i.e. for the images, annotations, and train/val splits? Can anyone post an example annotation file here? I found one post, https://github.com/zeyuanxy/fast-rcnn/tree/master/help/train, in which the annotations are plain text, but PASCAL uses XML. Which one is it? Any reference blogs or tutorials would be highly appreciated. Thanks

leefionglee avatar Nov 20 '15 05:11 leefionglee

PASCAL uses an XML file; look at the original VOCdevkit code (readme and helper functions). The XML can basically be converted to a struct in MATLAB, and you can see that it holds the class name, bounding boxes, etc.
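For reference, here is a minimal sketch of reading a VOC-style XML annotation in Python with the standard library. The annotation text, file name, and box values below are made up for illustration, and parse_voc_objects is a hypothetical helper, not a function from VOCdevkit or py-faster-rcnn.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the PASCAL VOC layout (values are made up).
VOC_XML = """
<annotation>
  <filename>000001.jpg</filename>
  <size><width>353</width><height>500</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <difficult>0</difficult>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_objects(xml_text):
    """Return a list of (class_name, [xmin, ymin, xmax, ymax]) tuples."""
    root = ET.fromstring(xml_text.strip())
    objects = []
    for obj in root.findall('object'):
        name = obj.find('name').text.strip()
        box = obj.find('bndbox')
        coords = [int(box.find(tag).text)
                  for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
        objects.append((name, coords))
    return objects


objects = parse_voc_objects(VOC_XML)
print(objects)
```

A txt-based dataset would need its own parser, but it should produce the same kind of information: one class name and one box per object.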

athus1990 avatar Nov 30 '15 21:11 athus1990

Hi @leefionglee, I trained fast-rcnn on another dataset last week. Whether the annotations are XML or txt does not matter. In /fast-rcnn/lib/datasets/inria.py there is a function named _load_inria_annotation that deals with the annotation file :) I used XML for my annotations.

yileo19920925 avatar Dec 11 '15 08:12 yileo19920925

@yileo19920925 Can you please share some insights on your training steps? Is it as simple as getting the data into the format of VOC2007 trainval and test, placing it into the right folder paths, and just calling alt_opt.sh?

Do we need to write any Python code or modify any other files, as with Fast R-CNN training (factory.py etc.)? Thanks in advance

kshalini avatar Dec 25 '15 17:12 kshalini

@kshalini Hi, I trained my dataset using Fast (not Faster) R-CNN. But I think both Fast and Faster R-CNN require an imdb for the training process. To provide that imdb we need to write a dataset class or modify pascal_voc.py. The role of factory.py is to register this kind of dataset, like pascal_voc.

This helped me a lot when I trained on my data (I hope it will be useful to you): https://github.com/coldmanck/fast-rcnn/blob/master/README.md

About alt_opt: I did not use it for training. In Fast R-CNN you can use ./tools/train_net.py. In Faster R-CNN, ./tools/train_faster_rcnn_alt_opt.py may be more straightforward.

yileo19920925 avatar Jan 01 '16 08:01 yileo19920925

@leefionglee @athus1990 @kshalini

I trained with ImageNet, but hit an error in /lib/rpn/anchor_target_layer.py.

#error
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "./tools/train_faster_rcnn_alt_opt.py", line 132, in train_rpn
    max_iters=max_iters)
  File "/home/lijiajun/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 136, in train_net
    model_paths = sw.train_model(max_iters)
  File "/home/lijiajun/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 104, in train_model
    self.solver.step(1)
  File "/home/lijiajun/py-faster-rcnn/tools/../lib/rpn/anchor_target_layer.py", line 137, in forward
    gt_argmax_overlaps = overlaps.argmax(axis=0)
ValueError: attempt to get argmax of an empty sequence

When it reaches the "only keep anchors inside the image" step, len(inds_inside) is equal to 0.

    # only keep anchors inside the image
    inds_inside = np.where(
        (all_anchors[:, 0] >= -self._allowed_border) &
        (all_anchors[:, 1] >= -self._allowed_border) &
        (all_anchors[:, 2] < im_info[1] + self._allowed_border) &  # width
        (all_anchors[:, 3] < im_info[0] + self._allowed_border)    # height
    )[0]

How can I solve this?

leejiajun avatar Jan 14 '16 03:01 leejiajun

@leefionglee @kshalini Hi! I've trained Faster R-CNN on ImageNet (200 categories), hope this can help you! https://github.com/andrewliao11/py-faster-rcnn/blob/master/README.md

andrewliao11 avatar Feb 03 '16 13:02 andrewliao11

@leejiajun Hi leejiajun, did you solve this problem? I have the same problem.

banxiaduhuo avatar Feb 14 '16 09:02 banxiaduhuo

@leejiajun @banxiaduhuo Hi, did you solve the problem? I got the same issue when training ZF model on custom dataset

LiberiFatali avatar Feb 19 '16 09:02 LiberiFatali

@banxiaduhuo @LiberiFatali

It happens because the width-to-height ratio of some images is too small or too large. You can remove those images to solve the problem.
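A rough sketch of how such images could be screened out before training. It assumes the default resize rule (short side scaled to 600, long side capped at 1000); the min_side threshold of 128 and the helper names are hypothetical and should be adapted to the anchor sizes you actually use.

```python
def rescaled_size(width, height, target_size=600, max_size=1000):
    """Size after scaling the short side to target_size,
    capped so that the long side does not exceed max_size."""
    scale = float(target_size) / min(width, height)
    if scale * max(width, height) > max_size:
        scale = float(max_size) / max(width, height)
    return int(round(width * scale)), int(round(height * scale))


def is_trainable(width, height, min_side=128):
    """Reject images whose short side becomes too small after rescaling.
    min_side is an assumed threshold, not a value from the repo."""
    new_w, new_h = rescaled_size(width, height)
    return min(new_w, new_h) >= min_side


# Hypothetical (name, width, height) entries.
images = [('a.jpg', 500, 375), ('b.jpg', 2000, 100)]
kept = [name for name, w, h in images if is_trainable(w, h)]
print(kept)
```

The idea is that a very elongated image ends up with a short side smaller than every anchor, so no anchor lies fully inside the image and inds_inside comes back empty.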

leejiajun avatar Feb 22 '16 13:02 leejiajun

Right, some images with an extreme width-to-height ratio cause this. Thanks!

LiberiFatali avatar Feb 23 '16 04:02 LiberiFatali

@leejiajun How did you train on the ImageNet dataset? Could you write some instructions for this step? Thank you.

ck196 avatar Mar 03 '16 06:03 ck196

@monkeykju All the work is in coding lib/datasets/pascal_voc.py and lib/datasets/factory.py. You could read this: https://github.com/andrewliao11/py-faster-rcnn/blob/master/README.md

leejiajun avatar Mar 03 '16 09:03 leejiajun

@leefionglee Can you tell me why the training image ratio would cause such an error? Is it because the RPN produces no RoI that can be matched to ground truth at that ratio? If so, why would an image that is too big or too small have this effect? Thank you.

daf11865 avatar Mar 13 '16 17:03 daf11865

@leefionglee I followed inria.py (two classes), but when I train on my dataset the machine always crashes. I forgot to prepare negative images; could this cause the issue, or might there be another reason? For example: [screenshot: qq 20160315112658] How do I make the negative images and annotations? Thanks.

karenyun avatar Mar 15 '16 03:03 karenyun

I remember that py-faster-rcnn doesn't use selective search.

LiberiFatali avatar Apr 15 '16 07:04 LiberiFatali

@MarkoArsenovic I wanna train the net on my own dataset. I don't understand how to modify the python files before training. I can't find any repo where those steps are explained.

aragon111 avatar Apr 20 '16 10:04 aragon111

Hi, Let's have a look there

deboc avatar Apr 20 '16 11:04 deboc

Thanks @deboc. I used coldmanck's instructions for training fast-rcnn. The problem is that a few steps are different (constructing the imdb file and editing the shell script).

aragon111 avatar Apr 20 '16 11:04 aragon111

OK, tools/train_net.py is in fact an old script for fast-rcnn. You can launch the training directly without any shell script (example for alt_opt training):

$ cd <py-faster-rcnn folder>
$ ./tools/train_faster_rcnn_alt_opt.py --gpu 0 --net_name <model name> --weights <pretrained .caffemodel> --imdb <dataset name>_train

or just use tools/train_faster_rcnn_alt_opt.py

deboc avatar Apr 20 '16 14:04 deboc

OK, we definitely lack a real tutorial dedicated to py-faster-rcnn with a simple dataset. Chances are your inria.py needs 2 more methods (rpn_roidb & _load_rpn_roidb). I'm sorry I don't remember exactly how I built mine, but if you want to finish the job have a look here

deboc avatar Apr 21 '16 13:04 deboc

Hello @deboc , I tried running the training as you explained but after I typed

./train_faster_rcnn_alt_opt.py --gpu 0 --solver /models/pascal_voc/ZF/faster_rcnn_end2end/solver.prototxt --weights /data/faster_rcnn_models/ZF_faster_rcnn_final.caffemodel --imdb amphora_train

I got this error: train_faster_rcnn_alt_opt.py: error: unrecognized arguments: --solver /models/pascal_voc/ZF/faster_rcnn_end2end/solver.prototxt

I don't understand what is wrong with the solver I chose.

aragon111 avatar Apr 21 '16 16:04 aragon111

You are using the alt_opt script with an end2end solver, so it won't work.

For alt_opt training the argument is not --solver <solver.prototxt> but --net_name <model_name>, which specifies the model folder. train_faster_rcnn_alt_opt.py then automatically looks for solvers in models/<model_name>/faster_rcnn_alt_opt/, where you should put the solvers for alt_opt training.

deboc avatar Apr 21 '16 16:04 deboc

Thank you @deboc. I understand now, and I edited the faster_rcnn_test file as I did in fast-rcnn. Now I get an error about the stage1_rpn_train file (which is in the right directory).

I0421 19:04:00.501624 20093 solver.cpp:61] Creating training net from train_net file: models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_rpn_train.pt
F0421 19:04:00.501646 20093 io.cpp:34] Check failed: fd != -1 (-1 vs. -1) File not found: models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_rpn_train.pt

aragon111 avatar Apr 21 '16 17:04 aragon111

@MarkoArsenovic did you figure it out how to train the py-faster-rcnn using another dataset?

aragon111 avatar Apr 25 '16 11:04 aragon111

@MarkoArsenovic I'm in the same situation...I found all the necessary steps for fast rcnn but still can't get any solution for py-faster-rcnn

aragon111 avatar Apr 25 '16 12:04 aragon111

I'm updating zeyuanxy's tutorial for py-faster-rcnn, I'll let you know

deboc avatar Apr 25 '16 12:04 deboc

@deboc that sounds great! Thank you!

aragon111 avatar Apr 25 '16 13:04 aragon111

I wasn't aware of the last updates with the config, that puzzled me a little ;) Here is my contribution, I hope it will help: https://github.com/deboc/py-faster-rcnn/blob/master/help/Readme.md

deboc avatar Apr 26 '16 09:04 deboc

No you don't need to rename since the output dimension is not the same. Your issue looks really weird. Can you double check the 4 stage{something}train.pt files in your model folder ? The 84 shape in your error definitely seems to be an old 21x4...

deboc avatar Apr 26 '16 22:04 deboc

Hi Can someone help me with running this ResNet model?

I modified the ResNet prototxt file to have the ROI proposals and was able to fine-tune it successfully in 50K iterations on the val set:

Train net output #3: rpn_loss_bbox = 0.0179207 (* 1 = 0.0179207 loss)

Now I am trying to use the final caffemodel along with the ResNet deploy.prototxt, modified as a test prototxt (removing lr params and the top layer, and appending the input data layer). However, I keep getting this error:

File "./lib/rpn/anchor_target_layer.py", line 116, in forward
  (all_anchors[:, 2] < im_info[0][1] ) & # width
ValueError: operands could not be broadcast together with shapes (17100,) (600,800)

I printed a few debug statements but was unsuccessful in troubleshooting. Any help would be appreciated.

nazneenrajani avatar Apr 27 '16 00:04 nazneenrajani

Hi,

Has anyone tried to parallelize the calls to im_detect? I am facing issues with Caffe initialization when I make calls to the network. Exact error details are below.

Background: I have pre-loaded the network and am passing the net parameter in pooled calls.

Error Statement:

F0428 16:30:55.227715 19959 syncedmem.hpp:18] Check failed: error == cudaSuccess (3 vs. 0) initialization error

Anubhav7 avatar Apr 28 '16 11:04 Anubhav7

@Anubhav7 Out of memory?I guess.

leejiajun avatar Apr 29 '16 08:04 leejiajun

@leejiajun No, it's not one of those times :)

Anubhav7 avatar Apr 29 '16 08:04 Anubhav7

Hi @deboc, thank you for the very helpful training instructions in https://github.com/deboc/py-faster-rcnn/blob/master/help/Readme.md. I finally trained on the INRIA dataset successfully. But how do I run the test? I tried to use tools/test_net.py, but it needs a test.prototxt file.

anas-899 avatar May 15 '16 12:05 anas-899

Hi, how did you do the refactoring for the test set? It is not clear how to create the symlinks for the test set. Please help.

sahuvaibhav avatar Jun 10 '16 13:06 sahuvaibhav

Hi all, Not sure that I am doing this the right way but it seems like the right places to ask some questions.

I am trying to get used to training py-faster-rcnn and I have trained it successfully on the B3DO dataset (though the result is only around 0.3 mAP). My goal is to perform object recognition using a Kinect on RGB and depth separately. However, I have not come across articles dealing with object recognition using depth only, except some {bag of features (HOG/HDD) and classification} approaches. Are there any limitations I don't see that would rule out using Faster R-CNN on depth data only?

I also tried to train Faster R-CNN without the models pretrained on ImageNet. First on B3DO, but without success (I assumed it was because of the size of the dataset), and then on PASCAL VOC (both end-to-end training and alt opt). During training the loss does not seem to converge, and when I try to run test_net.py I get an error that I did not get with the pretrained model:

File "/home/benjamin/FastRCNN/py-faster-rcnn/tools/../lib/datasets/voc_eval.py", line 148, in voc_eval
  BB = BB[sorted_ind, :]
IndexError: too many indices for array

So if I would like not to use the pretrained model, are there any steps besides leaving --weights out of the training options?

PS: I am using the ZF model. If I am breaking any rules that I am unaware of, please tell me. Any comment would be useful, thanks!

BenjaminTT avatar Jun 17 '16 04:06 BenjaminTT

Hi @deboc, Looking at the link below I followed the steps you have given for training faster RCNN using alternate optimization scheme on custom dataset. https://github.com/deboc/py-faster-rcnn/blob/master/help/Readme.md

I have 37 classes of data including background. I have changed the num_outputs accordingly. But I receive the following error again and again and can't find the mistake. I'll be extremely grateful if somebody can help me address this issue. im_proposals are calculated for all the training images and then an error is showing up as below in the attached image.

fasterrcnn error

tharuniitk avatar Jun 18 '16 17:06 tharuniitk

Hi, first, are you sure you have removed the cache?

$ cd
$ rm data/cache/voc_2016_train_gt_roidb.pkl
$ rm output/faster_rcnn_alt_opt/voc_2016_train/vgg_*

deboc avatar Jun 19 '16 09:06 deboc

Hi @deboc, I have cleared the cache, and I have now trained the model successfully. But while testing I hit the same error as @BenjaminTT above. I am attaching it below. Grateful for your help.

Traceback (most recent call last):
  File "./tools/test_net.py", line 105, in
    test_net(net, imdb, max_per_image=args.max_per_image, vis=args.vis)
  File "/home/ashishkumar/py-faster-rcnn/tools/../lib/fast_rcnn/test.py", line 295, in test_net
    imdb.evaluate_detections(all_boxes, output_dir)
  File "/home/ashishkumar/py-faster-rcnn/tools/../lib/datasets/pascal_voc.py", line 341, in evaluate_detections
    self._do_python_eval(output_dir)
  File "/home/ashishkumar/py-faster-rcnn/tools/../lib/datasets/pascal_voc.py", line 304, in _do_python_eval
    use_07_metric=use_07_metric)
  File "/home/ashishkumar/py-faster-rcnn/tools/../lib/datasets/voc_eval.py", line 157, in voc_eval
    BB = BB[sorted_ind, :]
IndexError: too many indices for array

tharuniitk avatar Jun 20 '16 02:06 tharuniitk

I also encountered the same problem as @tharuniitk

bityangke avatar Jun 20 '16 15:06 bityangke

Hi @deboc,

I followed your guidance but got the following error:

File "./tools/train_faster_rcnn_alt_opt.py", line 210, in <module>
  cfg_from_file(args.cfg_file)
File "/home/caffe/py-faster-rcnn/tools/../lib/fast_rcnn/config.py", line 263, in cfg_from_file
  _merge_a_into_b(yaml_cfg, __C)
File "/home/caffe/py-faster-rcnn/tools/../lib/fast_rcnn/config.py", line 235, in _merge_a_into_b
  raise KeyError('{} is not a valid config key'.format(k))
KeyError: 'MODEL_DIR is not a valid config key'

I have no idea what I did wrong. Could you please give me some advice?

tohnperfect avatar Jun 21 '16 01:06 tohnperfect

Hi @tohnperfect, cfg.MODEL_DIR has been added in February, maybe you need to update your repo ?

deboc avatar Jun 21 '16 08:06 deboc

@deboc I just downloaded this git last week.

tohnperfect avatar Jun 21 '16 09:06 tohnperfect

It's MODELS_DIR.

deboc avatar Jun 21 '16 10:06 deboc

@tharuniitk, @bityangke: I think you're using the test step of voc dataset, but you need to write your own for the new dataset. For my tutorial on inria-person dataset the test step was not implemented, which I have done now. You can refer to this commit

deboc avatar Jun 21 '16 13:06 deboc

@deboc I've checked the code and it is the latest version.

tohnperfect avatar Jun 21 '16 14:06 tohnperfect

@tohnperfect Yes sorry. Like I said the parameter you are looking for is in fact "MODELS_DIR", not "MODEL_DIR". I think it's just that :)

deboc avatar Jun 21 '16 14:06 deboc

I see. Thank you very much @deboc.

tohnperfect avatar Jun 21 '16 15:06 tohnperfect

@deboc Are you planning to write a tutorial about testing on new datasets as well? I guess it would be a really useful tutorial.

tohnperfect avatar Jun 21 '16 23:06 tohnperfect

Hi @deboc, if the dataset has the same format as PASCAL VOC, the existing test implementation works. I trained on the B3DO dataset (with RGB only) and it worked (I obtained a mAP of ~0.32). However, problems appear when I am not using the pretrained model. In this case, whether I train on B3DO or PASCAL VOC, the test fails, and even during training the loss does not seem to converge. Is there any specific step needed in order to avoid using the pretrained models?

BenjaminTT avatar Jun 22 '16 07:06 BenjaminTT

@deboc Thank you very much. I am using the original VOC2007 dataset, but I have changed the annotations of the trainval set (added more object bounding boxes).

bityangke avatar Jun 23 '16 02:06 bityangke

@tohnperfect I don't think testing needs a complete tutorial, but I can add some shell commands for it to my existing training tutorial some day.

@BenjaminTT I'm not sure why your test phase doesn't work. Have you tried with a finetuned architecture like in my tutorial (finetuning is supposed to be the more complicated case)? As for the complete training on PASCAL VOC... well, that's supposed to take some time! Have you considered retraining the last layers only (setting the lower layers' LR to 0)? Debugging will be much faster then.

deboc avatar Jun 23 '16 16:06 deboc

@deboc, thank you for your insight. I am using end-to-end training so far, and after some investigation, the reason the test does not work is that the text file with the detections is empty. I am not sure, but I think it may be because even though the training ran without error, it did not succeed and the net is unable to perform detections. I will also try your method.
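A quick way to confirm this before running the evaluation is to look for empty result files. The sketch below is only an illustration: empty_result_files is a hypothetical helper and the file names are made up. As far as I can tell, an empty file yields a 1-D empty array in voc_eval, which is why BB[sorted_ind, :] raises "too many indices".

```python
import os
import tempfile


def empty_result_files(paths):
    """Return the paths whose files contain no non-blank lines."""
    empty = []
    for path in paths:
        with open(path) as f:
            if not any(line.strip() for line in f):
                empty.append(path)
    return empty


# Demo with two made-up per-class result files.
result_dir = tempfile.mkdtemp()
dog_path = os.path.join(result_dir, 'det_test_dog.txt')
cat_path = os.path.join(result_dir, 'det_test_cat.txt')
with open(dog_path, 'w') as f:
    # One detection: image id, score, xmin, ymin, xmax, ymax.
    f.write('000001 0.912 48.0 240.0 195.0 371.0\n')
open(cat_path, 'w').close()  # no detections for this class

empty = empty_result_files([dog_path, cat_path])
print([os.path.basename(p) for p in empty])
```

If a class shows up in this list, the crash comes from the model producing no detections for it, not from the evaluation script itself.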

The reason I am interested in performing the complete training is that I am considering doing it on depth maps, and in that case the ImageNet model would not be relevant. So do you have any advice for this? I guess I should increase the number of iterations, but do I need to modify the learning rate or other values?

Thank you again.

BenjaminTT avatar Jun 24 '16 05:06 BenjaminTT

What do I need to provide in test.mat when it is loaded by _selective_search_roidb? How is the AP affected if I set __C.TEST.HAS_RPN to True?

tohnperfect avatar Jun 27 '16 01:06 tohnperfect

@BenjaminTT, @tharuniitk Did you find out the problem? I am facing the same error "BB = BB[sorted_ind, :] IndexError: too many indices for array" after executing the sh script to train and test the network. I am using VGG_CNN_M_1024, end2end configuration and pretrained imagenet_model . Also, using the original VOC2007 dataset. I noticed that the losses are going up and down and never converge (70000 iterations). Anyone has any idea what is going on?

Thank you!

fernandorovai avatar Jun 28 '16 22:06 fernandorovai

@fernandorovai: I have the same problem as you. I just removed the initial weights in the script file faster_rcnn_end2end.sh. Does anyone know what is happening?

tiepnh avatar Jun 30 '16 04:06 tiepnh

@fernandorovai @BenjaminTT @tharuniitk: I just got help from @DiegoPortoJaccottet. He said that the training must use the pretrained network. If we do not use the pretrained network, a caffemodel is still created, but it cannot perform detection/classification.

@deboc: in your tutorial I see that you use the pretrained network. If I create a new network myself, is there any way to train Faster R-CNN without a pretrained network, or how do I create a pretrained network for the new network?

tiepnh avatar Jul 01 '16 08:07 tiepnh

Thank you @tiepnh. Does anyone know if there is a way around this? I still do not really understand why it is a necessity and not just something to speed up the learning.

BenjaminTT avatar Jul 01 '16 09:07 BenjaminTT

To train with your own network, I think you need to define the network in Caffe prototxt first, then train it on labeled datasets to get Caffe model. After that, you can use that model(trained by Caffe) in py-faster-rcnn. You also need to insert RPN layer into original prototxt to make it work in py-faster-rcnn.

LiberiFatali avatar Jul 02 '16 09:07 LiberiFatali

Thank you very much for this information @LiberiFatali. Can anyone explain why py-faster-rcnn cannot be trained fully and directly on the dataset we are interested in? I do not see where the constraint that imposes a pretrained network comes from. Tell me if I am wrong, but if you do not specify the weights for training, aren't they generated by some initialization function (randomly, or something else)? In that case the training should still be possible, just more time-consuming, no?

BenjaminTT avatar Jul 12 '16 08:07 BenjaminTT

Hi @BenjaminTT, not sure whether you found the solution or not. In case you still have no answer, check item #238 on training the network without a pretrained network.

tiepnh avatar Jul 18 '16 10:07 tiepnh

@leejiajun @LiberiFatali @banxiaduhuo

Hello, I encountered the same problem when training py-faster RCNN on imagenet data set. I am using this repository. In the dataset preparation code /tools/assemble_imagenet_train.py there is a section to filter out the BBOXes with inappropriate aspect ratio. Could you please give me the values you used for end2end training ?

raldam avatar Jul 20 '16 09:07 raldam

Regarding the error I encountered, I would be very happy if someone could help me out. I followed this link https://github.com/deboc/py-faster-rcnn/blob/master/help/Readme.md and after executing this command

$ ./tools/train_faster_rcnn_alt_opt.py --gpu 1 --net_name INRIA_Person --weights data/imagenet_models/VGG_CNN_M_1024.v2.caffemodel --imdb inria_train --cfg ./config.yml

I had this error:

Output will be saved to /home/sounak/py-faster-rcnn/output/default/train
Filtered 0 roidb entries: 1228 -> 1228
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0823 18:42:11.847455 4797 io.cpp:36] Check failed: fd != -1 (-1 vs. -1) File not found: $/home/sounak/py-faster-rcnn/models/INRIA_Person/faster_rcnn_alt_opt/stage1_rpn_solver60k80k.pt
*** Check failure stack trace: ***

Please help

sounakdey avatar Aug 23 '16 16:08 sounakdey

@sounakdey

  1. Check your folder for the file; it says File not found: $/home/sounak/py-faster-rcnn/models/INRIA_Person/faster_rcnn_alt_opt/stage1_rpn_solver60k80k.pt
  2. The problem might be that you need to create the folder faster_rcnn_alt_opt/ under INRIA_Person, then copy all the .pt files into it.
  3. Modify the train_net path in the file stage1_rpn_solver60k80k.pt, and in the other similar .pt files.
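The third check can be automated with a small script. This is only a sketch: missing_train_net is a hypothetical helper, and it assumes the solver file contains a line of the form train_net: "<path>" with the path given relative to the repository root.

```python
import os
import re
import tempfile


def missing_train_net(solver_path, repo_root):
    """Return the train_net path named in the solver file if that file
    does not exist under repo_root, otherwise None."""
    with open(solver_path) as f:
        text = f.read()
    match = re.search(r'train_net:\s*"([^"]+)"', text)
    if match is None:
        return None  # no train_net entry to check
    net_path = os.path.join(repo_root, match.group(1))
    return None if os.path.isfile(net_path) else match.group(1)


# Demo in a temporary directory that mimics the layout from the log above.
root = tempfile.mkdtemp()
model_dir = os.path.join(root, 'models', 'INRIA_Person', 'faster_rcnn_alt_opt')
os.makedirs(model_dir)
solver = os.path.join(model_dir, 'stage1_rpn_solver60k80k.pt')
with open(solver, 'w') as f:
    f.write('train_net: "models/INRIA_Person/faster_rcnn_alt_opt/'
            'stage1_rpn_train.pt"\n')

print(missing_train_net(solver, root))
```

Running it over every solver file in the model folder before launching training should catch a wrong or stale train_net path early.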

useebear avatar Oct 18 '16 15:10 useebear

@BenjaminTT @fernandorovai

I wonder if you guys have found a way out of this problem. I have exactly the same problem. I have made a few changes to the ZF (end2end) net to work with RGB and depth, a sort of fusion, but the structure is basically the same. (I am actually testing it on RGB and RGB; I wanted to make sure it works before I work with depth images.) And indeed it does not converge in training even after 70000 iterations, and the problem arises because there are no detections at the testing stage: none for the majority of the classes, and very low accuracy for a few. I would be really happy if you could share your insights and experience with this problem. Thanks.

duygusar avatar Oct 21 '16 14:10 duygusar

@duygusar Hi, it may be because the network is not actually learning on all the layers. You have to make sure that you initialize all of them (if you use the stock prototxt, some of them are not initialized because they rely on the weights of the pretrained models). If you have already done that and still have this problem, try increasing the learning rate a little (since you are not using the pretrained model, you have more to learn, so to speak). Hope this helps.

BenjaminTT avatar Oct 23 '16 03:10 BenjaminTT

@BenjaminTT Thanks for the tip, I will try running with a higher learning rate. I recently tried initializing all convolution and inner product layers, so I was able to make it run without any errors (such as the one caused during eval by the empty detection sets). However, I only get a mAP of 0.25 even at 200,000 iterations.

duygusar avatar Oct 24 '16 15:10 duygusar

My problem is that I don't get any evaluation at all :( ===> IndexError: too many indices, at BB = BB[sorted_ind, :]

I need help. Thanks.

mbarkiwafa avatar Oct 25 '16 18:10 mbarkiwafa

@mbarkiwafa Most likely your model is not converging either. The loss should decrease as the model trains; that is not the case for me, it won't converge. Having no detections for one or more classes leads to that problem.

duygusar avatar Oct 27 '16 17:10 duygusar

anyone tried training for two class using the inria dataset with "faster_rcnn_end2end.yml"?

hengck23 avatar Oct 28 '16 14:10 hengck23

Hello everyone, I am trying to train on the WIDER FACE database in the py-faster-rcnn environment. I have modified the database and finished modifying the relevant files. Now the training runs to completion, but the mAP is always 0. Does anyone know what happened? Thanks.

hawkjk avatar Nov 07 '16 12:11 hawkjk

@leejiajun @LiberiFatali @deboc Hi, I am training Faster R-CNN end-to-end with my own dataset. I always get a core dump after a few iterations. I guess it is because the method _sample_rois in proposal_target_layer.py returns no RoIs, since I printed out the array of RoIs and it is empty. Any suggestions for a fix or workaround?

In addition, I found that the gt_boxes seem different from the annotations I have, and seem flipped. I guess this is because I used __C.TRAIN.USE_FLIPPED = True in config.py. Does this matter?

Thank you in advance.

slchiang avatar Nov 11 '16 05:11 slchiang

In my case, I modified https://github.com/rbgirshick/py-faster-rcnn/blob/master/lib/datasets/imdb.py#L218 and changed this line

    if gt_roidb is not None and gt_roidb[i]['boxes'].size > 0:

to

    if gt_roidb[i] is not None and gt_roidb[i]['boxes'].size > 0:

LiberiFatali avatar Nov 11 '16 07:11 LiberiFatali

@LiberiFatali Thanks for your quick response. This does not seem to work for me: if _sample_rois in ProposalTargetLayer gives empty RoIs, the code crashes when it reaches top[0].reshape(*rois.shape) in the forward function of ProposalTargetLayer. Do you know where the code that calls ProposalTargetLayer is, or how I can handle this case so that it skips the current bottom data and fetches new bottom data?

Thank you.

slchiang avatar Nov 11 '16 08:11 slchiang

@leejiajun I changed my dataset. At first I could train it on the original net, but when I changed the net it stopped working. The same dataset does not work on a different net; I got this error:

gt_argmax_overlaps = overlaps.argmax(axis=0)
ValueError: attempt to get argmax of an empty sequence

Do you have any solutions?

franciszzj avatar Nov 18 '16 04:11 franciszzj

@slchiang How about using

def filter_roidb(roidb):
    """Remove roidb entries that have no usable RoIs."""

at https://github.com/rbgirshick/py-faster-rcnn/blob/master/lib/fast_rcnn/train.py#L127 ?
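For anyone wondering what that filter does, here is a simplified pure-Python sketch of the idea. It is not the repo's implementation: the threshold values and the BG_THRESH_LO name are assumptions, and the entries are plain dicts with made-up overlap values.

```python
# Assumed thresholds; check your own config for the real values.
FG_THRESH = 0.5     # overlap at or above this counts as foreground
BG_THRESH_HI = 0.5  # background overlaps lie in [BG_THRESH_LO, BG_THRESH_HI)
BG_THRESH_LO = 0.1


def is_valid(entry):
    """An entry is usable if it has at least one foreground or background RoI."""
    overlaps = entry['max_overlaps']
    has_fg = any(o >= FG_THRESH for o in overlaps)
    has_bg = any(BG_THRESH_LO <= o < BG_THRESH_HI for o in overlaps)
    return has_fg or has_bg


def filter_roidb(roidb):
    """Remove roidb entries that have no usable RoIs."""
    kept = [entry for entry in roidb if is_valid(entry)]
    print('Filtered {} roidb entries: {} -> {}'.format(
        len(roidb) - len(kept), len(roidb), len(kept)))
    return kept


# Hypothetical entries: only the first one has usable RoIs.
roidb = [
    {'image': 'a.jpg', 'max_overlaps': [1.0, 0.3]},
    {'image': 'b.jpg', 'max_overlaps': [0.05]},
    {'image': 'c.jpg', 'max_overlaps': []},
]
kept = filter_roidb(roidb)
```

Note that a filter like this works per image on the stored roidb, so it would not catch an image whose proposals only turn out to be unusable at run time inside the network.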

LiberiFatali avatar Nov 18 '16 07:11 LiberiFatali

@LiberiFatali Thanks for your reply. But I guess this method is already called in train_net? Do you mean I need to call it somewhere else? So far, I started fine-tuning with low TRAIN.FG_THRESH and TRAIN.BG_THRESH_HI, and I increased the thresholds gradually. Unfortunately, the loss does not decrease, and the resulting model is not good. Any suggestions for improving the training results?

Thank you.

slchiang avatar Nov 18 '16 09:11 slchiang

@hawkjk Hi, I am also trying to train on my own dataset with py-faster-rcnn. The training runs to completion, but the mAP is always 0. Do you have any solutions? Does anyone know what happened? Thanks.

hongkaichen avatar Nov 21 '16 12:11 hongkaichen

@BenjaminTT @fernandorovai @duygusar I've encountered the same IndexError: too many indices for array problem when trying to train on my own dataset. Have you solved it? It's pretty weird and seemingly has something to do with how the results are written: detections.pkl isn't empty, but all of the results txt files are. The net also doesn't converge, but I guess that's more of a finetuning problem than a configuration one (although I may be wrong on that).

SHEKOLDA avatar Nov 23 '16 23:11 SHEKOLDA

This is not quite true @yileo19920925, the demo from @deboc assumes that your annotations files have .txt format. This can be verified by looking at the method _load_inria_annotation documentation, namely

Load image and bounding boxes info from txt files of INRIAPerson.

@leefionglee If your application uses .xml formatted annotations file, then take a look at pascal_voc.py and voc_eval.py

dantp-ai avatar Dec 05 '16 14:12 dantp-ai

try this: https://github.com/xinleipan/py-faster-rcnn-with-new-dataset

xinleipan avatar Feb 26 '17 05:02 xinleipan

@Franciszzj I hit the same error as you. Have you solved it?

hanjf12 avatar Feb 27 '17 08:02 hanjf12

@BenjaminTT Did you solve your problem by initializing the convolution and mult layers with random weights?

vabrishami avatar Apr 24 '17 16:04 vabrishami

@vabrishami Yes, for me the solution was to make sure all the weights were initialized; then the training worked. It even worked on a dataset of depth maps that were stored as PNG (so all three channels were the same).

BenjaminTT avatar Apr 25 '17 01:04 BenjaminTT

@BenjaminTT Is it possible for you to share your train.prototxt?

vabrishami avatar Apr 25 '17 10:04 vabrishami

@vabrishami Here it is, I hope it will help you (this is for 19 classes + background). train.txt

BenjaminTT avatar Apr 26 '17 04:04 BenjaminTT

@BenjaminTT Thanks Benjamin. It was really useful for me. I have another question: why did you use three identical channels? Why not a single-channel image?

vabrishami avatar Apr 26 '17 10:04 vabrishami

@vabrishami It is just a matter of convenience, I am using 2 networks trained independently, one on colour and one on Depth so it enables me to use the same file and also it enables to perform the training without preprocessing the images from the SUNRGBD dataset

BenjaminTT avatar Apr 27 '17 05:04 BenjaminTT

Hello everyone, I am trying to train Faster R-CNN as per https://github.com/deboc/py-faster-rcnn/tree/master/help

When I try to train using the command

./tools/train_faster_rcnn_alt_opt.py --gpu 0 --net_name fishclassify --weights data/imagenet_models/VGG_CNN_M_1024.v2.caffemodel --imdb fishclassify --cfg /home/fast-rcnn/py-faster-rcnn/config.yml

I get the following error:

File "/fast-rcnn/py-faster-rcnn/tools/../lib/datasets/factory.py", line 46, in get_imdb
  raise KeyError('Unknown dataset: {}'.format(name))

Any help on what might be causing this? I created fishclassify.py and fishclassify_eval.py under lib/datasets and modified factory.py.

When I open the .py files in PyCharm it shows an unresolved import for "import datasets". Any idea where I went wrong?

indsak avatar Jun 13 '17 03:06 indsak

Hi, according to my tutorial

--imdb is the full name of the database as specified in the lib/datasets/factory.py file (NB: don't forget to add the test/train suffix!)

So in your case

--imdb fishclassify_train

deboc avatar Jun 13 '17 05:06 deboc

Thank you very much. I tried that yesterday, but probably because of some other problem I was getting the same error. Now I tried your suggestion and the error is gone, but I get an AssertionError for the file paths. I will sort that out. Thank you very much.

indsak avatar Jun 13 '17 05:06 indsak

@deboc Hi, one more question: does the code strip the .xml extension in the newdataset.py file? When I run the code, I get an AssertionError for the image file because it searches for file.xml.jpg, which does not exist.

indsak avatar Jun 13 '17 06:06 indsak

Sorry, but which .xml extension? In my INRIA Person dataset example, I list the .txt annotation files and write a train.txt index file containing the image names without extension (they are stripped with a sed command line). You should adapt that to your new dataset.
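As a rough Python equivalent of the sed step, the sketch below strips the extension before the names go into the index file and shows how an unstripped entry produces the file.xml.jpg path. The file names, the Images folder, and the image_path helper are assumptions for illustration, not the tutorial's code:

```python
import os

def strip_extensions(filenames):
    """Return file names with their last extension removed."""
    return [os.path.splitext(f)[0] for f in filenames]

def image_path(data_path, index, image_ext='.jpg'):
    """Build an image path by appending the image extension to the index."""
    return os.path.join(data_path, 'Images', index + image_ext)

annotation_files = ['fish_0001.xml', 'fish_0002.xml']  # hypothetical names
index_lines = strip_extensions(annotation_files)       # what train.txt should hold

bad = image_path('data', 'fish_0001.xml')   # index kept its extension
good = image_path('data', index_lines[0])   # index without extension
```

With the extension left in the index, the reader looks for fish_0001.xml.jpg and the existence check on the image path fails.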

deboc avatar Jun 13 '17 06:06 deboc

@deboc OK, thank you. I had written train.txt with the .xml extension, and that was causing the issue. One more question: the names in the annotation files and those declared in self._classes should match, right? I had written abbreviated names in self._classes while the full names were read from the annotation files, and it raised an error. I assume both should match. Sorry to bother you with silly questions.

indsak avatar Jun 13 '17 06:06 indsak

@deboc Can you help me, please? I am getting an error at

cls = self._class_to_ind[obj.find('name').text.lower().strip()]

in the newdataset.py file.

In self._classes I declared the names exactly as they appear in the annotation files, and they are not entirely lowercase. Is this the problem?

indsak avatar Jun 13 '17 07:06 indsak

Please study the tutorial applied to the INRIA dataset carefully before trying to use your own dataset.

The _class_to_ind lookup requires your annotations to match the index of the classes list (1 for pedestrian in the tutorial; forget about 0, it is used internally for background). So depending on your classes list, you should have integers as annotations. You can also hack newdataset.py instead to match your annotations:

cls = obj.find('name').text.lower().strip()

should do the trick.
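For annotations that store class names rather than integers, here is a minimal sketch of the lookup, assuming _class_to_ind is a plain dict from class name to index as in the stock pascal_voc.py. The class names and the XML snippet are hypothetical. Because the reader lowercases and strips the name before the lookup, the entries in the class list have to be lowercase too:

```python
import xml.etree.ElementTree as ET

# Hypothetical class list; index 0 is reserved for the background.
classes = ('__background__', 'tuna', 'red snapper')
class_to_ind = dict(zip(classes, range(len(classes))))

# Hypothetical annotation with mixed case and stray whitespace.
xml_text = """
<annotation>
  <object><name> Red Snapper </name></object>
  <object><name>Tuna</name></object>
</annotation>
"""
root = ET.fromstring(xml_text)

labels = []
for obj in root.findall('object'):
    # Normalize the name the same way as the line quoted in the thread.
    name = obj.find('name').text.lower().strip()
    labels.append(class_to_ind[name])
```

A class declared as 'Tuna' in the class list would never match, since the key reaching the dict is always 'tuna'.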

deboc avatar Jun 13 '17 14:06 deboc

@deboc I successfully started training, but got the following error:

line 118, in rpn_roidb
    if int(self._year) == 2007 or self._image_set != 'test':
AttributeError: 'fishclassify' object has no attribute '_year'

Any help on how to solve this?

indsak avatar Jun 14 '17 05:06 indsak

Thank you for your help.

indsak avatar Jun 14 '17 08:06 indsak

@deboc Any help with the problem below?

line 118, in rpn_roidb
    if int(self._year) == 2007 or self._image_set != 'test':
AttributeError: 'fishclassify' object has no attribute '_year'

indsak avatar Jun 14 '17 11:06 indsak

@indsak Check your script: TRAIN_IMDB="voc_2007_trainval" shows that a year like 2007 is expected. Alternatively, modify the code in lib/datasets/factory.py where you register your own dataset, and edit every part that uses the year in your own dataset reader (the one modeled on lib/datasets/pascal_voc.py).
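One way to apply the second option is sketched below, assuming the custom reader was copied from pascal_voc.py and simply never sets _year. FishClassify and uses_gt_roidb are hypothetical names; only the condition itself comes from the traceback above:

```python
class FishClassify(object):
    """Hypothetical stand-in for a custom imdb subclass."""

    def __init__(self, image_set, year=None):
        self._image_set = image_set
        # Defining _year (even as None) avoids the AttributeError.
        self._year = year

    def uses_gt_roidb(self):
        # Same condition as the failing line, guarded for datasets without a year.
        has_2007 = self._year is not None and int(self._year) == 2007
        return has_2007 or self._image_set != 'test'

train_set = FishClassify('train')
test_set = FishClassify('test')
voc_like = FishClassify('test', year='2007')
```

For a dataset with no year, the guard leaves the decision to the image set alone; dropping the year test from the condition entirely would have the same effect.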

zhenni avatar Jul 24 '17 19:07 zhenni