DPNs
Must the .rec file be generated with the DPNs repo's MXNet im2rec?
I used the new mxnet/tools/im2rec.py to produce the .rec file, but when I run the MXNet bundled with your DPNs, it fails with "Segmentation fault (core dumped)".
The MXNet provided in the DPN repo is a fairly old version, which may not be compatible with .rec files generated by a newer version. You can either regenerate the .rec files using the old MXNet [code], or add the customized data augmentations to the latest MXNet and use that instead.
*Note: I personally recommend moving to the latest MXNet, since it gives much faster training and testing. But be careful with the official MXNet's default data augmentations: it uses different strategies, which may lead to lower accuracy.
It always shows this error.
Have you followed my recommendations above?
There are numerous mistakes that can lead to a segmentation fault.
Could you provide me with more details?
@cypw Thanks for your answer. I am using the old MXNet to train.
@cypw But I also added the torch augmentations to the latest MXNet and ran make.
My steps were:

1. Copy image_aug_torch.cc to new_mxnet/src/io/
2. Rename image_aug_default.cc to image_aug_default.cc.bk
3. Change image_augmenter.h as follows:

```cpp
namespace mxnet {
namespace io {
/*! \return the parameter of default augmenter */
//std::vector<dmlc::ParamFieldInfo> ListDefaultAugParams();
std::vector<dmlc::ParamFieldInfo> ListTorchAugParams();
std::vector<dmlc::ParamFieldInfo> ListDefaultDetAugParams();
}  // namespace io
}  // namespace mxnet
#endif  // MXNET_IO_IMAGE_AUGMENTER_H_
```

4. Build:

```shell
make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1
```
It fails with the following errors:
```
src/io/image_aug_torch.cc:214:11: error: ‘cv::Mat mxnet::io::TorchImageAugmenter::Process(const cv::Mat&, mxnet::common::RANDOM_ENGINE*)’ marked ‘override’, but does not override
   cv::Mat Process(const cv::Mat &src,
           ^
src/io/image_aug_torch.cc: In lambda function:
src/io/image_aug_torch.cc:429:36: error: invalid new-expression of abstract class type ‘mxnet::io::TorchImageAugmenter’
     return new TorchImageAugmenter();
                                    ^
src/io/image_aug_torch.cc:141:7: note: because the following virtual functions are pure within ‘mxnet::io::TorchImageAugmenter’:
 class TorchImageAugmenter : public ImageAugmenter {
       ^
In file included from src/io/image_aug_torch.cc:11:0:
src/io/./image_augmenter.h:41:19: note: virtual cv::Mat mxnet::io::ImageAugmenter::Process(const cv::Mat&, std::vector<float>*, mxnet::common::RANDOM_ENGINE*)
   virtual cv::Mat Process(const cv::Mat &src, std::vector<float> *label,
                   ^
src/io/image_aug_torch.cc: At global scope:
src/io/image_aug_torch.cc:430:4: error: no matching function for call to ‘mxnet::io::ImageAugmenterReg::set_body(mxnet::io::<lambda()>)’
 });
  ^
In file included from /home/shipeng/mxnet/nnvm/include/nnvm/./base.h:13:0,
                 from /home/shipeng/mxnet/nnvm/include/nnvm/op.h:16,
                 from include/mxnet/base.h:33,
                 from src/io/image_aug_torch.cc:6:
/home/shipeng/mxnet/dmlc-core/include/dmlc/registry.h:165:21: note: candidate: EntryType& dmlc::FunctionRegEntryBase<EntryType, FunctionType>::set_body(FunctionType) [with EntryType = mxnet::io::ImageAugmenterReg; FunctionType = std::function<mxnet::io::ImageAugmenter*()>]
   inline EntryType &set_body(FunctionType body) {
                     ^
/home/shipeng/mxnet/dmlc-core/include/dmlc/registry.h:165:21: note:   no known conversion for argument 1 from ‘mxnet::io::<lambda()>’ to ‘std::function<mxnet::io::ImageAugmenter*()>’
Makefile:275: recipe for target 'build/src/io/image_aug_torch.o' failed
make: *** [build/src/io/image_aug_torch.o] Error 1
make: *** Waiting for unfinished jobs....
```
@shipeng-uestc If you are using the provided old MXNet @ 92053bd, please do the following tests to debug:

Step 1: Make sure your train.rec and val.rec are correct and your MXNet is good to go. This can be verified by simply running the testing code on your *.rec files (both train.rec and val.rec).

Step 2: Check that you have used the iterator correctly. Here is an example:
```python
import os
import mxnet as mx

# data iter
def get_data_iter(args, kv):
    mean_r = 124
    mean_g = 117
    mean_b = 104
    data_shape = (3, 224, 224)
    train = mx.io.ImageRecordIter(
        data_name          = 'data',
        label_name         = 'softmax_label',
        # ------------------------------------
        path_imgrec        = os.path.join(args.data_dir, "train.rec"),
        aug_seq            = 'aug_torch',
        label_width        = 1,
        data_shape         = data_shape,
        force2color        = True,
        preprocess_threads = 15,
        verbose            = True,
        num_parts          = 1,
        part_index         = 0,
        shuffle            = True,
        shuffle_chunk_size = 1024,
        shuffle_chunk_seed = kv.rank,
        # ------------------------------------
        batch_size         = args.batch_size,
        # ------------------------------------
        rand_mirror        = True,
        mean_r             = mean_r,
        mean_g             = mean_g,
        mean_b             = mean_b,
        scale              = 0.0167,
        seed               = kv.rank,
        # ------------------------------------
        rand_crop          = True,
        min_aspect_ratio   = 0.7500,
        max_aspect_ratio   = 1.3333,
        min_random_area    = 0.08,
        max_random_area    = 1.0,
        random_h           = 20,
        random_s           = 40,
        random_l           = 50,
        fill_value         = (mean_r, mean_g, mean_b),
        inter_method       = 2  # 1-bilinear 2-cubic 9-auto
    )
    val = mx.io.ImageRecordIter(
        data_name          = 'data',
        label_name         = 'softmax_label',
        # ------------------------------------
        path_imgrec        = os.path.join(args.data_dir, "val.rec"),
        aug_seq            = 'aug_torch',
        label_width        = 1,
        data_shape         = data_shape,
        force2color        = True,
        preprocess_threads = 4,
        verbose            = True,
        num_parts          = kv.num_workers,
        part_index         = kv.rank,
        # ------------------------------------
        batch_size         = args.batch_size,
        # ------------------------------------
        rand_mirror        = False,
        mean_r             = mean_r,
        mean_g             = mean_g,
        mean_b             = mean_b,
        scale              = 0.0167,
        seed               = 0,
        # ------------------------------------
        rand_crop          = False,
        min_random_area    = 0.765625,
        max_random_area    = 0.765625,
        fill_value         = (mean_r, mean_g, mean_b),
        inter_method       = 2  # 1-bilinear 2-cubic 9-auto
    )
    return (train, val)
```
As for moving to the latest MXNet, I haven't tried it yet. But it seems the newer MXNet added another input argument (i.e. std::vector<float> *label) to the Process function (#L59), so you may also need to add a std::vector<float> *label just like they did in image_aug_default.cc#L196.
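The "marked ‘override’, but does not override" errors in the build log come exactly from that signature mismatch. Below is a minimal, compilable sketch of the fix, using hypothetical stand-in types for cv::Mat and RANDOM_ENGINE (the real ones live in OpenCV and MXNet) to show why the extra label argument matters:

```cpp
#include <vector>

// Stand-ins for cv::Mat and mxnet::common::RANDOM_ENGINE, just for
// illustration -- not the real OpenCV / MXNet types.
struct Mat {
  int rows = 0, cols = 0;
};
struct RandomEngine {};

// Newer MXNet's base class declares Process with a label pointer
// (cf. image_aug_default.cc#L196), so a derived augmenter that keeps
// the old two-argument signature never overrides it and stays abstract.
class ImageAugmenter {
 public:
  virtual Mat Process(const Mat &src, std::vector<float> *label,
                      RandomEngine *prnd) = 0;
  virtual ~ImageAugmenter() = default;
};

// The fix: match the new signature exactly, including the label
// argument, so `override` compiles and the class becomes concrete.
class TorchImageAugmenter : public ImageAugmenter {
 public:
  Mat Process(const Mat &src, std::vector<float> *label,
              RandomEngine *prnd) override {
    (void)label;  // this augmenter does not touch the label
    (void)prnd;
    return src;   // pass-through for illustration only
  }
};
```

With the signature matched, `new TorchImageAugmenter()` is no longer a new-expression of an abstract class, so the registry lambda's return type converts to `ImageAugmenter*` and the `set_body` error disappears as well.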
@cypw Thank you very much. But why min_random_area = 0.765625? If I set it as follows:
rand_crop = False, min_random_area = 1, max_random_area = 1,
is it a center crop?
I fixed it to 0.765625 for evaluating a 224x224 center crop with all input images resized to short side = 256.
Note: min_random_area = max_random_area = 0.765625 = (224^2)/(256^2).
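The arithmetic can be checked directly; a quick sketch in plain Python, just reproducing the formula above:

```python
# Evaluate a 224x224 center crop on images whose short side has been
# resized to 256: the area ratio quoted above is (224^2)/(256^2).
crop_size = 224
short_side = 256
area_ratio = (crop_size ** 2) / (short_side ** 2)
print(area_ratio)  # 0.765625
```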
Yes. If you set min_random_area = max_random_area = 1 and rand_crop = False, then you are using a center crop with all input images resized to short side = input size = 224. In other words, the input will be a 224x224 center crop of a 224xN (or Nx224) image.
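A sketch of that geometry in plain Python (a hypothetical helper, not the MXNet implementation): after the image has been resized so its short side equals the crop size, the center crop spans the whole short side and is centered along the long one:

```python
def center_crop_box(width, height, crop=224):
    """Return (x0, y0, x1, y1) of a centered crop x crop box.

    Assumes the image has already been resized so that
    min(width, height) == crop.
    """
    x0 = (width - crop) // 2
    y0 = (height - crop) // 2
    return (x0, y0, x0 + crop, y0 + crop)

# e.g. an Nx224 image with N = 300: the crop covers the full height
# and a centered 224-wide window of the width
print(center_crop_box(300, 224))  # (38, 0, 262, 224)
```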
Thank you very much, it helps me a lot. I had always misunderstood min_random_area.
I still have a question. Now I want to ensemble three DPN-92 models. Some blogs I have read say this needs different training data, so I am using "unchanged = 1", "resize = 395", and "resize = 480" to produce three different train.rec files. The input size is 320x320. Do you have a better suggestion?
@shipeng-uestc
The last two *.rec files are unnecessary, since all resizing can be done inside the data iterator.
Actually, I don't quite understand why you need to train on the dataset at different scales for multi-model ensembling. Usually, multi-model ensembling happens after the training phase and only involves the validation and test sets.
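For reference, the usual post-training ensembling is simply averaging the per-class probabilities that each trained model produces for the same test image; a minimal sketch in plain Python (the softmax outputs below are made-up numbers, not real model predictions):

```python
def ensemble_average(prob_lists):
    """Average class probabilities across models.

    prob_lists: one list of class probabilities per model,
    all predicted for the same test image.
    """
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    return [sum(p[c] for p in prob_lists) / n_models
            for c in range(n_classes)]

# Three hypothetical DPN-92 softmax outputs for one image:
m1 = [0.7, 0.2, 0.1]
m2 = [0.6, 0.3, 0.1]
m3 = [0.5, 0.3, 0.2]
print(ensemble_average([m1, m2, m3]))
```

Because the averaging happens only at prediction time, the three models can come from different random seeds or checkpoints of the same training set; differently-resized training .rec files are not required.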
@cypw Thank you. I want to train three different DPN-92 models and then use all three to predict on the same test set, in order to get higher accuracy. In the ResNet paper, they use two ResNet-152 models at test time to get higher accuracy.