DPNs
Must the .rec file be generated with the DPNs repo's MXNet im2rec?
I used the new mxnet/tools/im2rec.py to produce the .rec file, but when I run the MXNet bundled with your DPNs, it fails with "Segmentation fault (core dumped)".
The MXNet provided in the DPN repo is a fairly old version, which may not be compatible with .rec files generated by a newer version. You can either regenerate the .rec files using the old MXNet [code], or add the customized data augmentations to the latest MXNet and use that instead.
*Note: I personally recommend moving to the latest MXNet, since it gives much faster training and testing. But be careful with the official MXNet's default data augmentations: it uses different strategies, which may lead to lower accuracy.
It always shows this error.
Have you followed my recommendations above?
There are numerous mistakes that can lead to a segmentation fault.
Could you provide me with more details?
@cypw Thanks for your answer. I am using the old MXNet to train.
@cypw But I also added the torch augmentations to the latest MXNet and ran make.
My steps were:

1. Copy image_aug_torch.cc to new_mxnet/src/io/
2. Rename image_aug_default.cc to image_aug_default.cc.bk
3. Change image_augmenter.h as follows:

```cpp
namespace mxnet {
namespace io {
/*! \return the parameter of default augmenter */
//std::vector<dmlc::ParamFieldInfo> ListDefaultAugParams();
std::vector<dmlc::ParamFieldInfo> ListTorchAugParams();
std::vector<dmlc::ParamFieldInfo> ListDefaultDetAugParams();
}  // namespace io
}  // namespace mxnet
#endif  // MXNET_IO_IMAGE_AUGMENTER_H_
```

4. Build:

```shell
make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1
```
It fails with the following errors:
```
src/io/image_aug_torch.cc:214:11: error: ‘cv::Mat mxnet::io::TorchImageAugmenter::Process(const cv::Mat&, mxnet::common::RANDOM_ENGINE*)’ marked ‘override’, but does not override
   cv::Mat Process(const cv::Mat &src,
           ^
src/io/image_aug_torch.cc: In lambda function:
src/io/image_aug_torch.cc:429:36: error: invalid new-expression of abstract class type ‘mxnet::io::TorchImageAugmenter’
     return new TorchImageAugmenter();
                                    ^
src/io/image_aug_torch.cc:141:7: note: because the following virtual functions are pure within ‘mxnet::io::TorchImageAugmenter’:
 class TorchImageAugmenter : public ImageAugmenter {
       ^
In file included from src/io/image_aug_torch.cc:11:0:
src/io/./image_augmenter.h:41:19: note: virtual cv::Mat mxnet::io::ImageAugmenter::Process(const cv::Mat&, std::vector<float>*, mxnet::common::RANDOM_ENGINE*)
   virtual cv::Mat Process(const cv::Mat &src, std::vector<float> *label,
                   ^
src/io/image_aug_torch.cc: At global scope:
src/io/image_aug_torch.cc:430:4: error: no matching function for call to ‘mxnet::io::ImageAugmenterReg::set_body(mxnet::io::<lambda()>)’
 });
  ^
In file included from /home/shipeng/mxnet/nnvm/include/nnvm/./base.h:13:0,
                 from /home/shipeng/mxnet/nnvm/include/nnvm/op.h:16,
                 from include/mxnet/base.h:33,
                 from src/io/image_aug_torch.cc:6:
/home/shipeng/mxnet/dmlc-core/include/dmlc/registry.h:165:21: note: candidate: EntryType& dmlc::FunctionRegEntryBase<EntryType, FunctionType>::set_body(FunctionType) [with EntryType = mxnet::io::ImageAugmenterReg; FunctionType = std::function<mxnet::io::ImageAugmenter*()>]
   inline EntryType &set_body(FunctionType body) {
                     ^
/home/shipeng/mxnet/dmlc-core/include/dmlc/registry.h:165:21: note:   no known conversion for argument 1 from ‘mxnet::io::<lambda()>’ to ‘std::function<mxnet::io::ImageAugmenter*()>’
Makefile:275: recipe for target 'build/src/io/image_aug_torch.o' failed
make: *** [build/src/io/image_aug_torch.o] Error 1
make: *** Waiting for unfinished jobs....
```
@shipeng-uestc If you are using the provided old MXNet @ 92053bd, please do the following tests to debug:

Step 1: Make sure your train.rec and val.rec are correct and your MXNet is good to go. This can be verified by simply running the testing code on your *.rec files (both train.rec and val.rec).

Step 2: Check that you have used the iterator correctly. Here is an example:
```python
import os
import mxnet as mx

# data iter
def get_data_iter(args, kv):
    mean_r = 124
    mean_g = 117
    mean_b = 104
    data_shape = (3, 224, 224)
    train = mx.io.ImageRecordIter(
        data_name          = 'data',
        label_name         = 'softmax_label',
        # ------------------------------------
        path_imgrec        = os.path.join(args.data_dir, "train.rec"),
        aug_seq            = 'aug_torch',
        label_width        = 1,
        data_shape         = data_shape,
        force2color        = True,
        preprocess_threads = 15,
        verbose            = True,
        num_parts          = 1,
        part_index         = 0,
        shuffle            = True,
        shuffle_chunk_size = 1024,
        shuffle_chunk_seed = kv.rank,
        # ------------------------------------
        batch_size         = args.batch_size,
        # ------------------------------------
        rand_mirror        = True,
        mean_r             = mean_r,
        mean_g             = mean_g,
        mean_b             = mean_b,
        scale              = 0.0167,
        seed               = kv.rank,
        # ------------------------------------
        rand_crop          = True,
        min_aspect_ratio   = 0.7500,
        max_aspect_ratio   = 1.3333,
        min_random_area    = 0.08,
        max_random_area    = 1.0,
        random_h           = 20,
        random_s           = 40,
        random_l           = 50,
        fill_value         = (mean_r, mean_g, mean_b),
        inter_method       = 2  # 1-bilinear 2-cubic 9-auto
    )
    val = mx.io.ImageRecordIter(
        data_name          = 'data',
        label_name         = 'softmax_label',
        # ------------------------------------
        path_imgrec        = os.path.join(args.data_dir, "val.rec"),
        aug_seq            = 'aug_torch',
        label_width        = 1,
        data_shape         = data_shape,
        force2color        = True,
        preprocess_threads = 4,
        verbose            = True,
        num_parts          = kv.num_workers,
        part_index         = kv.rank,
        # ------------------------------------
        batch_size         = args.batch_size,
        # ------------------------------------
        rand_mirror        = False,
        mean_r             = mean_r,
        mean_g             = mean_g,
        mean_b             = mean_b,
        scale              = 0.0167,
        seed               = 0,
        # ------------------------------------
        rand_crop          = False,
        min_random_area    = 0.765625,
        max_random_area    = 0.765625,
        fill_value         = (mean_r, mean_g, mean_b),
        inter_method       = 2  # 1-bilinear 2-cubic 9-auto
    )
    return (train, val)
```
As for moving to the latest MXNet, I haven't tried it yet. But it seems the newer MXNet added another input argument (i.e. std::vector<float> *label) to the Process function (#L59), so you may also need to add a std::vector<float> *label just like they did in image_aug_default.cc#L196.
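The "marked ‘override’, but does not override" errors in the build log come exactly from that signature mismatch. Below is a minimal, compilable sketch of the fix, using hypothetical stand-in types for cv::Mat and RANDOM_ENGINE (the real ones live in OpenCV and MXNet) to show why the extra label argument matters:

```cpp
#include <vector>

// Stand-ins for cv::Mat and mxnet::common::RANDOM_ENGINE, just for
// illustration -- not the real OpenCV / MXNet types.
struct Mat {
  int rows = 0, cols = 0;
};
struct RandomEngine {};

// Newer MXNet's base class declares Process with a label pointer
// (cf. image_aug_default.cc#L196), so a derived augmenter that keeps
// the old two-argument signature never overrides it and stays abstract.
class ImageAugmenter {
 public:
  virtual Mat Process(const Mat &src, std::vector<float> *label,
                      RandomEngine *prnd) = 0;
  virtual ~ImageAugmenter() = default;
};

// The fix: match the new signature exactly, including the label
// argument, so `override` compiles and the class becomes concrete.
class TorchImageAugmenter : public ImageAugmenter {
 public:
  Mat Process(const Mat &src, std::vector<float> *label,
              RandomEngine *prnd) override {
    (void)label;  // this augmenter does not touch the label
    (void)prnd;
    return src;   // pass-through for illustration only
  }
};
```

With the signature matched, `new TorchImageAugmenter()` is no longer a new-expression of an abstract class, so the registry lambda's return type converts to `ImageAugmenter*` and the `set_body` error disappears as well.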
@cypw Thank you very much. But why min_random_area = 0.765625? If I set it as follows:
rand_crop = False, min_random_area = 1, max_random_area = 1,
is it a center crop?
I fixed it to 0.765625 for evaluating a 224x224 center crop with all input images resized to short side = 256.
Note: min_random_area = max_random_area = 0.765625 = (224^2)/(256^2).
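The arithmetic can be checked directly; a quick sketch in plain Python, just reproducing the formula above:

```python
# Evaluate a 224x224 center crop on images whose short side has been
# resized to 256: the area ratio quoted above is (224^2)/(256^2).
crop_size = 224
short_side = 256
area_ratio = (crop_size ** 2) / (short_side ** 2)
print(area_ratio)  # 0.765625
```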
Yes. If you set min_random_area = max_random_area = 1 and rand_crop = False, then you are using a center crop with all input images resized to short side = input size = 224. In other words, the input will be a 224x224 center crop of a 224xN (or Nx224) image.
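A sketch of that geometry in plain Python (a hypothetical helper, not the MXNet implementation): after the image has been resized so its short side equals the crop size, the center crop spans the whole short side and is centered along the long one:

```python
def center_crop_box(width, height, crop=224):
    """Return (x0, y0, x1, y1) of a centered crop x crop box.

    Assumes the image has already been resized so that
    min(width, height) == crop.
    """
    x0 = (width - crop) // 2
    y0 = (height - crop) // 2
    return (x0, y0, x0 + crop, y0 + crop)

# e.g. an Nx224 image with N = 300: the crop covers the full height
# and a centered 224-wide window of the width
print(center_crop_box(300, 224))  # (38, 0, 262, 224)
```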
Thank you very much, it helps me a lot. I had always misunderstood min_random_area.
I still have a question. Now I want to ensemble three DPN-92 models. Some blogs I have read say this needs different training data, so I am using "unchanged = 1", "resize = 395", and "resize = 480" to produce three different train.rec files. The input size is 320x320. Do you have a better suggestion?
@shipeng-uestc
The last two *.rec files are unnecessary, since all resizing can be done inside the data iterator.
Actually, I don't quite understand why you need to train on the dataset at different scales for multi-model ensembling. Usually, multi-model ensembling happens after the training phase and only involves the validation and test sets.
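For reference, the usual post-training ensembling is simply averaging the per-class probabilities that each trained model produces for the same test image; a minimal sketch in plain Python (the softmax outputs below are made-up numbers, not real model predictions):

```python
def ensemble_average(prob_lists):
    """Average class probabilities across models.

    prob_lists: one list of class probabilities per model,
    all predicted for the same test image.
    """
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    return [sum(p[c] for p in prob_lists) / n_models
            for c in range(n_classes)]

# Three hypothetical DPN-92 softmax outputs for one image:
m1 = [0.7, 0.2, 0.1]
m2 = [0.6, 0.3, 0.1]
m3 = [0.5, 0.3, 0.2]
print(ensemble_average([m1, m2, m3]))
```

Because the averaging happens only at prediction time, the three models can come from different random seeds or checkpoints of the same training set; differently-resized training .rec files are not required.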
@cypw Thank you. I want to train three different DPN-92 models and then use all three to predict on the same test set, in order to get higher accuracy. In the ResNet paper, they use two ResNet-152 models at test time to get higher accuracy.