kaggle-dsb2-keras

Preprocessing speed is fast

Open tengpeng opened this issue 9 years ago • 2 comments

I am not sure whether it is appropriate to open a new issue to discuss preprocessing speed.

I noticed that the preprocessing is very fast. It is around 10x faster than my other preprocessing code on the same data. That's impressive. I am curious how you achieve that. Is there anything you avoid doing in your functions?

By the preprocessing stage I mean what happens in data.py and train.py.

tengpeng, Feb 17 '16 21:02

Since you didn't mention which other pre-processing techniques you currently use, I can't really say why my code outperforms yours :). Basically I use scipy and scikit-image libraries, and I guess it can perform even faster with OpenCV, but I can't offer any official benchmark for that claim.

jocicmarko, Feb 17 '16 22:02

https://github.com/dmlc/mxnet/blob/master/example/kaggle-ndsb2/Preprocessing.py

I guess the function get_data might be what causes the performance issue, but I am not sure about that. Maybe I need to add a timer to each function to test it. : )

tengpeng, Feb 18 '16 01:02
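
The per-function timing suggested in the last comment could be sketched roughly as follows. This is only an illustration, not code from either repository: `timed`, `report`, `load_frames`, and `resize_frames` are hypothetical names, and the two decorated functions are stand-ins for whatever the real preprocessing steps are.

```python
import time
from collections import defaultdict
from functools import wraps

# Accumulated wall-clock seconds and call counts, keyed by function name.
timings = defaultdict(float)
calls = defaultdict(int)


def timed(func):
    """Wrap func so that every call adds its duration to `timings`."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Recorded even if func raises, so failed calls are counted too.
            timings[func.__name__] += time.perf_counter() - start
            calls[func.__name__] += 1
    return wrapper


# Hypothetical stand-ins for the real preprocessing steps.
@timed
def load_frames(n):
    return [[float(i + j) for j in range(64)] for i in range(n)]


@timed
def resize_frames(rows):
    # Keep every second value, as a placeholder for an image resize.
    return [row[::2] for row in rows]


def report():
    """Print functions sorted by total time, slowest first."""
    for name, total in sorted(timings.items(), key=lambda kv: -kv[1]):
        print(f"{name:15s} {calls[name]:4d} calls  {total:.4f} s")


if __name__ == "__main__":
    resize_frames(load_frames(200))
    report()
```

Decorating each step in both code bases and comparing the two reports would show which function accounts for the difference. For a finer breakdown without editing the code, Python's built-in `cProfile` module is an alternative.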