MegaDepth
Does this have a Caffe2, MXNet, or TensorFlow version of the model?
The PyTorch model's performance is a little slow. Is there another version of the model available online?
I have a TensorFlow version of the model.
@CR-Ko did you convert it using ONNX?
No. I wrote my own converter to port the PyTorch model to a TF one.
Have you shared the TF model somewhere? Can you share it, please?
@CR-Ko Can you share the TF model? Thank you very much!
With the newest PyTorch and ONNX it is very easy to convert. I also tested the model with PyTorch's JIT feature and the speed improved by about 100%; using the C++ JIT interface it is even faster.
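For anyone following along, the export path being described is roughly the following (a sketch only; it assumes model is the MegaDepth model loaded as in the repo's demo.py, and the 384x512 input size used there):

import torch

# strip the DataParallel wrapper so export sees a plain nn.Module
netG = model.netG.module if hasattr(model.netG, "module") else model.netG
netG = netG.cpu().eval()

dummy = torch.randn(1, 3, 384, 512)  # assumed input size from demo.py
torch.onnx.export(netG, dummy, "megadepth.onnx",
                  input_names=["image"], output_names=["log_depth"])

The resulting megadepth.onnx can then be converted to a TensorFlow pb with onnx-tensorflow.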
@mlinxiang which versions do you use? I tried a few days ago with the latest versions with no success; the BatchNorm layer was converted with an error.
PyTorch 1.0
Here is the code and the converted TF pb model:
pytotf.zip, but the performance is not very good. I suggest using PyTorch 1.0's JIT feature; the speed is much better, as in the code below:
demo2cplusplus.py.zip
Note that when converting you need to modify the model code, because of DataParallel.
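For reference, the JIT path suggested above looks roughly like this on the Python side (a minimal sketch, not the contents of the attached zip; it assumes model is loaded as in the original demo.py):

import torch

# remove the DataParallel wrapper mentioned above; tracing needs the plain nn.Module
netG = model.netG.module if hasattr(model.netG, "module") else model.netG
netG = netG.eval()

example = torch.randn(1, 3, 384, 512)  # assumed input size from demo.py
traced = torch.jit.trace(netG, example)
traced.save("megadepth_traced.pt")  # this file can then be loaded from C++ via torch::jit::load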
@mlinxiang thank you for sharing TF model!
I've done everything as you described and also removed the .cuda() call. After that I load the ONNX model to check it:
onnx.checker.check_model(onnx_model)
I get an error on the BatchNorm layer during the check. Maybe it somehow depends on the OS or on specific library versions.
PyTorch 1.0, ONNX 1.3, macOS 10.14.1
I did everything under Ubuntu and everything went well. Maybe you can try in a Linux environment.
@mlinxiang have you had any success loading this model in OpenCV::DNN? There is a problem with the NHWC data_format being required.
@mlinxiang Hey, I don't know how you did the conversion, because when I run your script I get a different pb file: yours uses NCHW format, while mine comes out in NHWC (which can be run on mobile devices, e.g. via tflite or OpenCV::DNN). I compiled both onnx and onnx-tensorflow from source, so I probably have a newer version than you do.
@gordinmitya could you please share your pb (tflite) file? I'm trying to run this on mobile too but haven't had any success.
Have fun! https://github.com/CR-Ko/MegaDepth_Tensorflow
To speed things up, there are some minor changes:
- pre- and post-processing are merged into the model (see the sketch after this list).
- the input size is changed.
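To make the first point concrete: the numpy/torch pre- and post-processing from the original demo.py can be expressed as graph ops, so the exported model takes a raw image and returns a normalized inverse-depth map directly. A rough sketch (the function names here are illustrative, not the ones used in the repo):

import tensorflow as tf

def preprocess(image_uint8):
    # fold pre-processing into the graph: raw uint8 image scaled to 0..1 floats
    return tf.cast(image_uint8, tf.float32) / 255.0

def postprocess(log_depth):
    # fold demo.py's post-processing in as well: exp, invert, normalize to 0..1
    depth = tf.exp(tf.squeeze(log_depth))
    inv_depth = 1.0 / depth
    return inv_depth / tf.reduce_max(inv_depth)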
@CR-Ko Thanks a lot. I just ran your version on Python 2.7 using the demo image, but got this gradient image output instead. Am I missing something?
More info please. Have you tried Python 3?
I think you can just:
- git clone
- python3 inference_mega_tensorflow.py
@CR-Ko After switching to Python 3, it works properly. Now I can try porting it to iOS, maybe CoreML. You mentioned you had tested it on mobile. How did you run it? Did you convert to tflite or go via OpenCV::DNN?
Both CoreML and TensorFlow Lite converted successfully. The way I handle pre- and post-processing can speed up inference on CoreML a bit.
What kind of functionality do you want to implement on mobile?
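For the TensorFlow Lite side, a frozen pb can be converted with something along these lines (a sketch assuming TF 1.x; the file name, node names, and input shape are guesses, not necessarily the ones used here):

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_frozen_graph(
    "megadepth.pb",                    # path to a frozen graph (.pb)
    input_arrays=["Placeholder"],      # guessed input node name
    output_arrays=["module/div_1"],    # guessed output node name
    input_shapes={"Placeholder": [1, 240, 320, 3]},
)
open("megadepth.tflite", "wb").write(converter.convert())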
@CR-Ko I want to run an image through the CoreML model on iOS and see how good the performance is. My aim is to pass an image and get a depth map back. I'll try converting it to ONNX, then to CoreML. Can you tell me the input/output names you used for the graph, or do you have a sample of the CoreML file? I'm not sure how fast it will be on iOS devices, though.
One more question: I noticed that you loop 10 times over this part and I did not understand why. Can you elaborate on it?
depth = sess.run(mega_out, feed_dict={imag_pl: img})
- No need to use ONNX; it is enough to freeze the ckpt to a pb and then convert to CoreML and tflite.
- There are TensorFlow APIs that can get the input/output op names, or you can just print the names of all the ops (see the sketch after this list).
- Looping 10 times is just a quick way to measure the average inference time.
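As a rough illustration of those last two points (a sketch only; it assumes the sess, mega_out, imag_pl and img variables from inference_mega_tensorflow.py):

import time

# print every op so the input placeholder and the final output can be spotted
for op in sess.graph.get_operations():
    print(op.name)

# average the inference time over 10 runs
start = time.time()
for _ in range(10):
    depth = sess.run(mega_out, feed_dict={imag_pl: img})
print("average inference time:", (time.time() - start) / 10)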
@CR-Ko I already printed the names, but there are quite a lot:
Placeholder
scalar
Mul
0/conv2d/kernel/Initializer/random_uniform/shape
0/conv2d/kernel/Initializer/random_uniform/min
0/conv2d/kernel/Initializer/random_uniform/max
0/conv2d/kernel/Initializer/random_uniform/RandomUniform
0/conv2d/kernel/Initializer/random_uniform/sub
0/conv2d/kernel/Initializer/random_uniform/mul
0/conv2d/kernel/Initializer/random_uniform
0/conv2d/kernel
0/conv2d/kernel/Assign
0/conv2d/kernel/read
0/conv2d/bias/Initializer/zeros
.
.
.
save/Assign_618
save/Assign_619
save/Assign_620
save/Assign_621
save/Assign_622
save/Assign_623
save/restore_all
I guess the input is Placeholder and the output is save/restore_all. Sorry for asking so many questions, but I'm not very experienced with neural networks.
To freeze the model:
# run this before the inference call, with the variables already restored in the session
output_graph_def = tf.graph_util.convert_variables_to_constants(
    sess,
    sess.graph.as_graph_def(),
    ["module/div_1"]  # output node name
)
with tf.gfile.GFile("megadepth.pb", "wb") as f:
    f.write(output_graph_def.SerializeToString())
Then I tried to convert to CoreML with 'tf-coreml', with no success. Error:
NotImplementedError: Unsupported Ops of type: Pack
@AlexandrGraschenkov can you show the code for how you converted it with tf-coreml? What parameters did you pass for the output feature names?
I just picked those names up from memory and old code, so I'm not sure they are 100% correct.
Two things should be mentioned. First, I have provided two different kinds of ckpts:
- mega.ckpt: pre- and post-processing are handled outside the model (commented out in the inference code)
- mega_prepost.ckpt: pre- and post-processing are handled inside the model (this can make your inference on iOS faster and cleaner)
You can try:
- output_feature_names = ['module/div_1:0'] (for mega_prepost.ckpt)
- output_feature_names = ['module/4/conv2d/BiasAdd:0'] (for mega.ckpt)
- image_input_names = ['Placeholder:0']
The other thing is that the image type of the IO needs to be handled (google convert_multiarray_output_to_image); it's pretty straightforward.
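Putting those names together, the conversion call would look roughly like this (a sketch assuming tf-coreml and a frozen megadepth.pb built from mega_prepost.ckpt; the path and input shape are guesses):

import tfcoreml

mlmodel = tfcoreml.convert(
    tf_model_path="megadepth.pb",
    mlmodel_path="MegaDepth.mlmodel",
    input_name_shape_dict={"Placeholder:0": [1, 240, 320, 3]},  # guessed NHWC shape
    image_input_names=["Placeholder:0"],
    output_feature_names=["module/div_1:0"],  # use 'module/4/conv2d/BiasAdd:0' for mega.ckpt instead
)

The convert_multiarray_output_to_image step mentioned above is then applied to the produced spec so the output comes back as a grayscale image instead of a multi-array.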
Thanks for the details. I'll try it soon.
I managed to create the CoreML file and load it on iOS. The output returns a multi-array which, when I convert it to an image, is always black. I also modified the CoreML file to output a pixelBuffer instead, but got the same result. Either the prediction is not working properly or I am not converting the output the right way.
@mevinDhun If you look at demo.py, you will see the post-processing before the output:
pred_log_depth = model.netG.forward(input_images)
pred_log_depth = torch.squeeze(pred_log_depth)
pred_depth = torch.exp(pred_log_depth)
pred_inv_depth = 1/pred_depth
pred_inv_depth = pred_inv_depth/np.amax(pred_inv_depth) # normalization
As a result the pixel values are between 0 and 1. You need to scale them up to 255 and then create a UIImage from the data.
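In numpy terms the scaling step is simply (assuming pred_inv_depth has been pulled back into a numpy array with values in 0..1):

import numpy as np

gray = (pred_inv_depth * 255.0).astype(np.uint8)
# gray is now an 8-bit grayscale depth map that can be wrapped into a CGImage/UIImage on the iOS side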
@AlexandrGraschenkov yes, I know the pixel values are between 0 and 1. My issue is that when I get the output from the CoreML prediction, it is an MLMultiArray with the following shape:
(Double 1 x 1 x 1 x 240 x 320 array)
I tried converting this to a UIImage using CoreMLHelpers but always end up with a solid black image. Even after reshaping it to (1 x 240 x 320), I still have the same problem. I opened an issue here: https://github.com/hollance/CoreMLHelpers/issues/22
Have you been able to convert the CoreML output to a valid grayscale UIImage?
@mevinDhun did you try normalizing to 0..255? Because images on iOS are represented in that range of values.