
run with video sequence

Open Ravisik opened this issue 7 years ago • 17 comments

Hello!

Thanks for all this beautiful work.

I was wondering how to load an image directly from a numpy array (grabbed from a video using opencv for instance) instead of a ".jpeg" file.

The goal is to run detection directly on a video file.

    # Binary mode ('rb') so the raw JPEG bytes survive on Python 3.
    with tf.gfile.FastGFile(filename, 'rb') as f:
        image_data = f.read()
    # Convert any PNG to JPEG for consistency.
    if _is_png(filename):
        print('Converting PNG to JPEG for %s' % filename)
        image_data = coder.png_to_jpeg(image_data)

    image = coder.decode_jpeg(image_data)

I saw that these lines do the conversion from a .jpeg file to a TensorFlow-encoded image. Is there a way to encode a numpy array from OpenCV?

Thanks a lot, if I found something I will edit my post.

Ravisik avatar May 16 '17 23:05 Ravisik

Is the difficulty in your question about how to avoid reading files and read in-memory images instead, or about OpenCV? The way the ImageCoder is written, it's assuming PNG or JPEG, but it is fully evaluating the JPEG decoder graph to yield a Tensor using session.run(). This is easy to work around, since it's decoupled from the neural network graph. In that case, just don't do that; do your conversion instead. Please note that rude-carnie expects RGB data, so you might have to flip the bands (from BGR).
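That band flip is a one-liner in numpy — a minimal sketch, assuming the frame is the usual H×W×3 uint8 array OpenCV produces (the helper name is made up for illustration):

```python
import numpy as np

def bgr_to_rgb(frame):
    # Reverse the channel axis: OpenCV stores frames as BGR, while
    # rude-carnie (like most TF image code) expects RGB.
    return frame[:, :, ::-1]
```

This is equivalent to cv2.cvtColor(frame, cv2.COLOR_BGR2RGB), but with no OpenCV dependency on the TF side.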

Are you using opencv in C++ or Python? In cv2, imread (and I believe most other functions) already use numpy as common currency:

http://stackoverflow.com/questions/10417108/what-is-different-between-all-these-opencv-python-interfaces#10425504

If you are using C++, maybe something like this gist would be helpful:

https://gist.github.com/kyrs/9adf86366e9e4f04addb

dpressel avatar May 18 '17 13:05 dpressel

Thanks for your answer!

The difficulty was to load a numpy array (i.e. an image) already decoded from a video frame. I managed to solve this issue by encoding the numpy array into a buffer:


    img = cv2.imread(filename)

    encode_param = [int(cv2.IMWRITE_JPEG_QUALITY), 100]
    result, img_str = cv2.imencode('.jpg', img, encode_param)
    image_test = coder.decode_jpeg(img_str.tostring(order='C'))

image_test can be used directly by TensorFlow. This will be useful with a video stream, for instance.

Ravisik avatar May 18 '17 14:05 Ravisik

I managed to read video frames; however, I have a memory-leak problem.

Is there a way to optimize memory usage in TensorFlow? I did the same implementation in Caffe and the memory didn't increase constantly.

EDIT

After some research: memory leaks in TensorFlow can have various causes. I have not found the issue yet; if you have any idea that would be great :)

Thanks! Flo

Ravisik avatar May 22 '17 16:05 Ravisik

Are you by chance working out of a fork of rude-carnie that I can look at somewhere? This would help me understand what could be the issue. Otherwise, maybe you can post a gist and point me to it?

dpressel avatar May 23 '17 18:05 dpressel

Thanks for your response.

This a snippet of my implementation of rude carnie: https://gist.github.com/Ravisik/e482d27d03ee39d1a2e0c8323f8e6973

I didn't modify a lot of things: make_batch only converts a numpy array to an encoded JPEG.

The snippet of the main file will run classify for each bbox (detections made beforehand). For the moment, I use single-look instead of multiple crops because it uses less memory; however, RAM usage is increasing constantly!

If you have a clue that would be great! I think the problem is due to TensorFlow, maybe a bad implementation that increases the graph's size at every iteration?

Thanks a lot; Flo :)

Ravisik avatar May 23 '17 19:05 Ravisik

I think you are probably going to want to isolate the major steps in your while loop to help identify what is going on. What happens to your memory when you comment out lines 42 to 61? When you isolate it, you may want to separate the image-decode eval() line from the classifier session.run() line so you can comment out the session.run() independently. Also, what happens if you remove both the session.run() and the image batch eval() call?

Unfortunately, this snippet is only partial and I have no test data, but as the classifier graph should be more or less fixed, and since you are having to allocate image buffers and decode data within the while loop, I suspect it's not the classifier that is causing the problems.

dpressel avatar May 23 '17 19:05 dpressel

I did some tests.

If I only run make_batch, the memory increases as usual. Without it, the memory is stable.

Moreover, I tried running make_batch using both my method and yours; the result is the same: memory increases constantly at each iteration!

Ravisik avatar May 23 '17 20:05 Ravisik

Ok, good, we are getting somewhere! Lines 5 and 7 seem unnecessary -- if I'm not mistaken, you already have an RGB numpy array. The only thing you probably have to do here is resize (and not necessarily even that, since you could do it in numpy) and standardize_image. So I think you could rework it to not use the coder object at all. In that case, the only overhead would be if you continued to use standardize_image from TF.
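Both steps can stay outside of TF entirely — a numpy sketch, assuming standardize_image in utils.py behaves like TF's per-image standardization (zero mean, unit variance, with a floor on the stddev); this stand-in is for illustration, not the repo's code:

```python
import numpy as np

def standardize_image_np(img):
    # Numpy analogue of tf.image.per_image_standardization:
    # subtract the mean and divide by the stddev, but never divide by
    # less than 1/sqrt(num_pixels), which keeps tiny or constant
    # images numerically stable.
    img = img.astype(np.float32)
    adjusted_std = max(img.std(), 1.0 / np.sqrt(img.size))
    return (img - img.mean()) / adjusted_std
```

Resizing can likewise stay in OpenCV (e.g. cv2.resize(frame, (227, 227))), so no TF ops are touched inside the loop at all.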

dpressel avatar May 23 '17 20:05 dpressel

Even in the multi-look path you could probably use OpenCV, if you determine that TF's graph is the source of inefficiency there (of course, you probably want to isolate further before going to that trouble).

dpressel avatar May 23 '17 20:05 dpressel

Ok, I will try to feed numpy directly to TF! I am not used to it; how could I do that?

However, if I read images your way in make_batch, i.e.:

    filename = "C:\\Users\\test.jpg"
    with tf.gfile.FastGFile(filename, 'rb') as f:
        image_data = f.read()
    image = coder.decode_jpeg(image_data)

the result is the same, and memory will increase constantly.

Thanks a lot for your time :)

Ravisik avatar May 23 '17 20:05 Ravisik

feed_dict accepts numpy arrays for its placeholders. Just remember you still have a few TF normalization operations you need to do up front...

dpressel avatar May 23 '17 20:05 dpressel

Ok thanks!

I will see if I can find a solution for this "memory leak". If you find out why the make_batch function increases memory constantly, please tell me! :)

Ravisik avatar May 23 '17 20:05 Ravisik

@Ravisik Good day, sir! Is it possible to see your code for video? In return I can show you mine for face detection from video with TensorBox! :3

Gagazet avatar Jun 02 '17 08:06 Gagazet

@Gagazet hey! I've put a snippet further up the thread: https://gist.github.com/Ravisik/e482d27d03ee39d1a2e0c8323f8e6973

Basically, I just changed the way make_batch (in utils.py) reads the image. You just need to feed the classify function with a numpy array. However, I think it can be improved; if you find a more elegant way please tell me!

If you need more info, do not hesitate ;)

See you

Ravisik avatar Jun 02 '17 19:06 Ravisik

@Ravisik Thank you for it! I tried to do it with video, but it doesn't work for me :( Is that all you changed? :3 I am just starting in programming, sorry for the silly questions, but a week of work on it and no result for me :c

Gagazet avatar Jun 08 '17 11:06 Gagazet

Hey! If you tell me the error message, maybe we can see what's going on. However, this is just a snippet; if you drop this code in directly it will not work! You have to adapt it for your own use. We can discuss by mail maybe: [email protected]

Ravisik avatar Jun 08 '17 13:06 Ravisik

Hello, please see my problem, thanks: https://github.com/dpressel/rude-carnie/issues/102

ucasiggcas avatar Nov 27 '19 13:11 ucasiggcas