dlib dlib::video

This is an FFmpeg wrapper to capture any kind of video "thing", including video files (MP4, AVI, etc.), RTSP streams or webcams (e.g. /dev/video0). Note that using this file imposes an additional license, which is FFmpeg's LGPL v2.1 license. Also note that this adds a dependency on libavformat.(so/a), libavcodec.(so/a), libswresample.(so/a) and libswscale.(so/a). This is NOT ready yet. It's here as a placeholder for now so people can try it out, test, report bugs, etc.

Jan 12 '21 12:01 pfeatherstone

@davisking any ideas on how to do unit testing for this? I imagine doing the whole trick of compressing, base64-ing and inserting into a header file isn't sensible for an mp4 video. The tests will probably want to include videos with weird source formats, glitches, small to large dimensions, and probably with quite a few frames (probably > 1000). Also, this can't be tested on Travis because of the libavformat dependency. Unless there is a way of doing that which I'm not aware of.

Jan 14 '21 08:01 pfeatherstone

Yeah. Not sure how to get Travis to do that. Honestly I would stick a super tiny video, like 0.2 seconds, into base64 and use that. That’s enough to exercise a decent part of it. Like what part of the code you wrote can’t be tested that way?

I realize there are lots of formats but it’s not the code you have written that handles that stuff. I wouldn’t worry about trying to exhaustively test libavformat, just the code in dlib.

Jan 14 '21 12:01 davisking

Yeah, I get your point. We don't want to be unit-testing libavformat. That's hopefully already been done. I guess any length video can be used. There is nothing in the code that depends on time per se. Okidok. Now for the fun of choosing a 0.2s clip.

Jan 14 '21 13:01 pfeatherstone
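
The embedded-clip idea discussed above could look roughly like the following. This is only a sketch under stated assumptions: `decode_base64` and `write_file` are hypothetical helpers (not dlib's own base64 facilities, which a real test would probably use), and the payload in the usage note is a placeholder rather than a real video.

```cpp
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: decode standard base64 (RFC 4648 alphabet, '=' padding).
// Whitespace is skipped so the encoded clip can be wrapped over many lines.
std::vector<std::uint8_t> decode_base64(const std::string& in)
{
    auto value_of = [](char c) -> int {
        if (c >= 'A' && c <= 'Z') return c - 'A';
        if (c >= 'a' && c <= 'z') return c - 'a' + 26;
        if (c >= '0' && c <= '9') return c - '0' + 52;
        if (c == '+') return 62;
        if (c == '/') return 63;
        return -1;
    };

    std::vector<std::uint8_t> out;
    std::uint32_t buffer = 0; // holds the most recent bits read
    int bits = 0;             // number of not-yet-emitted bits in buffer
    for (char c : in)
    {
        if (c == '=') break;  // padding marks the end of the data
        if (c == '\n' || c == '\r' || c == ' ') continue;
        const int v = value_of(c);
        if (v < 0) throw std::invalid_argument("invalid base64 character");
        buffer = (buffer << 6) | static_cast<std::uint32_t>(v);
        bits += 6;
        if (bits >= 8)
        {
            bits -= 8;
            out.push_back(static_cast<std::uint8_t>((buffer >> bits) & 0xFF));
        }
    }
    return out;
}

// Hypothetical helper: dump the decoded clip to disk so the capture
// object under test can open it like any other file.
void write_file(const std::string& path, const std::vector<std::uint8_t>& bytes)
{
    std::ofstream out(path, std::ios::binary);
    if (!out) throw std::runtime_error("cannot open " + path);
    out.write(reinterpret_cast<const char*>(bytes.data()),
              static_cast<std::streamsize>(bytes.size()));
}
```

A test would call `write_file("tiny_clip.mp4", decode_base64(encoded_clip))`, where `encoded_clip` is the (assumed) string constant holding the clip, and then open that file with the capture object and check frame count and dimensions.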

The code also works out of the box with RTSP and camera devices. Those would be harder to test in a unit test. Unless you fancy hosting a dummy RTSP stream somewhere that will be active until the end of time.

Jan 14 '21 13:01 pfeatherstone

Yeah, would be neat, but I don't want to deal with that :)

Jan 14 '21 13:01 davisking

The natural follow-up from this is gonna be dlib::video_writer, which won't require much effort (it's similar API calls). So I think I will reinstate the following header hierarchy:

dlib/video_io.h
dlib/video_io/video_capture.h
dlib/video_io/video_writer.h

I could either shove dlib::video_writer in this PR or put it in its own PR. I prefer the latter. In any case, maybe worth getting the headers right.

Jan 14 '21 16:01 pfeatherstone

Yeah do it in a separate PR. More smaller PRs rather than few big ones is best :)

Jan 15 '21 03:01 davisking

I've added something so you can read the metadata of the video stream. You can then detect stuff like whether the video is rotated. Very useful. We could then correct for that rotation if required. Nice.

Jan 19 '21 13:01 pfeatherstone
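
As a rough illustration of the rotation check mentioned above, the sketch below assumes the stream metadata has already been copied into a `std::map` and that the rotation is exposed under a `rotate` key. Both are assumptions: the helper name is hypothetical, and how FFmpeg reports rotation depends on the container and the FFmpeg version.

```cpp
#include <exception>
#include <map>
#include <string>

// Hypothetical helper: return the rotation in degrees (0, 90, 180 or 270)
// described by a "rotate" metadata entry, or 0 if there is none.
int rotation_from_metadata(const std::map<std::string, std::string>& metadata)
{
    const auto it = metadata.find("rotate"); // key name is an assumption
    if (it == metadata.end()) return 0;

    int degrees = 0;
    try { degrees = std::stoi(it->second); }
    catch (const std::exception&) { return 0; } // unparsable value: treat as unrotated

    // Normalise into [0, 360) and snap to the nearest quarter turn.
    degrees %= 360;
    if (degrees < 0) degrees += 360;
    return ((degrees + 45) / 90 % 4) * 90;
}
```

The caller could then rotate each decoded frame by the returned angle before handing it to the user.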

At some point I will finish this. I'm using it all now and it works fine, but I don't quite have the time to push it over the edge and make it Davis-approved. And making CMake detect FFmpeg nicely and consistently on all platforms is going to be a ball-ache.

Feb 13 '21 14:02 pfeatherstone

We will be able to make this way more configurable than OpenCV's wrappers. Like we can set CRF values, GOP sizes, the list is endless. Haven't quite decided on how to provide a nice API. Probably just provide a std::vector<std::pair<string,string>> of options. Otherwise use some sensible defaults.

Feb 13 '21 14:02 pfeatherstone

Eh, don't do a stringly typed interface. That's like having void pointers. There has to be really clear user documentation saying exactly what can and can't be done. So unless FFmpeg already deals in this kind of stringly typed interface and you are just saying we forward that, it's not a great idea. Even then it's not super hot. I'm open to an argument to the contrary. But I've never seen a stringly typed interface that wasn't just an excuse to not define the interface.

Feb 13 '21 22:02 davisking

By the way I was talking about a video encoder, not dlib::video_capture. For dlib::video_capture the options are minimal and API can be fairly tight.

Feb 13 '21 23:02 pfeatherstone

For a video encoder on the other hand, most of the codec-specific options are forwarded by the libavformat API as strings. Like {"preset", "slow"} for H264. So however the dlib API handles options, it will have to forward them to libavformat as strings. And there are 100s of options depending on the codec (literally). So I think the easiest way to handle specific options is using strings and letting libavformat do the error handling if they are bad, which dlib can forward as exceptions. Or, we can do what OpenCV does, which is allow no options and have a minimal API. I don't like that. Or we go all out and allow the user to forward any codec-specific options. They would need to know what they are doing and read the FFmpeg documentation carefully. My use case, and I'm sure other people will have it too, is to be able to play with options. Like I want to be able to tweak the bitrate, the CRF value, whether it's lossless, the number of B-frames, etc. I think this level of tuning will set dlib apart from other libraries.

Feb 13 '21 23:02 pfeatherstone

Some options will be standard and common across all codecs, and therefore we can make them typed of course, even strongly typed if we can be bothered.

Feb 13 '21 23:02 pfeatherstone
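
One way to picture the hybrid interface described above (typed fields for the common options, string pairs for codec-specific ones) is sketched below. Everything here is hypothetical: the struct, its field names and `validate` are illustrative only and are not the API that was eventually written.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical options type: common settings are typed, while
// codec-specific settings stay as string pairs such as {"preset", "slow"}.
struct encoder_options_sketch
{
    int bitrate  = 0; // bits per second; 0 means "use the codec default"
    int gop_size = 0; // frames between keyframes; 0 means "use the codec default"
    std::vector<std::pair<std::string, std::string>> codec_options;
};

// Check only what the wrapper itself can know. The string options are not
// interpreted here: in a real wrapper they would be handed to FFmpeg
// (for instance through av_dict_set) and rejected there if invalid,
// with the failure reported back to the user as an exception.
void validate(const encoder_options_sketch& opts)
{
    if (opts.bitrate < 0)  throw std::invalid_argument("bitrate must be >= 0");
    if (opts.gop_size < 0) throw std::invalid_argument("gop_size must be >= 0");
    for (const auto& kv : opts.codec_options)
        if (kv.first.empty())
            throw std::invalid_argument("codec option with empty name");
}
```

The typed fields give compile-time checking for the handful of settings shared by all codecs, and the string list keeps the door open for the hundreds of codec-specific ones without dlib having to enumerate them.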

Yeah, that all sounds good then. :)

Feb 15 '21 13:02 davisking

Just realised, the video capture, which also works as an RTSP client, can work as an RTSP push server if you set some additional flags. Awesome! If I update the encoder to also include the container format (at the moment it's just the encoder) it can act as an RTSP push client.

Feb 24 '21 13:02 pfeatherstone

And I think the encoder can also work as an RTSP server if I make the update.

Feb 24 '21 13:02 pfeatherstone

Nice! Have you checked if you can open GIF animated files? With cv::VideoCapture you can read them as if they were a video file. If that works, we will also have animated GIF support in dlib! :D

Feb 24 '21 13:02 arrufat

I believe so yes.

Feb 24 '21 15:02 pfeatherstone

You can use the video_capture class to open JPEG files if you want. It's just you will only be able to read 1 frame. Then it closes. That thing will open pretty much anything. FFMpeg is awesome.

Feb 24 '21 15:02 pfeatherstone

Having support for RTSP client/server/push is a fantastic bonus in all of this.

Feb 24 '21 15:02 pfeatherstone

I don't know when I will be able to finish this. I have to use YUV at the moment, which dlib doesn't support. So I've taken this code and modified it to fit my purpose. At the moment, I don't have an incentive to finish this off properly as I'm not using it.

Feb 24 '21 15:02 pfeatherstone

Warning: this issue has been inactive for 35 days and will be automatically closed on 2021-04-10 if there is no further activity.

If you are waiting for a response but haven't received one it's possible your question is somehow inappropriate. E.g. it is off topic, you didn't follow the issue submission instructions, or your question is easily answerable by reading the FAQ, dlib's official compilation instructions, dlib's API documentation, or a Google search.

Apr 01 '21 08:04 dlib-issue-bot

I'm genuinely interested in this feature, but I don't think I'll have time soon to work on it, unfortunately :(

Apr 01 '21 09:04 arrufat

Yeah, me too. I mean it's very close to being ready. The biggest nuisance is going to be some Davis-approved CMake scripts for detecting FFmpeg, checking its version, etc. That won't be necessary for the dlib library itself, since this class is header only and it's up to the user to link to libavformat and so on, but for the dlib unit tests we want a good CMake script for FFmpeg. FFmpeg doesn't ship with a CMake script. So we might have to borrow the one in OpenCV or something. I never use CMake in my own projects, I just use NetBeans 8.2, so I explicitly set the linker flags. So doing all the CMake stuff isn't really my strong suit.

Apr 01 '21 10:04 pfeatherstone

I also have a video encoder (e.g h264, vp9) and video muxer (e.g h264 + mp4, h264 + rtsp) that are ready to go, but that will be for a future PR.

Apr 01 '21 10:04 pfeatherstone

Nah, some basic CMake option is presumably fine. Where this kind of thing goes off the rails is trying to make it work reliably on Windows, since Windows has no coherent conventions for linking to installed libraries. But I’m fine with just telling people it’s on them to link to it on Windows.

Apr 01 '21 12:04 davisking

Warning: this issue has been inactive for 35 days and will be automatically closed on 2021-05-16 if there is no further activity.

May 07 '21 08:05 dlib-issue-bot

Warning: this issue has been inactive for 42 days and will be automatically closed on 2021-05-16 if there is no further activity.

May 14 '21 08:05 dlib-issue-bot

I want this so badly, that I will try to make it work with CMake (never done that before, though) before it gets closed and falls into oblivion :P

May 16 '21 07:05 arrufat

I've added support for finding FFMPEG using PkgConfig in CMake. @pfeatherstone would you mind giving me write access to this PR?

May 16 '21 09:05 arrufat

Yep. Gimme a sec

May 16 '21 11:05 pfeatherstone

@arrufat I think you should have write-access now.

May 16 '21 11:05 pfeatherstone

What would be really cool is if you could specify where to look for FFmpeg. If using pkg-config under the hood, I imagine using PKG_CONFIG_PATH will suffice.

May 16 '21 11:05 pfeatherstone

Yes, I already pushed a PR: https://github.com/pfeatherstone/dlib/pull/1

May 16 '21 11:05 arrufat

Wow, a PR within a PR

May 16 '21 11:05 pfeatherstone

I merged. I'm not an authority on cmake I'm afraid. But I do think the API can be improved a lot. I worked for a while on a different version of this. I would be keen to introduce those changes. The main things would be to optionally read audio frames. That might require a well thought-out dlib type. I added some more run time checks and more constructor arguments to cater for decoder options (both video and audio), format/demuxer options and protocol options (if using a raw TCP muxer for example). We can either introduce all these later or do it now. If we do it now, the API will be slightly more fixed. If we do it later, this PR will get passed sooner but will inevitably lead to API changes at a later stage which @davisking is really not keen on.

May 16 '21 12:05 pfeatherstone

It might be easier to merge what we have now and do bite-size increments. But we might need to add something that says: "This is an unstable API. If you don't like it, tough". This could be a compiler warning, for example, to make it absolutely clear.

May 16 '21 12:05 pfeatherstone

Or we namespace the incremental versions. This could be in namespace dlib::video_io::v1, and later versions which introduce breaking API changes could be in namespace dlib::video_io::v2 etc... I don't like this at all. I would rather have a compiler warning that says this is an unstable API and leave everything in the dlib namespace. It depends on what guarantees we want to impose on the API.

May 16 '21 13:05 pfeatherstone

Don’t worry about API versioning. I care about API stability in proportion to the age of the API, since that’s very correlated with the number of users. Like the question is always how many people will be impacted by a change and how difficult will it be for them to update. Moreover, some breaking changes are super easy for users to update and some are not. Like if the result is only direct users get a build error and how to update their code is manifestly obvious then that’s fine.

On the other hand, changes that silently cause runtime faults are not super great.

May 16 '21 21:05 davisking

One way we could tackle this is how GCC adds new C++ features to the standard library: they put them under the std::experimental namespace, so maybe we could add this stuff under dlib::experimental and move it to the main namespace later... Not sure it's a good idea, though...

Also, @pfeatherstone how am I supposed to open the webcam stream using this PR? I played a bit with it yesterday, but couldn't find a way...

May 17 '21 02:05 arrufat
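
The experimental-namespace idea floated above could be sketched as follows. The `dlib_sketch` namespace and the toy `video_capture` class are placeholders invented for illustration; they stand in for whatever the real class ends up looking like.

```cpp
#include <string>

namespace dlib_sketch
{
    namespace experimental
    {
        // Placeholder class: users spell out "experimental" in their own code,
        // which makes it obvious that the API may still change.
        class video_capture
        {
        public:
            bool open(const std::string& source)
            {
                source_ = source;
                return !source_.empty();
            }
            const std::string& source() const { return source_; }
        private:
            std::string source_;
        };
    }

    // Once the API has settled, a using-declaration can promote the class into
    // the main namespace without breaking code that still names the old location:
    // using experimental::video_capture;
}
```

Code written against `dlib_sketch::experimental::video_capture` would keep compiling after promotion, so the move itself would not be a breaking change.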

It's been a while since I've used this particular object, but I thought it was something like:

dlib::video_capture cap;
cap.open("/dev/video0");
...

May 17 '21 06:05 pfeatherstone

That's exactly what I tried. There's a second boolean parameter is_rtsp, but I got errors no matter what I set it to:

can't open '/dev/video0' error : Invalid argument

May 17 '21 06:05 arrufat

I'll have a look at some point. Can't guarantee when though. It's possible this object is out of date. I haven't added nearly enough runtime checks with useful print statements. I'll compare at some point with what I've been using recently which definitely does work with capture devices.

May 17 '21 07:05 pfeatherstone

With regards to your error, I think that's an FFmpeg thing. Can you build FFmpeg from source using version v4.3.2 and try again? Maybe the v4l2 muxer/demuxer isn't enabled in your version of FFmpeg. These are the kind of things we want to check for at runtime to give the user good error messages. It's failing on avformat_open_input(), which is the first thing the implementation details should run, which they do, and it is doing so with the right arguments. Given that the error returned by libavformat is Invalid argument, I'm willing to bet it's got something to do with the, possibly old, installation of FFmpeg.

May 17 '21 11:05 pfeatherstone

I need to make some updates if we want this to work with FFmpeg v4.4 onwards, otherwise you will get compiler warnings since FFmpeg has deprecated av_init_packet() (and other function calls I'm currently using in this object). Instead you have to use av_packet_alloc(). So yeah, there are a whole bunch of updates I need to port, which I took care of in my local version but haven't had time to merge into this dlib version. Again, I don't know when I'm gonna have time to finish this properly, since a dlib version is no use to me because I have to work with YUV.

May 17 '21 11:05 pfeatherstone

Going back to installing FFmpeg from source, I would install it to a local directory, not system wide. A whole bunch of things depend on the libav libraries, and if they link to new libav libraries with subtle API changes, everything will break. Like, if you've installed OpenCV using sudo apt install libopencv-dev (or whatever), that will be linking to the default libav libraries that ship with your distro. If you overwrite those, then it will very likely break. So I've learnt to always install fresh libav libraries locally and carefully link to those instead of those provided by apt or yum. This is not a new problem, this is the case with pretty much every compiled library. But be warned, I had OpenCV segfault and couldn't understand why, and it was because it was linking to FFmpeg v4.3.2 when it was built with FFmpeg v<something very small>

May 17 '21 11:05 pfeatherstone

Or use a decent C++ package manager like vcpkg, conan, hunter or whatever. But I've never used them so can't say for sure if that's what you want. You're probably thinking, "I thought this would make my life easier, but rather than linking against bloated OpenCV, I now need to worry about linking to a good version of libavformat. Have I made any progress..." The answer is yes, coz soon you will have a nice script for building and installing FFmpeg from source with the right options you want, AND you will be able to statically link everything, AND you won't be linking against 100 shared libraries you didn't know about, or care about, AND you will celebrate.

May 17 '21 12:05 pfeatherstone

Furthermore, you can tailor your FFmpeg build to only include the exact set of encoders, decoders, muxers, demuxers, protocols, filters and devices you strictly need for your app. I did this, and the resulting set of libav static libraries was tiny. After stripping symbols, I ended up with a 1MB binary, which these days is pretty small. So this gives you quite a lot of configurability, provided you know your way around building FFmpeg.

May 17 '21 12:05 pfeatherstone

Thanks for your explanation, and yes, don't worry, I won't mess my system with this stuff. I have FFmpeg 4.4 from the official Arch Linux repositories, and V4L support. I will try with 4.3 later.

May 18 '21 00:05 arrufat

If it still doesn't work, you can try calling avformat_open_input() on its own and check it passes. That's the very first thing that should be called.

May 18 '21 06:05 pfeatherstone

Also, try calling avdevice_register_all(); in main() somewhere at the start. This is a global initialization function for libavdevice. Sometimes this is required. I've only ever had to call this to enable ALSA stuff, which is used for audio. But it's possible that you need to call that in your environment to enable V4L.

You will have to add the following:

extern "C" {
#include <libavdevice/avdevice.h>
}

May 18 '21 10:05 pfeatherstone

If it turns out that is THE fix, then maybe we need to add some static initialization in the dlib wrapper. And at the same time, add the other global initialization functions. They are largely deprecated, but could still be required for older versions of libav libraries.

May 18 '21 10:05 pfeatherstone

Great, I will try that later, thank you for looking into this!

May 19 '21 09:05 arrufat

@pfeatherstone, I can open the webcam using those lines. Thank you. I am not familiar with the FFmpeg API, though (only with the command line program), so I will have to study it a bit more to be able to do something useful with it...

May 21 '21 03:05 arrufat

Does the dlib object work now? Your error was a failure in avformat_open_input(), which is the very first thing dlib calls. So it should work, right?