Video via `gstreamer-rs` or ffmpeg
The current API I've sketched out looks like the following.
pub enum LoopMode {
    Normal,
    Reverse,
    Palindrome,
}

pub struct VideoPlayer {}

impl VideoPlayer {
    /// Constructs a new video player
    pub fn new() -> Self {
        VideoPlayer {}
    }
    /// Loads a video from disk, takes a string representing the absolute path to where the video is located
    pub fn load(&self, path: String) {
        unimplemented!();
    }
    /// Loads a video from the internet, takes a string representing the url link to where the video is hosted online
    pub fn load_url(&self, path: String) {
        unimplemented!();
    }
    /// Starts the playback of the video
    pub fn play(&self) {
        unimplemented!();
    }
    /// Pauses the current playback of the video
    pub fn pause(&self) {
        unimplemented!();
    }
    /// Stops playback of the currently playing video
    pub fn stop(&self) {
        unimplemented!();
    }
    /// Closes and frees up the video player from memory
    pub fn close(&self) {
        unimplemented!();
    }
    /// Defines the playback mode of the video.
    /// 1. either loops forwards from the beginning of the video,
    /// 2. plays the video in reverse,
    /// 3. plays the video from the start and then plays back in reverse once the video has reached its final frame
    pub fn set_loop_mode(&self, loop_mode: LoopMode) {
        unimplemented!();
    }
    /// Sets the volume of the video. Takes a normalized value between 0.0 and 1.0
    pub fn set_volume(&self, volume: f32) {
        unimplemented!();
    }
    /// Sets the playhead position of the video as a percentage of its total length.
    /// e.g. a value of 0.5 sets playback to 50% through the current length of the video
    pub fn set_position(&self, pct: f32) {
        unimplemented!();
    }
    /// Sets the playhead position to a specific frame
    pub fn set_position_frames(&self, frame: f32) {
        unimplemented!();
    }
    /// Returns the current percentage through the video relative to the video's playhead position
    pub fn position(&self) -> f32 {
        unimplemented!();
    }
    /// Returns the current frame of the video's playhead position
    pub fn current_frame(&self) -> f32 {
        unimplemented!();
    }
    /// Returns true if the video has finished playing through every frame in the currently loaded video, otherwise returns false
    pub fn is_finished(&self) -> bool {
        unimplemented!();
    }
}
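For context, a rough sketch of how this API might be used (the path and values are just placeholders):
fn main() {
    let player = VideoPlayer::new();
    player.load(String::from("/path/to/video.mp4"));
    player.set_loop_mode(LoopMode::Palindrome);
    player.set_volume(0.8);
    player.play();
    // Later: jump to the halfway point and report progress.
    player.set_position(0.5);
    println!("{:.0}% of the way through", player.position() * 100.0);
    player.stop();
    player.close();
}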
Currently blocked on #208 in order to determine the best option for interop with vulkan framebuffers.
Hey, is it possible to call close() in the drop impl so that you can't forget? RAII.
Yep, that's a great idea @freesig.
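A rough sketch of that idea, so it isn't lost: implement Drop so the player is closed automatically even if close() is never called explicitly.
impl Drop for VideoPlayer {
    fn drop(&mut self) {
        // Ensure resources are released even if the user forgets to call close().
        self.close();
    }
}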
As part of this problem we need a way to get the frames from the vulkan pipeline. I need this upstream for a project, but I thought it might be cool to port the Sascha Willems screenshot example to a nannou vk_example. He basically just copies the frame to a vkImage that is in linear tiling.
I'm not sure if this is the most efficient way to get data back from the gpu though.
Other options are:
- Using a ring buffer of images
- Using a ring buffer of buffers. I'm not sure if this is possible though, because you need to get it out of `OptimalTiling`, otherwise it's going to be useless data
- One of the above approaches combined with #279
I'm going to start with a vk_screenshot example, then make a vk_screenshot_sequence example where I can benchmark the different approaches. This is one of the few vulkan things we can benchmark due to it being a round trip.
For vk_screenshot_sequence I will try to implement something similar to this that allows holding down a screenshot button to record a simple av1 video.
Hey @freesig, maybe you would be interested in this issue I opened about Vulkan support on the Gstreamer Gitlab.
Looks like Vulkan support needs to be added into Gstreamer itself first and then bindings can be generated for us to call from Nannou. Would be awesome to get this going and we might actually need this for another upstream commercial project soon.
It should be possible to grab the frame from the gpu pipeline and then pass it to Gstreamer without Gstreamer knowing anything about vulkan. I guess if Gstreamer has direct access to the vulkan pipeline it could avoid a copy from your application into Gstreamer, but I don't fully understand how that works yet. For these examples I'm just going to focus on getting the frame from the gpu pipeline to application memory so we can do whatever we want with it. That should be useful at least for screenshots and gifs / simple short videos for Instagram.
Apart from satisfying the above API in a fast way, the other requirements are:
Playback
- Needs to support vulkan in the following ways
- Decode video and audio from a compressed format and play directly to a vulkan context. The frame is never copied into nannou's memory and instead goes straight to the gpu. This would be the most efficient way to play video, but you couldn't modify it.
- Video frames are sent directly to a vulkan buffer so we can use video in our pipeline. This would allow us to add effects etc.
- Video frames are sent to application memory. This doesn't have anything to do with vulkan because we have access to the raw buffer and can simply copy into a vulkan buffer if we need to. But it's slower because there's an additional copy from the streaming library's memory to our application's memory.
Record
- Access the vulkan swapchain directly so that it can be copied directly to the streamer's memory and then saved to disk or streamed over a network.
Ideas
I'm a bit confused here as to why the streaming library needs to support vulkan. In the above use cases could we not just access the memory directly from the streamer, like:
cpu_buffer_pool.next(my_cool_streamer.current_frame())
// Where current_frame() returns &[[u8; 4]]
This way there is no additional copy into our memory. The same could be done with recording. Am I missing something here?
Perhaps it's worth trying to organise a meeting sometime with the rust-av developers to see where they're at, their current progress, where they see the project going, etc.? This might help us to better learn the boundaries at which we can expect to interface with decoders/encoders for reading/writing video files, and we might also get a chance to express our use case to them.
That's a good idea. The only thing is it doesn't look like vulkan / vulkano is on their radar at the moment. But maybe they are just focusing on the encode / decode steps and not the rendering.
I'm going to keep posting in this issue with the challenges I'm hitting with screenshots, because I think being able to get hold of the frame data is a vital step in this process, but I'm happy to open another issue if you think it's a bit off topic. I know that @MacTuitui is pretty interested in having this ability. I assume they are rendering out their sketches to video in 0.8 already.
Progress so far
- [x] Take a screenshot and save it.
- [x] Currently using the `CpuAccessibleBuffer` to copy the frame from the swapchain image, which is apparently slow, but I can't for the life of me figure out how to read from a `CpuBufferPool`. Ok, I guess this answers my question:
// TODO: Add `CpuBufferPoolSubbuffer::read` to read the content of a subbuffer.
// But that's hard to do because we must prevent `increase_gpu_lock` from working while
// a buffer is locked
- [ ] The other option would be to use an image in linear tiling, although vulkano does not expose this and from what I've read it is slower than a buffer. It would be good to test this though.
- [ ] Blit the colours because they are currently BGR instead of RGB. I think I need to check if blit is supported and then do a blit, or fall back to a copy and manually swap the colour channels. Also, blitting requires an image in linear tiling.
- [ ] It might mean we need to write our own buffer type optimized for this.
As it might be relevant (if anyone wants to create any video from nannou apps), my workflow is as follows so far:
At the end of view() I have the following:
if app.elapsed_frames() > START_FRAME {
    let image: nannou::glium::texture::RawImage2d<u8> = app.main_window().read_front_buffer();
    let image = nannou::image::ImageBuffer::from_raw(image.width, image.height, image.data.into_owned()).unwrap();
    let image = nannou::image::DynamicImage::ImageRgba8(image).flipv();
    image.save(format!("frame-{:04}.png", app.elapsed_frames() - START_FRAME)).unwrap();
}
if app.elapsed_frames() >= LENGTH_FRAME + START_FRAME {
    exit(0);
}
I then use ffmpeg to encode to a "suitable size":
ffmpeg -r 60 -i frame-%04d.png -pix_fmt yuv420p -s 1024x1024 190514.mp4
The issue I have right now is that the image crate does not support tiff (or I could not figure out how to save to tiff), so there's a huge hit in performance by compressing to png. This implies that all animations must use app.elapsed_frames() as the master clock, and that you can't have interactive elements while recording (but to be fair, I am not aiming at real-time stuff most days, so it's slow from the start). Less than ideal, but no surprises in terms of quality: you get the whole frames, lossless PNGs, in order. I'd rather have a way to get lossless images fast (that might require post-processing) than a way to save to video directly but with less quality.
What I was doing in the past (with my custom Scala framework based on Processing), and might end up trying here as well, is to split the convert-to-image part of the process across multiple threads (queue the images and process them in subthreads).
And of course, the bottleneck is, as stated before, the way to get hold of the frame data (here done by read_front_buffer()), which is not available in v0.9. So for me I'm kind of stuck until I have a way to export videos from v0.9.
Nice one @MacTuitui! This is pretty much the workflow I want as well. It would be nice to be able to choose from multiple different formats (mp4, InstagramVideo, InstagramImage, YouTube, mov, png, tiff, jpg, etc.).
Ideas:
- You could run the saving on another thread so it doesn't block your realtime stuff (I will do this in the example to demonstrate). However, this might mean that you're submitting frames to save at a much higher rate than they can be processed (see the sketch after this list).
- Ultimately it would be better to have the video processing in nannou so that there's no png step required, but there is a bit of work to get to this stage.
- With vulkan you will need to do the copies in a graphics command buffer and not in the view. This means that the frame won't be available till the draw is completed (the next frame). I've been reading from the buffer (equivalent to `read_front_buffer()`) in the `update()`.
- It looks like the image crate currently supports ico, jpg, jpeg, png, ppm and bmp. I'm not an expert on image storage but I think maybe bmp might not be compressed.
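A minimal sketch of the save-on-another-thread idea from the first bullet, assuming RGBA frame data, the `image` crate, and hard-coded dimensions purely for illustration (the `spawn_frame_writer` helper is hypothetical, not part of nannou):
use std::sync::mpsc;
use std::thread;

// Spawns a worker that encodes and saves frames so the realtime loop isn't
// blocked by PNG compression. Frames are sent as (frame_index, rgba_bytes).
fn spawn_frame_writer() -> mpsc::Sender<(u64, Vec<u8>)> {
    let (tx, rx) = mpsc::channel::<(u64, Vec<u8>)>();
    thread::spawn(move || {
        for (index, rgba) in rx {
            // Dimensions are assumed here; in practice send them along with the frame.
            let (w, h) = (1024u32, 1024u32);
            if let Some(img) = image::RgbaImage::from_raw(w, h, rgba) {
                img.save(format!("frame-{:04}.png", index)).ok();
            }
        }
    });
    tx
}
Note that an unbounded channel will simply grow if frames arrive faster than they can be encoded; a bounded `mpsc::sync_channel` would apply back-pressure instead.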
I should have an example up by the end of today that you can use to at least get a similar workflow to this in v0.9, and then we can go from there to make it easier / faster / support more formats.
I've hit a bit of a bottleneck with this. All the image / video encoding libraries I've looked at like to take a frame as a &[u8] or something similar. Basically a slice of data. This is fine if your pipeline is in the correct format because you can just expose the image or buffer as a slice.
However, if you need to process it in any way, e.g. u16 -> u8 or BGR -> RGB, then you need to create an iterator, and Rust forces you to loop through the whole frame data and collect it so that you can pass it into the function as a slice.
Does anyone know of any (possibly unsafe) way to get around this problem?
Some libraries let you take a ColorType and presumably do this conversion in some efficient way. Unfortunately the PNG library doesn't support BGR.
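One way around the extra collect, assuming you have mutable access to the raw frame bytes and 4 bytes per pixel, is to swap the channels in place and then hand the same buffer to the encoder as a slice (this helper is just an illustration, not from any of the libraries mentioned):
// Swap B and R in place so a BGRA frame can be passed to an RGBA encoder
// without allocating or collecting into a new buffer.
fn bgra_to_rgba_in_place(frame: &mut [u8]) {
    for px in frame.chunks_exact_mut(4) {
        px.swap(0, 2);
    }
}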
After playing with this all day, I think it might be better to just write to a storage buffer and do the conversion in the fragment shader. This will at least make it possible to use ffmpeg with a yuv format, which should be fast.
After experimenting with this problem over the last few days, I believe the best option is to move forward with gstreamer integration.
Reasons
- There is no simple / fast way that I could find to encode video from a raw frame. (The closest I got was `ffmpeg rawvideo`, but the results were pretty bad and slow; a rough sketch of that approach follows this list.)
- RustAV looks to be in very early stages and won't be usable for a very long time.
- Colour space conversions and encoding are really complex. There is basically an implementation per codec and only some support gpu acceleration. Gstreamer already has this included and is actively developed, keeping up with anything new that comes out.
- Gstreamer is the standard. Before I started this issue I had no idea how complex video streaming was and the amount of work that has gone into it. It would be wasteful to try and make our own implementation.
- Gstreamer is agnostic to everything so it doesn't tie us into opengl.
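For reference, the `ffmpeg rawvideo` approach mentioned in the first bullet boils down to something like the following sketch: spawn ffmpeg as a child process and pipe raw RGBA frames into its stdin. The resolution, framerate and output name are placeholders, and as noted above this turned out slow in practice.
use std::io::Write;
use std::process::{Command, Stdio};

// Spawn ffmpeg reading raw RGBA frames from stdin and encoding to mp4.
fn spawn_ffmpeg() -> std::process::Child {
    Command::new("ffmpeg")
        .args(&[
            "-f", "rawvideo",
            "-pix_fmt", "rgba",
            "-s", "1024x1024",
            "-r", "60",
            "-i", "-",            // read raw frames from stdin
            "-pix_fmt", "yuv420p",
            "out.mp4",
        ])
        .stdin(Stdio::piped())
        .spawn()
        .expect("failed to spawn ffmpeg")
}

// Push one frame's worth of raw RGBA bytes to the encoder.
fn write_frame(child: &mut std::process::Child, rgba: &[u8]) {
    child
        .stdin
        .as_mut()
        .expect("ffmpeg stdin not captured")
        .write_all(rgba)
        .expect("failed to write frame to ffmpeg");
}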
How to move forward
- [ ] Implement a gstreamer example that saves raw frames to a chosen video format (this is possible, see this; a minimal pipeline sketch follows this list)
- [ ] Implement gstreamer example that decodes to raw frames from a chosen video format
- [ ] Once gstreamer gets proper vulkan support then we can tie it more into our pipeline. There is already a vulkan sink though.
- [ ] Contribute to gstreamer vulkan integration
- [ ] Write some clear docs on which formats to use for video. Some formats on certain systems support hardware encode/decode, which makes them way faster than cpu encode/decode, so we need a way to help nannou users make the right choice. Perhaps we could even have a query function that returns available formats on a system ranked for speed or quality etc.
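As a starting point for the first checklist item, a minimal gstreamer-rs sketch might look something like the following. It encodes a test pattern rather than nannou frames (swapping `videotestsrc` for an `appsrc` fed with raw frames is the real goal), and exact method names vary a bit between gstreamer-rs versions.
use gstreamer as gst;
use gst::prelude::*;

fn main() {
    gst::init().expect("failed to initialise gstreamer");
    // Describe the pipeline as a string and let gstreamer build it.
    let pipeline = gst::parse_launch(
        "videotestsrc num-buffers=300 \
         ! video/x-raw,width=1024,height=1024,framerate=60/1 \
         ! videoconvert ! x264enc ! mp4mux ! filesink location=out.mp4",
    )
    .expect("failed to parse pipeline");

    pipeline.set_state(gst::State::Playing).expect("failed to start pipeline");

    // Block until the stream finishes or errors out.
    let bus = pipeline.get_bus().expect("pipeline has no bus");
    bus.timed_pop_filtered(
        gst::CLOCK_TIME_NONE,
        &[gst::MessageType::Eos, gst::MessageType::Error],
    );

    pipeline.set_state(gst::State::Null).expect("failed to stop pipeline");
}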
Hey guys, I just stumbled into your awesome project via the recent reddit post announcing 0.9, and saw the request for input on your why nannou roadmap, and thought I'd just drop some thoughts here.
I work on a project that handles video playback via opencv-rust image buffers (which, under the hood, uses any number of video decoding backends including gstreamer, ffmpeg, v4l, etc) by naively sending them to an OpenGL texture and drawing it on a full-screen quad. Performance is very good, even on my "legacy" box without an integrated GPU. I don't use them, but one could easily do video processing on the displayed video using shaders.
You can see some of the code I have going at https://gitlab.com/ajyoon/spectrophone/blob/055cdd3a3adb9492e249d7203f61d586ef57a2c1/src/gui.rs
I'm inclined to agree that gstreamer is a good choice for this project - it's mature and not byzantine in scope. OpenCV would work, but today the rust bindings (to which I occasionally contribute) are fragile and difficult to integrate and would cause too much build trouble for the "plug and play" approach I think you guys are going for.
Regarding concerns I'm reading here about extra overhead from having frame buffers do an extra copy step in memory - I don't think such concerns are really warranted. A well-written memcpy invocation should make the copying overhead trivial against the decoding, processing, and display costs. Frames from video data generally have a fixed, known-ahead-of-time size, so raw memcpy shouldn't be an issue.
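(As a rough sanity check on that claim: a 1920x1080 RGBA frame is about 8 MB, so copying it once per frame at 60 fps is roughly 0.5 GB/s, a small fraction of typical main-memory bandwidth.)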
Hope some of this may be useful! The stack I'm using varies significantly from what nannou is using, so if these experiences aren't valuable feel free to disregard.
For future reference, a nice write-up on using ffmpeg for video playback in python:
http://zulko.github.io/blog/2013/09/27/read-and-write-video-frames-in-python-using-ffmpeg/
Hi, do you have any news about the implementation of this feature on nannou?
We are working on a major refactor of Nannou right now. Although we don't have immediate plans to tackle video after that, it's very high on my personal list. So no promises, but it's definitely still a priority.