rfcs
rfcs copied to clipboard
Add RFC: Image slideshow design changes
Summary
Re-design the image slideshow to remove its current limitations.
Motivation
In its current implementation, the image slideshow can only play a low number of images. The user is not made aware of that fact, they just notice that only the first images are shown in the slideshow.
It should be able to show all of the selected images.
A minor issue that may be left out of this RFC is the way random playback is behaving. Currently it picks the next image completely random. This may result in some images getting shown multiple times while others may get shown seldom or not at all. The current implementation also suffers from modulo bias which means the depending on the number of images, the first few images may be "favored" by the algorithm and shown more often. The author's expectation for random playback is to play back the whole list once, but in random order (shuffled), thus each image is shown exactly once before the first repeat is seen.
Read RFC
It would be cool if we could support different image fit settings like in css: https://developer.mozilla.org/en-US/docs/Web/CSS/object-fit Might alleviate the need to look at the sizing of every image. This would require the image source to have a set size though. Which could work through the properties by setting a width and a height. The alternative would be to make a source that doesn't scale but is dynamically sized by the bounds.
Potentially a divisive opinion, but if we are going to make slideshow accept too many damned files then it should do everything async and should have minimal impact on startup. E.G. I think its fine if the shuffle on start is potentially poor quality (first 1/2/CACHE_SIZE images are always the same and it only shuffles from what its probed so far after obs has started), as long as everything starts up quickly.
By removing limits you let users do things like add 1k random images to their slideshow. Just stating these and doing nothing else is prohibitive on startup on typical spinning rust.
Opinion on streaming resources in general, Users doing recordings may also place their recordings on the same spinning rust as these files (all your media often goes together). So even having a streaming setup may cause undesirable disk usage. Similarly large images or fast slideshows will cause load on the system and its hard to surface these to the user. None of these are issues with a fully preloaded slideshow which will have no impact except on load of OBS where its impact is limited by design. Its hard to estimate impact on users since this is really dependent on what kind of images are being loaded, what the disk setup is like, how constrained the user system is, etc.... due to this so any such change is inherently risky.
Its hard to see much justification in taking the risk for more peoples recordings or streams to be impacted in ways that they cannot (or even us as supporters) have any hope of diagnosing. Currently we get a handful of people over the year trying to increase the limit usually and only by a bit. Or just switching to VLC/browser/etc and they are perfectly happy with it.
Potentially a divisive opinion, but if we are going to make slideshow accept too many damned files then it should do everything async and should have minimal impact on startup. E.G. I think its fine if the shuffle on start is potentially poor quality (first 1/2/CACHE_SIZE images are always the same and it only shuffles from what its probed so far after obs has started), as long as everything starts up quickly.
I agree that startup should also be async. Currently, I simply lack the understanding of whether it can be done in OBS. (Is a source allowed to changes its width/height dynamically? If so, how do I signal this?)
By removing limits you let users do things like add 1k random images to their slideshow. Just
stating these and doing nothing else is prohibitive on startup on typical spinning rust.
This may sound like a rant, but I'm just curious: What is it with everybody here being so freaked out by people wanting to add a lot of images to a slideshow? What use is a slideshow that can only show a few images? There are so many valid use-cases to having a few hundred, maybe even thousand, images in a slideshow in the stream. Are people here so freaked out simply because the current design can't handle it or is it something else?
Opinion on streaming resources in general, Users doing recordings may also place their recordings on the same spinning rust as these files (all your media often goes together). So even having a streaming setup may cause undesirable disk usage. Similarly large images or fast slideshows will cause load on the system and its hard to surface these to the user. None of these are issues with a fully preloaded slideshow which will have no impact except on load of OBS where its impact is limited by design. Its hard to estimate impact on users since this is really dependent on what kind of images are being loaded, what the disk setup is like, how constrained the user system is, etc.... due to this so any such change is inherently risky.
Its hard to see much justification in taking the risk for more peoples recordings or streams to be impacted in ways that they cannot (or even us as supporters) have any hope of diagnosing. Currently we get a handful of people over the year trying to increase the limit usually and only by a bit. Or just switching to VLC/browser/etc and they are perfectly happy with it.
The VLC plugin doesn't seem to be able to show images, or did I miss something here? How is the browser helping here in an easy way?
On one hand, you're very worried about the machines resources, on the other hand you recommend adding a lot of extra overhead to get around the slideshow limitations. I don't get that.
Quite frankly, I'm getting very frustrated. I am willing to donate my time and if you guys tell me which changes/features I need to implement to make it acceptable for you, I'd be happy to do it. But please, if you're going to leave the slideshow in its current, limited state, at least add an error message when the limit is hit! I've added a measly 100 images to the slideshow only to notice mid-stream that only about 15 or 20 were actually displayed. To me, an UI developer, that is a very serious issue: the user (me) was led to believe the slideshow works like expected (add images, no warning, ergo all of them should be displayed) but it didn't (silently discarded 80% of the images).
To be clear, it's good to have a debate about the design and bring up concerns. I'm sorry for the rant at the end of the previous post.
Designing features with the average user in mind can be very difficult. I encourage you to hang around in the Discord support channel for a few hours to get an idea of what we're afraid of here. It's not a matter of "if" but "when" people will throw too many images in the slideshow on an underpowered system, experience extreme performance degradation as a result, and storm into the support channel 5 seconds later. You sort of have to baby-proof everything.
I understand you're worried about degrading the experience/performance. And I understand the worries about creating issues that would be hard to debug. But I have an issue with this constant talk about "too many images". If the design is async, it should not matter how many images people add to the slideshow. Images that are too big would be more of an issue, though.
So, let's gets technical: what issues would need to be addressed by a design change to lift the current restrictions? Which are not yet mentioned by the RFC? How would async startup/config change work since the max width/height are needed?
On one hand, you're very worried about the machines resources, on the other hand you recommend adding a lot of extra overhead to get around the slideshow limitations. I don't get that.
The overhead in streaming these resources is trivial (to your computer), though certainly not to the programmer who is implementing them. But I think what you are picking up is that yes there are competing metrics of "good". There are users who want OBS to be as low impact as possible and users who want obs to have all the features regardless of performance impact.
To me, an UI developer, that is a very serious issue
I agree that is another issue that needs to be solved (preferably sooner rather than later). Its not clear if you add a huge directory only some of the images are loaded. Right now its more or less impossible for plugins to surface information like warnings about config which needs to be improved, but has been on the back burner for a long time.
Are people here so freaked out simply because the current design can't handle it or is it something else?
Because the proposed design also doesnt handle it. Generally OBS should load instantly (or feel like it does). One slideshow by itself with 20 images can stall out loading for hundreds of ms (why its already a troubled source). With the change it stalls out a relatively fast machine for 30s when adding a few hundred images. Any change that significantly or could significantly increase loading time for users will be scrutinized fairly heavily.
How would async startup/config change work since the max width/height are needed?
This is a good question, why are max width/height needed? If we ignore this what are the downsides to dynamically sizing? Can we make it easier to use a bounded transform for this source so users maintain the same feel when the source resizes dynamically?
--- edit
Maybe a good cache size would be equivalent to the existing image limit. That way if users were already bound by the limit with a new async mode they would fully cache all their images and no longer need to decode from disk. (given the way the limit works, many users may have more images in their sources but just not know they arnt showing, so maybe not too).
I'd like to know from some of the core devs about their feelings regarding the slideshow: is there a chance an asynchronous slideshow design is going to be accepted or do you consider it too risky?
So far I see three ways forward:
- Keep the slideshow in its current form.
- Find a way to inform the user when the image limit is hit. I think we agree that it's not good that the user doesn't "see" this limit until they notice images are not showing up.
- I could make the new slideshow a separate plugin, I guess.
- Change the design to be asynchronous.
- I would still need to figure out how to make the startup asynchronous but I don't want to invest any more time on this unless it's clear that this is something the project wants to have.
- Hybrid: provide both designs and let the user chose.
I understand that devs here are very worried about support. From that point of view option 1 is the safest approach. Of course I'm biased, I'd like to see option 2. My feeling is that the third option is the worst (have listed it for completeness): probably confusing for the user, increased maintenance burden.
Personally I would love to see the slideshow fixed. I know Jim gets apprehensive about this particular source, but it really should just work the way people expect it to work, and it sounds like the only way to make it work that way is to make it async. I can't speak much to the RFC and its design, but I am very much in favor of fixing its behavior.
Apologies for not responding right away because like most of the other RFCs this requires a bit of a writing.
I apologize for the whole slideshow thing. The primary problem with it is that any direction you go with it has unintended consequences. It's also sort of troublesome due to the internal design of the slideshow to use sources underneath, like image sources as you no doubt noticed. This was supposed to streamline the design originally but made things a bit more difficult to extend, thus many of the current problems. Problems are also multiplied by the types of images, sizes of images, and number of images people like to pull in to the slideshow. Additionally, another problem was the fact that it tries to pre-calculate its base size, which isn't possible with dynamic image counts. That was a design flaw from the beginning.
Also the number of unexpected applications people use slideshow for. I met people who would use it as a background cycling mechanism for their stream. They'd use hundreds of images for this and it acted sort of like the Windows desktop background cyling. I met people who wanted to use it to display artwork from viewers. I met people who would set the cycle speed to very fast, and used the slideshow to make animations. So many different applications that I never originally intended or thought of. All of these things are fairly valid use cases in my humble opinion.
Originally, the source had no memory limit, and it was designed to decompress and load all images in to virtual ram, which is quite limited. Then, inevitably people loaded too many images and started running in to out of memory crashes (both with RAM and VRAM) due to the slideshow.
So eventually we implemented a dynamic loading/unloading system to accommodate lots of files at once, but instead of one initial stall decoding all images on first load, it would stall every time a new image was loaded because people would inevitably load gigantic images which would cause continual decompression stalls. We're talking about stalls in the order of seconds, not milliseconds. Significant CPU usage. Stalls and CPU usage that should not be occurring for a program that needs to render frames within a ~16 millisecond period or less.
So, I reverted the dynamic system for the time being and just put it back the way it was, but instead implemented a memory limitation as it currently is. Again, this is just a "for the time being" measure until we discuss and figure out how to implement it in a way that can accommodate all use cases.
So the questions/issues become:
- How do we solve the multi-second stalls from loading huge compressed files? Implement some caching and/or automatic image resizing system?
- The slideshow itself has its own size, as you no doubt noticed. Rather than pre-calculating that size, it'd probably have to have an explicit width/height that has to be set by the user which all images would align themselves to. We already have this property in the slideshow but it's technically optional, which precalculating being the default. It'd probably have to default to 1920x1080 or something.
- Decompressing of images would have to be on a separate thread (asynchronous) as to not stall the graphics thread. The
gs_image_filestuff already separates this. It does not touch graphics system functions unless you explicitally callgs_image_file_update_texture, by which time the image is already decompressed and waiting to be used. - You'd probably want to avoid using an image source, and probably would want to use some sort of custom internal source so the image source can more easily access them. That way you can have like this one single dedicated thread to loading images rather than each image source trying to create its own thread or something. You definitely don't want the slideshow spawning hundreds of threads or something. All loading should be done ideally in one thread, decompressing images one-by-one in a sequence.
- Additionally, I want "random" to actually mean "random". Personally I don't want shuffle nor think it should be necessary. Last time a dynamic system was put in place, they replaced "random" with "shuffle" and it just kind of bothered me. Sure when you're caching the sequence currently in the cache probably can't be changed but new ones being dynamically loaded in to the cache should be random.
Anyway, as you can see the whole thing is just troublesome.
The question is not only how to make it load images dynamically, but how to make them load without impacting system performance. The current style is the absolute best for performance, and why it's still our current go-to. Sure there's an initial load time, but that load time happens precisely once per run, on startup. Once loaded, there is zero impact on performance because all data is already in video memory by then, so it simply becomes a matter of switching textures rather than decompressing/decoding.
One thing we could do is add an option to allow the user to choose the style if they prefer performance. "Preload images (with a memory limit)", "Dynamically load images (may incur additional CPU usage at runtime)", or maybe "Automatic" to have it detect.
One final word, if a new image slideshow is made, it would also need to be marked as a version 2, and the old one would have to be marked as OBS_SOURCE_CAP_OBSOLETE. See color-source.c for an example of source versioning.
Anyway, hopefully I covered most of the issues with it, the history behind it, and why this whole thing is frustrating. The topic gives me a lot of frustration because it's a lot more nuanced and complicated than people realize, and people use the image slideshow for a lot more things than people realize. And it has way more potential of an impact to CPU usage on the user's system than people realize.
Thank you a lot for the very detailed reply and history!
- How do we solve the multi-second stalls from loading huge compressed files? Implement some caching and/or automatic image resizing system?
That is what I intend to solve with a cache/worker thread, which pre-loads n images (my current implementation has n = 2) and keeps last few n images.
- The slideshow itself has its own size, as you no doubt noticed. Rather than pre-calculating that size, it'd probably have to have an explicit width/height that has to be set by the user which all images would align themselves to. We already have this property in the slideshow but it's technically optional, which precalculating being the default. It'd probably have to default to 1920x1080 or something.
Having to specify a size would indeed solve the issue that had me worried the most. This would get rid of having to look at all images, thus completely eliminating the stall at the start.
- Decompressing of images would have to be on a separate thread (asynchronous) as to not stall the graphics thread. The
gs_image_filestuff already separates this. It does not touch graphics system functions unless you explicitally callgs_image_file_update_texture, by which time the image is already decompressed and waiting to be used.- You'd probably want to avoid using an image source, and probably would want to use some sort of custom internal source so the image source can more easily access them. That way you can have like this one single dedicated thread to loading images rather than each image source trying to create its own thread or something. You definitely don't want the slideshow spawning hundreds of threads or something. All loading should be done ideally in one thread, decompressing images one-by-one in a sequence.
The cache/worker thread of my implementation/proposal currently creates and destroys image sources as needed. That does load the image on that thread and not involve any other threads, doesn't it? No other threads are spawned.
Is avoiding the image source really necessary? As far as I can see, it's actually quite elegant to use them, especially in conjunction with transitions.
- Additionally, I want "random" to actually mean "random". Personally I don't want shuffle nor think it should be necessary. Last time a dynamic system was put in place, they replaced "random" with "shuffle" and it just kind of bothered me. Sure when you're caching the sequence currently in the cache probably can't be changed but new ones being dynamically loaded in to the cache should be random.
Can do.
One thing we could do is add an option to allow the user to choose the style if they prefer performance. "Preload images (with a memory limit)", "Dynamically load images (may incur additional CPU usage at runtime)", or maybe "Automatic" to have it detect.
I'm going to need to think about that some more to find a design that can support both.
One final word, if a new image slideshow is made, it would also need to be marked as a version 2, and the old one would have to be marked as
OBS_SOURCE_CAP_OBSOLETE. Seecolor-source.cfor an example of source versioning.
Will do.
Anyway, hopefully I covered most of the issues with it, the history behind it, and why this whole thing is frustrating. The topic gives me a lot of frustration because it's a lot more nuanced and complicated than people realize, and people use the image slideshow for a lot more things than people realize. And it has way more potential of an impact to CPU usage on the user's system than people realize.
I understand.
My time is currently more limited, unfortunately, but I'm going to overhaul both the RFC and my slideshow implementation with your feedback in the coming weeks.
I suppose I'm getting a bit ahead of myself. It's fine to use the image source.
The only other thing was when I mentioned caching in regards to loading large files -- in that case I meant like hard drive caching in temporary storage or something (this is separate from the caching of the open files). Maybe decode once to the hard drive so you never have to decode those files again or something, or decode and downsize, then cache it on the hard drive in a temporary file. Because 1-2 seconds of a core being eaten up seems kind of ridiculous. I'm not really sure what to do there.
As for the randomization I suppose I don't care about it that much either.
I guess I'm just mostly concerned about CPU usage and occasionally eating up lots of CPU for decoding.
I finally had some time to continue working on this. I now have a design that works with both preload and on-demand, together with an implementation. The updated design has a video tick that works the same for both modes.
Current implementation can be found at https://github.com/DarkDust/obs-studio/blob/feature/slideshow2a/plugins/image-source/obs-slideshow2.c It's work-in-progress, I intend to refactor this more to better group things that belong together and needs more testing.
Comments/critique/suggestions welcome, of course.
Refactoring, fixing bugs and testing is complete. I've used the slideshow in on-demand mode for several hours, finally streamed 4.5h yesterday with it (actual stream with audience) and it works as intended. Preload mode also works correctly but I haven't tested as much with it (given its logic is much simpler).
Current implementation can be found at https://github.com/DarkDust/obs-studio/blob/feature/slideshow2a/plugins/image-source/obs-slideshow2.c It's work-in-progress, I intend to refactor this more to better group things that belong together and needs more testing.<
@DarkDust How do I implement this as a newbie? Would be grateful for any instructions.
@DarkDust How do I implement this as a newbie? Would be grateful for any instructions.
You need to checkout the branch https://github.com/DarkDust/obs-studio/tree/feature/slideshow2a and then follow the instructions for building OBS from source. No binaries, sorry.
So there seems to be people wanting this other than me (see obsproject/obs-studio#3366). There's a detailed RFC, there's a working implementation. What needs to be done to get this merged?
Be patient until we have time to review.
@Fenrirthviti: Waiting for 10 months before asking about the status is pretty patient, no? I'm just an outsider, this is an open-source project which is done by volunteers in their free time (AFAIK except for one) so you do have my sympathies. I get it, the core people working on OBS have different priorities. But I still find that comment of yours to be rude. I've invested several days of work to make a solution that hopefully meets your quality standards, while I had an initial solution that worked fine for me in just one hour. Everything I did on top was to help other people also get an improved slideshow.
The comment was not intended to be rude, but was perhaps a little blunt without explanation, it was more intended to let you know that we know this is here and will get to it when time allows. OBS is currently maintained by a single full time developer, which means their time is extremely limited and valuable, and everyone else helps out as they have the free time to. This is fairly low on the priority list with how few people are using the feature, and there has been a lot of discussion and disagreement on how exactly to properly implement this feature. So, what needs to happen is that person needs to find time to do a proper review and provide feedback. A lot of the limitations that the current image slidehow has were implemented because things didn't actually work in the wild as expected, and it was necessary to prevent crippling people who were doing silly things (like adding 1000 8k images and not understanding what that actually meant). I can appreciate the work done to help out, but this RFC has not been finalized so the implementation might not be accepted as-is and further changes might be required. While not a bad idea to start work on something before the RFC is accepted, it should always be with the understanding that those changes might not be accepted.
I suppose the best thing to do right now is summarize exactly what changes were made in your implementation and how they match what is in the current RFC, to make the review easier to parse when the time comes. Additionally, getting a viable build available for testing would go a long way as people can then provide feedback without having to set up a new build environment. Our CI scripts should be able to provide builds on your fork with minor changes if you are up to date on our current master branch.
Thank you very much for this clarification!
My implementation should match the RFC very closely, but I'll double check and also write down the differences to the current implementation in a few days.
Notes for reviewers
The RFC is not that long, I guess, but here are the main points:
- The slideshow has two modes:
- Preload mode is supposed to work exactly like the current slideshow. Images are loaded immediately until the memory limit is hit, subsequent files are ignored. (Unsolved problem, IMHO: users should be notified about that.)
- On Demand mode is a multi-threaded mode that manages a cache, it loads and unloads the required images in a background thread. It pre-loads a few images in advance to make sure that usually the next image is already loaded by the time it's needed.
- The design allows for both modes to use the same data structures.
- Render function:
- Unchanged: it simply displays a transition.
- Tick function:
- For preload mode it should behave exactly the same as before.
- For on demand, if it's time to transition to the next image, it manages which images should be in the cache and wakes up the worker thread. An image is taken from the cache and assigned to the transition if available, otherwise it retries on the next tick. The goal is to do as few work here as possible, all expensive operations should be done on the cache/worker thread.
- Worker thread: new for on demand mode. Manages the cache content according to the list of images that should be in the cache. Missing images are loaded, unneeded images are unloaded.
My current implementation implements the RFC completely. Some remarks about the implementation:
- I tried to group functions that belong together, like index calculations or managing cache entries. Hopefully it's easy to navigate. Still, its overall structure is not very different from the current slideshow implementation.
- In the current slideshow, managing the
cur_itemwas straight-forward. In my implementations, these operations were moved into functions andcur_itemshould only be manipulated by those utility functions likeset_cur_itemorset_next_item. That's because for the on demand mode, additional work is necessary to manage which items should be in the cache. - Multi-threading:
- A mutex is used to protect variables that are shared across threads.
- Care has been taken to not hold the mutex for any expensive work, only as few operations as possible should be done while the mutex is locked.
- For debugging and to mark that a function must always be called with the mutex locked, these function start with
assert(ss->have_mutex);(it's not possible to correctly assert that the mutex is not locked). This means the mutex must be handled vialock_mutex()andunlock_mutex(), never directly viapthread_mutex_lock/pthread_mutex_unlock. - In the main struct, variables that need the lock are grouped together with a comment. It might be a good idea to make this more obvious/verbose via a common variable name prefix or grouping them into a helper struct.
For those still suffering this undocumented oversight, that was apparently solved by @DarkDust in 2021, and even formatted for the convenience of the reviewers -- but still remains unimplemented... I'm using this to get it done:
https://obsproject.com/forum/resources/browser-image-slideshow.852/
Takes some work copy/pasting and configuring to get it setup, but this is exactly the implementation I was about to code up in javascript and CSS, just using a Browser source. Somebody else already coded it and even threw in a LUA implementation that allows easy configuration changes and image-list refreshes from within OBS!
Project opted for a different solution, this PR is obsolete.
For those still suffering this undocumented oversight, that was apparently solved by @DarkDust in 2021, and even formatted for the convenience of the reviewers -- but still remains unimplemented... I'm using this to get it done:
https://obsproject.com/forum/resources/browser-image-slideshow.852/
Takes some work copy/pasting and configuring to get it setup, but this is exactly the implementation I was about to code up in javascript and CSS, just using a Browser source. Somebody else already coded it and even threw in a LUA implementation that allows easy configuration changes and image-list refreshes from within OBS!
thank you, I'm still dealing with this issue & that plugin looks great