shoes3
Performance maintenance release
Let's talk about a hypothetical 3.3.9 release focusing on performance.
- Profiling frenzy to find areas that need improvement.
- Improving drawing performance.
- Consider moving cairo code to skia.
- Improving bitmap performance. (Ever used image[]? #285)
- Review caching performance and policies.
- May consider sqlite caching and in-memory cache.
- Improving UI responsiveness while performing tasks.
- Challenge: Ruby uses GIL
- Write a stress test that will outline performance issues and will also be used in subsequent releases.
References https://github.com/shoes/shoes3/wiki/Profiling https://en.wikipedia.org/wiki/Skia_Graphics_Engine https://en.wikipedia.org/wiki/Stress_testing_(software) https://developer.gnome.org/gdk3/stable/gdk3-Threads.html
Could you give an idea what a stress test should look like ? I may come up with something. How do you cache sqlite db in memory?
Could you give an idea what a stress test should look like ? I may come up with something.
A stress test is code that will intentionally abuse the good nature and features of Shoes. Here are a few examples of stress tests:
- Create and destroy a large number of widgets (buttons and such), let's say 1,000. (GTK, Cocoa)
- Create and destroy a large number of art elements (arc, ovals, rect, line, fill, etc), let's say 10,000. (Cairo)
- Create and destroy a large number of text elements, let's say 10,000. (Cairo)
- Abusive slot manipulations (append, after, before, clear, etc). (Shoes)
- Abusive slot positioning (displace, hide, show, remove, move, scroll up/down, etc). (Shoes)
- Display several animations concurrently with fast FPS. (Shoes)
- A program that will keep Shoes UI busy to the point user input is kinda ignored.
- Etc...
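As a starting point for the first item in the list, here is a minimal sketch of a create-and-destroy test. The helper names (elapsed_seconds, churn) are made up for illustration, and it assumes Shoes elements respond to remove. It only times the Ruby-side calls, not the time the toolkit later spends repainting, so treat the numbers as a rough indication.

```ruby
# Times a block with the monotonic clock and returns the elapsed seconds.
def elapsed_seconds
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Creates `count` elements with the given factory block, then removes them all.
# Returns [create_seconds, remove_seconds]. Hypothetical helper, not a Shoes API.
def churn(count, &factory)
  elements = []
  create_time = elapsed_seconds { count.times { |n| elements << factory.call(n) } }
  remove_time = elapsed_seconds { elements.each(&:remove) }
  [create_time, remove_time]
end

# Only runs inside Shoes; outside Shoes the helpers above can still be loaded.
if defined?(Shoes)
  Shoes.app do
    report = para "Running..."
    # Assumption: elements returned by `button` respond to `remove`.
    created, removed = churn(1000) { |n| button "[#{n}]" }
    report.replace format("create: %.3fs, remove: %.3fs", created, removed)
  end
end
```

The same churn call could be pointed at oval, rect or para to cover the Cairo items in the list.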
How do you cache sqlite db in memory?
The current cache is on disk: files or data are temporarily saved to disk. The problem is that creating, opening, accessing and closing files are costly operations.
A single SQLite database (cache.db) would be opened at startup and closed when Shoes is closed. Anything to be cached would be inserted into some tables. It is also possible to ask SQLite to keep the database in-memory.
@BackOrder good. Some methods from the skillwheel can be useful as they comply with those requirements. We can adjust them to the state we need. Here is an example: a block that makes an image work as a button. This is what happens when you interact with images on the skillwheel. @hovers is the custom popup but it can be removed if not necessary.
def set(img, options = {}, &block)
  img.hover do
    @hovers.show text: options[:text], header: options[:header], size: 9, text2: options[:text2], width: options[:width], height: options[:height]
    img.scale 1.25, 1.25
  end
  img.leave { @hovers.hide; img.scale 0.8, 0.8 }
  img.click { @hovers.hide; block.call if block_given? }
end
set(image("pics/misc/s_damage.png", left: 80, top: 11, width: 50), text: pane_text[4], width: 500, height: 40)
A program that will keep Shoes UI busy to the point user input is kinda ignored.
This happens when Shoes is drawing. Unless you had anything else in mind cycling through a drawing pattern will be good enough.
Display several animations concurrently with fast FPS. (Shoes)
How can this be achieved ? Threading is not something that works flawlessly in Shoes.
This happens when Shoes is drawing. Unless you had something else in mind cycling through a drawing pattern will be good enough.
Many things keep Shoes busy. We just need a sure way (or ways) to reproduce the problem.
Display several animations concurrently with fast FPS. (Shoes)
How can this be achieved ? Threading is not something that works flawlessly in Shoes.
Threading is a problem, RE: Challenge: Ruby uses GIL, GTK is not thread-safe.
Maybe having many arts animated in one single animate(FPS) do ... end would do. We could consider a large FPS number (60, 120, 1000). We could consider multiple Shoes.apps with their own animate. Trial and error might help you figure out the best way to stress test this.
path-animation clearly demonstrates how slow animation is. Now if one wanted to write a game in Shoes, let's say a Super Mario World clone, that wouldn't go too well according to path-animation. It might inspire you to come up with a stress test.
Here are multiple animations at once.
Shoes.app do
  def anim i
    animate(24) do |frame|
      @counter[i].replace "FRAME #{i+1} #{frame*(i+1)}"
    end
  end
  @counter = []
  @counter << para("STARTING")
  @counter << para("SECOND", left: 0, top: 30)
  @counter << para("THIRD", left: 0, top: 60)
  i = 0
  3.times do
    anim i
    i+=1
  end
end
@BackOrder regarding threads and drawings (probably animations too). Some time ago I found a way to cheat Shoes. Not sure how to explain it though. This does not work -> Putting a thread with all things inside would not work as we know because threads interfere with shoes processes for some reason. This works -> Putting a thread with drawing elements in start block will work because the thread launches shoes native process which does the drawing on behalf of the thread.
My second app has 3 tabs. One of the tabs is called "online store"; when you click on "Update package list" the app makes an HTTP query, downloads the new stuff (if any) and shows it to the user. While the transaction is in progress, a "Loading" animation is shown to the user.
Usually the app is frozen until the transaction completes, but in this case it is threaded, so the user keeps control over other app functions.
Here is a screenshot.
Here is the code sample responsible for it:
button("Update package list", left: 30, top: 10, width: 360, height: 20) do
  check_dl == 0 ? nil : (messages 0; next)
  @pack_contain.clear { spinner left: 113, top: 90, start: true, tooltip: "Waiting for something?" }
  Thread.new do
    repo_data = get_url @server_url
    start do
      repo_data.nil? ? ( messages 3 ) : ( File.open('NCF_repository/package_list.txt', "w") { |f| f.write repo_data } )
      main_pack_block_online
    end
  end
end
@BackOrder regarding threads and drawings (probably animations too). Some time ago I found a way to cheat Shoes. Not sure how to explain it though.
Interesting approach. Also, this is not a stress test, considering that Shoes displays it without any effort. There is something interesting though. I increased the FPS to one million and it's clear that it's peaking way before that. Displaying the actual FPS (Frames Per Second) would tell us what the actual peak is.
NOTE: Shoes internals might be able to provide an FPS counter enabled in debugging mode.
This does not work -> Putting a thread with all things inside would not work as we know because threads interfere with shoes processes for some reason. This works -> Putting a thread with drawing elements in start block will work because the thread launches shoes native process which does the drawing on behalf of the thread.
Threads created in Ruby suffer from the GIL (Global Interpreter Lock). It means they are in fact executed one at a time. No matter what you do it won't really work. The Ruby team plans to address this in Ruby 3.0. Might take a while. haha
RE: online store
You are correct. Downloading data usually freezes the Shoes UI. It is normally not noticeable for small chunks of data, but our tests on larger data (an ISO file) demonstrated that it will completely blank the UI until the download is finished. Also the thread doesn't do anything useful here (GIL again).
Additional reference: https://en.wikipedia.org/wiki/Stress_testing_(software)
Inspired by your code. The more counters, the lower the actual FPS. The higher the FPS, the less it seems to make a difference? Sounds like there is frame dropping. You can profile and enjoy the results.
You should obtain the FPS you initially set up when you lower NUMBER_OF_COUNTERS to, say, 10 or 15. It probably starts to drop significantly at about 30 counters (animate) or so.
NUMBER_OF_COUNTERS = 150
FPS = 24
Shoes.app do
  @counter = []
  @text = []
  NUMBER_OF_COUNTERS.times do |n|
    @counter << Time.now
    @text << para
    animate(FPS) do |frames|
      if (0 == (frames % FPS))
        @text[n].text = "FPS #{ FPS / ((frames >= FPS) ? (Time.now - @counter[n]) : 1.0)}\n"
        @counter[n] = Time.now
      end
    end
  end
end
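The measurement in the script above could also be factored into a small helper, so a single label reports the rate a single animate block actually achieves. This is only a sketch: FpsMeter is a made-up class, not something Shoes provides, and the one million requested FPS simply mirrors the experiment mentioned earlier in the thread.

```ruby
# Counts ticks and reports the measured rate once per sampling window.
# Hypothetical helper; the clock is injectable so it can be tested without waiting.
class FpsMeter
  def initialize(window: 1.0, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @window = window
    @clock = clock
    @started = @clock.call
    @ticks = 0
  end

  # Call once per frame. Returns the measured FPS when a window closes, else nil.
  def tick
    @ticks += 1
    now = @clock.call
    elapsed = now - @started
    return nil if elapsed < @window

    fps = @ticks / elapsed
    @ticks = 0
    @started = now
    fps
  end
end

# Only runs inside Shoes.
if defined?(Shoes)
  Shoes.app do
    meter = FpsMeter.new
    label = para "measuring..."
    # Request far more frames than Shoes can deliver and show what we really get.
    animate(1_000_000) do
      fps = meter.tick
      label.replace format("actual FPS: %.1f", fps) if fps
    end
  end
end
```

Because the meter counts real elapsed time rather than frame numbers, it keeps working when frames are dropped.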
Hmm. Simple things have a crazy amount of to_s and draw calls. Roughly 8 times more calls than buttons.
NUMBER_OF_BUTTONS = 2000
Shoes.app do
  NUMBER_OF_BUTTONS.times do |n|
    button "[#{n}]"
  end
end
An empty Shoes app looks like this:
Awesome :). this is about 9 to_s per cycle. but why? :/
We will know when 3.3.9 comes. For now we should slowly write stress tests and collect data.
It might be as simple as refreshing the window even when it doesn't need it. Or as complicated as adding widgets causing underlying hidden elements to refresh. Scrolling may also refresh everything even when not visible.
Hours of pleasure guaranteed.
GTK is not thread-safe.
Please be careful claiming this as a truth. It's not safe in certain situations (gthreads, and it's nuanced) - does Shoes use those special situations? I don't think we do. Shoes has one gtk thread (mainloop). Explain how samples/simple/download.rb works when calling lib/shoes/download.rb if threading is unsafe. Ruby Threads however do have the GIL locking issue. If you really care about performance, then using newer Rubies is what you want.
A single SQLite database (cache.db) would be opened at startup and closed when Shoes is closed. Anything to be cached would be inserted into some tables. It is also possible to ask SQLite to keep the database in-memory.
Please - benchmark the simple sdbm key-value store used for Shoes external image caches versus sqlite3, both inserts and fetches. Most images are cached in memory and never get loaded from the external cache (and only once). Optimization requires knowledge, not speculation.
Hours of pleasure guaranteed.
Or you could do that.
RE: GTK thread safety
My mistake. I remember now that we simply need to tell GLib when we enter/leave the GTK thread.
If you really care about performance, then using newer Rubies is what you want.
Ruby performance is not the core of the problem and the difference in performance when upgrading from x.x.x to x.x.z is generally small. GIL is however a big problem. Threaded applications are considerably faster in jRuby.
Wouldn't you agree that most of the bottleneck is in Shoes? It wasn't built with performance in mind.
Optimization requires knowledge, not speculation.
This issue is exactly where we build the said knowledge, but it all starts from a hypothesis. Based on your feedback, a representative sdbm/sqlite benchmark would be the very thing to do.
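A benchmark along those lines could start from a harness like the sketch below. It times inserts and fetches for any object that responds to []= and []. Neither sdbm nor sqlite3 is used here: the Hash and the FileStore class are stand-ins (FileStore is invented for this example), and real adapters for sdbm and SQLite would have to be written and plugged in. The 4 KB payload and the 1,000 entries are arbitrary assumptions, not measured cache sizes.

```ruby
require "tmpdir"

# Stand-in store that writes one file per key, like a naive on-disk cache.
# Hypothetical class; keys are used as file names without sanitizing.
class FileStore
  def initialize(dir)
    @dir = dir
  end

  def []=(key, value)
    File.binwrite(path(key), value)
  end

  def [](key)
    file = path(key)
    File.exist?(file) ? File.binread(file) : nil
  end

  private

  def path(key)
    File.join(@dir, key.to_s)
  end
end

# Returns the seconds taken by the block, using the monotonic clock.
def seconds
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Times `count` inserts followed by `count` fetches on the given store.
def bench_store(store, count: 1_000, payload: "x" * 4096)
  keys = Array.new(count) { |n| "key#{n}" }
  insert = seconds { keys.each { |k| store[k] = payload } }
  fetch = seconds { keys.each { |k| store[k] } }
  { insert: insert, fetch: fetch }
end

Dir.mktmpdir do |dir|
  stores = { "Hash (in-memory)" => {}, "FileStore (on-disk)" => FileStore.new(dir) }
  stores.each do |name, store|
    result = bench_store(store)
    puts format("%-20s insert %.4fs  fetch %.4fs", name, result[:insert], result[:fetch])
  end
end
```

Running the same bench_store call against each candidate keeps the comparison fair, since keys, payload and access order are identical.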
Threaded applications are considerably faster in jRuby.
You are missing the important thing: Are Shoes 4 applications faster than Shoes 3.3.x? That's your benchmark - not ruby vs jruby - we run Shoes. You probably know that java swt uses gtk3 and cocoa just like Shoes 3, so they use cairo and pango too. You would also know they have to translate drawing from Shoes4 -> swt -> cairo/pango and they have been fighting off-by-one errors since the project started. You would also know that Shoes 4 & jRuby is a lot closer to the bleeding edge of Ruby versions than Shoes 3 is.
Benchmark properly with context that matters to you.
Is there a way to simulate hover without actually hovering with the mouse? It will be good for automation scripts.
I am not saying Shoes 4 is faster (or better) than Shoes 3. I am not suggesting to move to jRuby. I am saying threading in Ruby is useless. Any GUI application needs a way to effectively balance visual components and its tasks.
Benchmark properly with context that matters to you.
I am not expecting you to agree with everything I say. You say sdbm is fine? Alright, we can profile and benchmark the thing and see how it turns out. Maybe it is fine!
Has anyone ever extensively profiled and benchmarked Shoes3? If nobody has, then nobody knows exactly and anything we say is speculative. So I set up this issue as a conversation starter and to investigate in order to get all the answers that we need.
Is there a way to simulate hover without actually hovering with the mouse? It will be good for automation scripts.
@ccoupe has suggested to implement the ability to generate events #383. This would make it possible to programmatically move the mouse amongst other things.
I did build and profile Shoes 3 with the -gprof flag once. It might still be an option in the linux rakefiles. For the script I ran, most of the cpu time was inside Ruby and not Shoes/cairo/pango/gtk3 (you need a ruby with debugging info to do that). I also know that graphics performance is hardware constrained - by both the cpu and the gpu that gtk was built to support on that platform - not to mention disk speed if loading things.
I am saying threading in Ruby is useless
Perhaps you could look at samples/simple/download.rb and lib/shoes/download.rb and see if threading is useless. It may not be all you want but it is working. Finding bottlenecks in performance is multi-dimensional - it's not easy to do properly and it's not easy to fix in code unless done properly. Wholesale code changes because you think something is slow and have a better idea is not a proper evaluation. Remember typhoeus?
Threading works for me though. My app does not freeze during the http requests when it is threaded.
Threading works for me though. My app does not freeze during the http requests when it is threaded.
@dredknight when a task performed by a thread is ever so small, you won't notice that it actually had to finish before moving to the next thread. You can test by yourself but nothing runs in parallel in Ruby.
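One way to test this yourself, outside Shoes, is sketched below. It assumes the standard C Ruby interpreter (MRI); JRuby would behave differently for the first pair of timings. The job count and workload size are arbitrary. The first pair compares CPU-bound work run sequentially and in threads; the second pair does the same with blocking waits, using sleep as a stand-in for a network read.

```ruby
# Returns the seconds taken by the block, using the monotonic clock.
def seconds
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# CPU-bound work: pure Ruby arithmetic that holds the interpreter lock while it runs.
def busy(n)
  sum = 0
  n.times { |i| sum += i * i }
  sum
end

JOBS = 4          # arbitrary number of tasks
WORK = 500_000    # arbitrary amount of work per task

cpu_sequential = seconds { JOBS.times { busy(WORK) } }
cpu_threaded   = seconds { Array.new(JOBS) { Thread.new { busy(WORK) } }.each(&:join) }

# Blocking waits release the lock, so waiting threads can overlap.
wait_sequential = seconds { JOBS.times { sleep 0.1 } }
wait_threaded   = seconds { Array.new(JOBS) { Thread.new { sleep 0.1 } }.each(&:join) }

puts format("CPU-bound:  sequential %.3fs, threaded %.3fs", cpu_sequential, cpu_threaded)
puts format("Waiting:    sequential %.3fs, threaded %.3fs", wait_sequential, wait_threaded)
```

On MRI the two CPU-bound timings are expected to be close to each other, while the threaded waits should finish in roughly the time of a single wait. That would be consistent with both observations in this thread: threads do not speed up Ruby computation, yet a thread that mostly waits on the network can leave the UI responsive.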
RE: gprof flag
We should definitely make it happen again on all platforms.
RE: Ruby vs Shoes & friends
I get your point about time spent in Ruby versus Shoes & friends. Though let's not forget some of the Ruby classes and methods are defined in Shoes/C. Those we can improve on.
For example, one thing that is clear is that there are too many draw calls. It's impossible to need that many calls. If we could somehow reduce the calls by, say, 20%, it might make a whole lot of difference to Shoes users.
I also know that graphics performance is hardware constrained - by both the cpu and the gpu that gtk was built to support on that platform - not to mention disk speed if loading things.
What would be your take on Cairo/Pango vs Skia? Skia is developed by Google and used on major web browsers and widely used applications. The C API is fairly similar to Cairo but is also more wholesome and actively developed.
RE: Typhoeus
Shoes directly using CURL in an independent C thread might have been a better solution but more work. To be fair, the tests with Typhoeus on Ruby alone were extremely promising. In the Shoes ecosystem? Not so much but still an improvement over the previous method (including dealing with https, right?) and advanced users get Typhoeus gem included with Shoes.
It should be noted that the difference between tests on Ruby alone versus Shoes should also tell us there is room for improvement in Shoes.
Finding bottlenecks in performance is multi-dimensional - it's not easy to do properly and it's not easy to fix in code unless done properly.
Absolutely true. You might have your own suspicions about what the bottlenecks are. How about you share with us and @dredknight and I work on some stress tests for those?
Listen, reading my initial post again clearly shows that things like caching were worded in a way that does not imply anything more than an investigation, e.g. "review, may consider". Maybe you misunderstood the purpose of this issue, but this is really an investigative process. We get the tools we need, write the tests we need and investigate the bottlenecks before anything else happens.
Hopefully it sounds reasonable to you.
@BackOrder I believe it is not threaded, but for some reason there is a visible difference with and without the thread. The code above without the Thread.new ... end rows simply does not execute the rotation animation (or at least this is what it looks like, because the user does not see the spinner).
I am currently finishing a few scripts for the app. After that I will have even more time to dedicate to building performance tools. This is something I am very keen to learn and get better at!
I get your point about time spent in Ruby versus Shoes & friends. Though let's not forget some of the Ruby classes and methods are defined in Shoes/C. Those we can improve on.
Improve on one or two of the lines of rb_call_something. That's not the problem.
For example, one thing that is clear is that there are too many draw calls. It's impossible to need that many calls.
You should look at time in method, not counts. Windows/X/Cocoa compress multiple draws - have done so almost forever. Fascinating topic but nothing Shoes can or should touch.
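To make the difference between call counts and time in method concrete, here is a small pure-Ruby sketch. MethodTimer and FakeWidget are invented for this example and are not part of Shoes; a real profile would come from gprof or a Ruby profiler. The sketch wraps selected methods and records both the number of calls and the cumulative time spent in them.

```ruby
# Wraps chosen instance methods to record call counts and cumulative time.
# Hypothetical helper for illustration; forwards positional arguments only.
module MethodTimer
  STATS = Hash.new { |hash, key| hash[key] = { calls: 0, seconds: 0.0 } }

  def self.instrument(klass, *method_names)
    wrapper = Module.new do
      method_names.each do |name|
        define_method(name) do |*args, &block|
          started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
          begin
            super(*args, &block)
          ensure
            stat = STATS["#{klass}##{name}"]
            stat[:calls] += 1
            stat[:seconds] += Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
          end
        end
      end
    end
    klass.prepend(wrapper)
  end
end

# Stand-in class: one cheap method called often, one slow method called rarely.
class FakeWidget
  def to_s
    "widget"
  end

  def draw
    sleep 0.01 # pretend this is expensive rendering
  end
end

MethodTimer.instrument(FakeWidget, :to_s, :draw)
widget = FakeWidget.new
1_000.times { widget.to_s }
5.times { widget.draw }

# Sort by time spent, which is what matters when looking for a bottleneck.
MethodTimer::STATS.sort_by { |_, stat| -stat[:seconds] }.each do |name, stat|
  puts format("%-16s calls=%-5d seconds=%.4f", name, stat[:calls], stat[:seconds])
end
```

In this toy case to_s has 200 times more calls than draw, yet draw should account for most of the time, which is the situation a count-only view would hide.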
What would be your take on Cairo/Pango vs Skia?
C++ ? Not me, I've suffered enough.
Hopefully it sounds reasonable to you.
No problem unless you are asking me to do the coding so you can explore. Low priority for me.
Improve on one or two of the lines of rb_call_something. That's not the problem.
Excellent!
You should look at time in method, not counts. Windows/X/Cocoa compress multiple draws - have done so almost forever. Fascinating topic but nothing Shoes can or should touch.
This is only true for system calls but not for Ruby calls (such as draw).
What would be your take on Cairo/Pango vs Skia?
C++ ? Not me, I've suffered enough.
Haha! It is written in C++ but it has a C API.
No problem unless you are asking me to do the coding so you can explore. Low priority for me.
Right. That's why it's suggested for a future release. It's not surprising that it is low priority for you because you also spend most of your time on Shoes/C. We can accept the performance of Ruby because Shoes was never meant to be the fastest around, but there are a few things that need to be addressed, such as UI responsiveness, slow display and animate.
The threshold on my machine seems to be 117, 118 images. Profiled for 5 seconds.
IMAGES = [
  "shoes-icon.png",
  "shoes-icon-blue.png",
  "shoes-icon-federales.png",
  "shoes-icon-red.png"
]
NUMBER_OF_IMAGES = 117
Shoes.app do
  @images = []
  @interpolator = (tmp = (0..50).collect { |n| -n }) + tmp.reverse
  NUMBER_OF_IMAGES.times do
    @images << image("#{DIR}/static/#{IMAGES.sample}")
  end
  @counter = 1
  animate(@fps = 60) do |frame|
    @images.each { |img| img.rotate(@interpolator.first) }
    if ((frame / @fps) == @counter)
      @counter += 1
      @interpolator.push @interpolator.shift
    end
  end
end
Posting some more code testing the limits of Shoes. Brownian motion. 190 animated ovals is fine, 192 is not. Wondering why path-animation sample is performing so poorly.
NUMBER_OF_SHAPES = 192
Shoes.app do
  @shapes = []
  NUMBER_OF_SHAPES.times do
    fill rgb(rand(255), rand(255), rand(255))
    @shapes << oval(rand(self.width), rand(self.height), rand(100))
  end
  animate(60) do
    @shapes.each do |shape|
      mx = rand > 0.5 ? +1 : -1
      my = rand > 0.5 ? +1 : -1
      shape.move shape.left + mx, shape.top + my
    end
  end
end
@dredknight I created a branch performance on Shoes repo to avoid polluting this thread with too much stuff. Don't be shy to add stuff in Tests/performance.
More about Shoes and branches: https://github.com/shoes/shoes3/wiki/Git,-Github-and-Shoes
@ccoupe branch related instructions are working well. Good job.
Slot manipulations are surprisingly fast on widgets, texts and images. Significant performance decrease when introducing Shoes arts (cairo-based).
Be aware that Shoes decides how much time to give to Ruby vs Gtk event handling and it differs on Linux vs Windows. See shoes_app_g_poll() in shoes/native/gtk.c for Linux/BSD and shoes_native_loop() (same file), which happens to be the adjustment I made for dredknight's cpu hogging bug. As @BackOrder remembers, the Shoes 2 and 3.1 code took 100% of a core doing nothing and no one knew why, so I moved Shoes Windows to use Gtk3. The Linux polling is _why's with some comments from me. Mysterious place.
Slot manipulations are surprisingly fast
My head hurts figuring out the statistics for each of the rand() combinations in that script. You might need to run that for a minute or two. Also I believe animate has a small memory leak that I never found. There is also ruby gc at play.
My head hurts figuring out the statistics for each of the rand() combinations in that script. You might need to run that for a minute or two. Also I believe animate has a small memory leak that I never found. There is also ruby gc at play.
I am open to alternatives and suggestions. Right now I am trying to figure out the tipping points of Shoes. Things that are abusive enough to cause disruption in Shoes but not enough to have it show a blank window. Later on we can come up with a clean set of stress tests.
Would you prefer caching random values in a YAML file and then loading them into an array? It's difficult to abuse Shoes without randomness but this approach could work. I could otherwise set a number of iterations for each operation. It is also possible to create animate-less code but it will show a different aspect, namely startup and setting up the UI. Sometimes Shoes takes some time to show up. That could be interesting.
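A lighter alternative to a YAML file, sketched below, is a seeded random generator: the same seed always yields the same sequence, so a run can be repeated exactly without storing the values. This is a different technique from the YAML idea above, not an implementation of it. The operation names are taken from the slot methods listed earlier in the thread and are used only as labels; build_plan and the 600x500 coordinate range are assumptions made for the example.

```ruby
# Slot operations to pick from; used here as plain labels, nothing is called on Shoes.
OPERATIONS = %i[append after before clear displace hide show move].freeze

# Builds a repeatable list of stress-test steps from a seed.
# Hypothetical helper: the same seed and step count always give the same plan.
def build_plan(seed, steps)
  rng = Random.new(seed)
  Array.new(steps) do
    { op: OPERATIONS.sample(random: rng), x: rng.rand(600), y: rng.rand(500) }
  end
end

plan = build_plan(42, 5)
plan.each { |step| puts step.inspect }
```

Logging the seed with each profile would let anyone reproduce a slow run by passing the same number back in.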
All the cool things in Shoes go through animate. Eventually we need to fix this. What a coincidence that is issue #1 !
Running slot_manipulation for 5 minutes