grafanimate
Wait for panels state=Done
Alternative method to use-panel-events and exposure-time options. I think it still needs ~0.2 second exposure time. Also not tested with older versions of Grafana.
I have been using grafanimate with some customizations to post videos of my Grafana to X/Twitter https://x.com/IntermittentNRG/status/1712826595977674850
Submitted for your consideration.
Hi @IntermittentNRG,
I would not have expected that this program is useful any longer, and I am very happy to hear that it apparently still works, now even better with your patch.
Thank you so much for submitting this improvement, I love it.
With kind regards, Andreas.
Thoughts
Alternative method to use-panel-events and exposure-time options.
Without validating your patch yet, if Grafana (now?) offers a getPanelData() method and a corresponding .state property for each panel, it is absolutely the right approach to inquire that, in order to find out about whether data loading has finished.
While I believe your patch works as demonstrated, I am wondering whether we could mount it at the place where the synchronization between the Grafana/JavaScript and Python domains happens, which would also get rid of the busy/delayed Python loop that is currently polling the whole stack.
Details
At those spots, we staged an event-based synchronization mechanism, which, under the hood, also uses polling, but is based on Marionette's marionette_driver.wait.Wait primitive.
It does the "all data loaded" check within the JavaScript domain, and emits a synthesized all-data-received event, which toggles the hasAllData state, which the Python domain is monitoring.
https://github.com/panodata/grafanimate/blob/7b68228247a962272565dec4084a01e1112e0ffa/grafanimate/grafana-studio.js#L214-L215
https://github.com/panodata/grafanimate/blob/7b68228247a962272565dec4084a01e1112e0ffa/grafanimate/grafana-studio.js#L231-L291
https://github.com/panodata/grafanimate/blob/7b68228247a962272565dec4084a01e1112e0ffa/grafanimate/grafana-studio.js#L158-L161
https://github.com/panodata/grafanimate/blob/7b68228247a962272565dec4084a01e1112e0ffa/grafanimate/grafana.py#L101-L115
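To illustrate the idea of waiting on a condition instead of sleeping for a fixed time, here is a minimal, stdlib-only sketch. It is not Marionette's Wait implementation and not grafanimate's code: `wait_until`, `WaitTimeout` and `FakeStudio` are hypothetical names, and `FakeStudio.has_all_data` merely stands in for whatever reports the browser-side hasAllData state.

```python
import time
from typing import Callable


class WaitTimeout(Exception):
    """Raised when the condition does not become true in time."""


def wait_until(condition: Callable[[], bool], timeout: float = 10.0, interval: float = 0.1) -> float:
    """Poll `condition` until it returns True and return the elapsed seconds.

    Raises WaitTimeout if `timeout` seconds pass without success.
    """
    start = time.monotonic()
    while True:
        if condition():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(interval)


class FakeStudio:
    """Hypothetical stand-in for the browser-side `hasAllData` flag."""

    def __init__(self, ready_after_polls: int) -> None:
        self.polls = 0
        self.ready_after_polls = ready_after_polls

    def has_all_data(self) -> bool:
        # Becomes True once it has been asked often enough.
        self.polls += 1
        return self.polls >= self.ready_after_polls


studio = FakeStudio(ready_after_polls=3)
elapsed = wait_until(studio.has_all_data, timeout=2.0, interval=0.01)
print(f"ready after {studio.polls} polls")
```

The point of the design is that the wait returns as soon as the condition holds, so a fast dashboard costs a few polls while a slow one is still bounded by the timeout.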
Alternative method to use-panel-events and exposure-time options.
The current code does the "all data loaded" check within the JavaScript domain, and emits a synthesized all-data-received event.
I think if your code could fit there, and emits this event appropriately, we could get rid of the manual exposure timing in the Python domain completely, which is effectively a time.sleep.
https://github.com/panodata/grafanimate/blob/7b68228247a962272565dec4084a01e1112e0ffa/grafanimate/animations.py#L103-L105
By doing so, grafanimate will become both more robust and more efficient, as originally intended. [^1]
Do you think you may have the capacity to squeeze your code into that box, effectively replacing the ingredients of Grafana Studio's onDashboardRefresh? If you don't, please tell me, so I will pick it up on the next iteration.
[^1]: And somehow working on Grafana 5, IIRC. The original variant did not use any time.sleep() calls at all, and exclusively relied on Marionette's Wait synchronization primitive. That's how it should be.
It still works! But I think grafana-studio.js is mostly not doing anything after Grafana changed from Angular to React. There is still some support for Angular in Grafana so there are no errors, tho this will be removed in a later Grafana release.
I have made several more changes, but they're mostly hardcoded and need to be done as options.
I will look at your feedback and make changes to the PR.
Also what about support for old grafana? Should old use-panel-events and similar code be deleted to remove things that don't work or don't do anything in current grafana?
Hi again,
I think grafana-studio.js is mostly not doing anything after Grafana changed from Angular to React. There is still some support for Angular in Grafana so there are no errors, tho this will be removed in a later Grafana release.
I don't quite follow how grafanimate could work properly if the code in grafana-studio.js does not work. Some of it may be optional, like styling the interface at runtime, but others, like opening and navigating to a dashboard, and driving the time range, are certainly not optional?
Also what about support for old grafana?
I don't think we need to be backwards-compatible. To be fair, we can run another maintenance release before bringing in breaking changes, so we can build upon that if there is demand.
Should old use-panel-events and similar code be deleted to remove things that don't work or don't do anything in current grafana?
I think it will be good to migrate the event handling to your proposal. Other things that don't work would also need to be modernized.
With kind regards, Andreas.
others, like opening and navigating to a dashboard, and driving the time range, is certainly not optional
Ok I glossed over those parts mainly looking at the styling. So I mean just the stuff that refers to elements and angular.
Also I will look at updating grafana-studio.js as suggested, but also it's kinda neat to keep the code mostly in python I think? But I understand your point why it's better in JS.
I would love to have most code in Python. But it is not a good idea to send JavaScript code from Python each time you want to invoke it, as it needs to be parsed and compiled each and every time. Better to load it into the browser in a regular way, using JavaScript, and invoke it using Python/Marionette.
In other words, grafana-studio.js is a minimal SDK supporting the Python code to be able to just call into it conveniently, nothing more.
Regarding grafana-studio.js. All CSS class names are now dynamic and will change when React components are updated. Discussed in grafana/grafana#71662
Thanks for the heads up, @intermittentnrg. We will need to find a different solution. Do you have any suggestions?
I came up with this selector for removing padding around panels:
$(".scrollbar-view > div").css("padding", "0");
Using the :has() pseudo-selector and similar trickery can also work, but I'm not sure it can work for all cases. My styling needs are simple tho.
The recommendation by Grafana is to use plugins? / adding files and rebuilding Grafana? Not really keen on this approach, I use their official docker image.
Anyway should we perhaps remove all broken css manipulation from grafana-studio.js? It can live on/die in git history.
The kiosk enabling doesn't work either but appending &kiosk=1 to query string works.
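For reference, appending the parameter can be done without string concatenation. This is only a sketch: `with_kiosk` is a hypothetical helper, not part of grafanimate, and the dashboard URL is made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_kiosk(url: str, value: str = "1") -> str:
    """Return `url` with `kiosk=<value>` set, keeping all other query parameters."""
    parts = urlsplit(url)
    # Drop any existing kiosk parameter so it is not duplicated.
    query = [
        (key, val)
        for key, val in parse_qsl(parts.query, keep_blank_values=True)
        if key != "kiosk"
    ]
    query.append(("kiosk", value))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical dashboard URL, for illustration only.
url = with_kiosk("http://localhost:3000/d/abc123/demo?orgId=1&from=now-6h&to=now")
print(url)
```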
Hi @intermittentnrg. We will be happy to accept any patches to modernize grafanimate, making it compatible with recent versions of Grafana. We haven't been able to keep up with maintenance, and we are about to migrate it to https://github.com/grafana-toolbox. Let us know if we should add you as a collaborator on this project while we are already on the refactoring, so we can share future maintenance, if you like that idea.
Thank you! I will still need your advice. I have used python on and off over the years, but don't usually use it.
I'm now experiencing skipped frames in my videos, not entirely sure why but I suspect waiting for state=Done is not enough.
I've just set up a pipeline in my local Jenkins, turns out rendering a 1-2 minute video takes over an hour. But it's possibly slower now as it's running on a Raspberry Pi 4b.
There are also some failures where Marionette/Firefox sometimes does not boot correctly in the container, unsure why.
Using scenarios-env.py that I created
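import os
from datetime import timedelta

from dateutil.parser import parse  # assumed source of parse()
from dateutil.rrule import HOURLY

# Import paths below are assumed from grafanimate's module layout.
from grafanimate.model import AnimationScenario, AnimationSequence, SequencingMode
from grafanimate.timeutil import RecurrenceInfo


def scenario():
    # All parameters are read from environment variables.
    return AnimationScenario(
        grafana_url=os.environ["GRAFANA_URL"],
        dashboard_uid=os.environ["DASHBOARD_UID"],
        sequences=[
            AnimationSequence(
                start=parse(os.environ["START"]),
                stop=parse(os.environ["STOP"]),
                recurrence=RecurrenceInfo(
                    frequency=HOURLY,
                    interval=int(os.environ["STEP_HOURS"]),
                    duration=timedelta(days=int(os.environ["WINDOW_DAYS"])),
                    # every=None
                ),
                # every=None,
                mode=SequencingMode.WINDOW,
            )
        ],
    )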
I've just set up a pipeline in my local Jenkins, turns out rendering a 1-2 minute video takes over an hour.
I guess it is because synchronization with Grafana is currently utterly broken, never has been stable, and we currently don't have a good solution. Or do we? Sorry if I lost track about any recent improvements from your pen, please educate me if I'm wrong.
Because synchronization is broken, rendering each frame will take so long because it will fully consume the timeout each time for each frame again. This makes usage unbearable.
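To make the cost concrete: if every frame waits out the full timeout, total time is simply frames times timeout. The numbers below are hypothetical and only illustrate the arithmetic; they are not measurements from this setup.

```python
def worst_case_render_seconds(frame_count: int, timeout_seconds: float) -> float:
    """Total wait time when every single frame consumes the full timeout."""
    return frame_count * timeout_seconds


frames = 1500   # hypothetical frame count for a short video
timeout = 3.0   # hypothetical per-frame timeout in seconds
total = worst_case_render_seconds(frames, timeout)
print(f"{total / 60:.0f} minutes")  # prints "75 minutes"
```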
If it works well on your workstation, but does not on your Raspberry Pi, it is yet another sign that this subsystem would need to be improved significantly. Until it is, please don't run it on a Pi.
I've just set up a pipeline in my local Jenkins, turns out rendering a 1-2 minute video takes over an hour.
I guess it is because synchronization with Grafana is currently utterly broken, never has been stable, and we currently don't have a good solution.
Another thing comes to mind: Didn't the code introduce manual exposure times some time ago? Anyway, all of that should probably not be discussed on behalf of this PR draft [^1]. a) Should it? b) Maybe GH-16 is more appropriate if it's the same thing you are observing?
[^1]: The thing is, when this PR will be merged or otherwise closed, this discussion, making up a part of improving synchronization matters, will be out-of-band to the other conversation, so things would become more fragmented.
I guess it is because synchronization with Grafana is currently utterly broken
Got it. Reading up on the conversation we had on this PR, I see that we may have stopped over at https://github.com/grafana-toolbox/grafanimate/pull/19#issuecomment-1763488227:
Do you think you may have the capacity to squeeze your code into that box, effectively replacing the ingredients of Grafana Studio's onDashboardRefresh? If you don't, please tell me, so I will pick it up on the next iteration.
I have these 2 checks.
panels = Object.values(window.wrappedJSObject.grafanaRuntime.getPanelData())
return panels && panels.every(function(o) {return o?.state=='Done'})
return $('[aria-label="Panel loading bar"]').length == 0
But they may only check whether the data request is complete, and not whether the canvas has been redrawn.
It mostly works! But I'm noticing some skipped frames when running on Pi, I have a kubernetes cluster with 4x Pi. And it's probably fine on my desktop PC. But if it works consistently on Pi then it must be fully solved right?
Could compare the screenshot image if it's different to previous? But probably not a nice way to do it.
Also had a lot of startup errors, timeout and JS errors. Just installed vnc server inside the docker image to see what's going on as logs and gecko.log weren't helpful...
Inspired by docker-selenium which has this super useful VNC feature: https://github.com/SeleniumHQ/docker-selenium?tab=readme-ov-file#using-a-vnc-client
Just comparing the screenshot contents would be very reliable and simple.. Not sure if I like the idea tho.
https://stackoverflow.com/questions/34669068/how-to-verify-that-two-images-are-exactly-identical
https://stackoverflow.com/questions/52736154/how-to-check-similarity-of-two-images-that-have-different-pixelization
https://stackoverflow.com/questions/23982960/fast-and-efficient-way-to-detect-if-two-images-are-visually-identical-in-python
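A minimal sketch of the "compare to previous screenshot" idea, using only the standard library. It hashes the raw screenshot bytes, so it only detects exact equality, which assumes the PNG encoder is deterministic. `FrameChangeDetector` is a hypothetical name and the byte strings are placeholders for real screenshot data.

```python
import hashlib
from typing import Optional


class FrameChangeDetector:
    """Remembers the previous screenshot's digest and reports whether a new one differs."""

    def __init__(self) -> None:
        self._last_digest: Optional[str] = None

    def changed(self, png_bytes: bytes) -> bool:
        # Hash the raw bytes; identical bytes give an identical digest.
        digest = hashlib.sha256(png_bytes).hexdigest()
        is_new = digest != self._last_digest
        self._last_digest = digest
        return is_new


detector = FrameChangeDetector()
first = detector.changed(b"frame-A")   # nothing to compare against yet
repeat = detector.changed(b"frame-A")  # same bytes as the previous frame
third = detector.changed(b"frame-B")   # different bytes
print(first, repeat, third)
```

One caveat with this approach: if the data genuinely does not change between two time steps, an identical screenshot is correct and not a missed redraw, so a check like this would need a retry limit rather than waiting indefinitely.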