Wait for panels state=Done

Open intermittentnrg opened this issue 2 years ago • 19 comments

Alternative method to use-panel-events and exposure-time options. I think it still needs ~0.2 second exposure time. Also not tested with older versions of Grafana.

I have been using grafanimate with some customizations to post videos of my Grafana to X/Twitter https://x.com/IntermittentNRG/status/1712826595977674850

Submitted for your consideration.

intermittentnrg avatar Oct 15 '23 18:10 intermittentnrg

Hi @IntermittentNRG,

I would not have expected that this program is useful any longer, and I am very happy to hear that it apparently still works, now even better with your patch.

Thank you so much for submitting this improvement, I love it.

With kind regards, Andreas.

amotl avatar Oct 15 '23 19:10 amotl

Thoughts

Alternative method to use-panel-events and exposure-time options.

Without having validated your patch yet: if Grafana (now?) offers a getPanelData() method and a corresponding .state property for each panel, querying that is absolutely the right approach to find out whether data loading has finished.

While I believe it works the way your patch demonstrates, I am wondering whether we could hook it in at the place where the synchronization between the Grafana/JavaScript and Python domains happens, also getting rid of the busy/delayed Python loop, which currently polls the whole stack.

Details

At those spots, we staged an event-based synchronization mechanism, which, under the hood, also uses polling, but is based on Marionette's marionette_driver.wait.Wait primitive.

It does the "all data loaded" check within the JavaScript domain, and emits a synthesized all-data-received event, which toggles the hasAllData state, which the Python domain is monitoring.

https://github.com/panodata/grafanimate/blob/7b68228247a962272565dec4084a01e1112e0ffa/grafanimate/grafana-studio.js#L214-L215

https://github.com/panodata/grafanimate/blob/7b68228247a962272565dec4084a01e1112e0ffa/grafanimate/grafana-studio.js#L231-L291

https://github.com/panodata/grafanimate/blob/7b68228247a962272565dec4084a01e1112e0ffa/grafanimate/grafana-studio.js#L158-L161

https://github.com/panodata/grafanimate/blob/7b68228247a962272565dec4084a01e1112e0ffa/grafanimate/grafana.py#L101-L115
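For readers unfamiliar with the pattern described above, here is a minimal, standard-library-only sketch of a wait-until primitive in Python. It is not Marionette's Wait implementation and not grafanimate code: `wait_until`, `FakeStudio`, and the `has_all_data` attribute are hypothetical names, used only to illustrate polling a flag that the JavaScript side would toggle.

```python
import time


class FakeStudio:
    """Hypothetical stand-in for the browser-side state; not grafanimate code."""

    def __init__(self, ready_after_polls):
        self.polls = 0
        self.ready_after_polls = ready_after_polls

    @property
    def has_all_data(self):
        # Pretend the "all-data-received" event has fired after a few polls.
        self.polls += 1
        return self.polls >= self.ready_after_polls


def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll `condition` until it is truthy, or raise once `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(interval)


studio = FakeStudio(ready_after_polls=3)
result = wait_until(lambda: studio.has_all_data, timeout=1.0, interval=0.01)
```

The point of the pattern is that the wait ends as soon as the flag flips, so a fixed sleep is only ever paid in the timeout case.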

amotl avatar Oct 15 '23 19:10 amotl

Alternative method to use-panel-events and exposure-time options.

The current code does the "all data loaded" check within the JavaScript domain, and emits a synthesized all-data-received event.

I think if your code could fit there, and emits this event appropriately, we could get rid of the manual exposure timing in the Python domain completely, which is effectively a time.sleep.

https://github.com/panodata/grafanimate/blob/7b68228247a962272565dec4084a01e1112e0ffa/grafanimate/animations.py#L103-L105

By doing so, grafanimate will become both more robust and more efficient, as originally intended. [^1]

Do you think you may have the capacity to squeeze your code into that box, effectively replacing the ingredients of Grafana Studio's onDashboardRefresh? If you don't, please tell me, so I will pick it up on the next iteration.

[^1]: And somehow working on Grafana 5, IIRC. The original variant did not use any time.sleep() calls at all, and exclusively relied on Marionette's Wait synchronization primitive. That's how it should be.

amotl avatar Oct 15 '23 19:10 amotl

It still works! But I think grafana-studio.js is mostly not doing anything after Grafana changed from Angular to React. There is still some support for Angular in Grafana so there are no errors, tho this will be removed in a later Grafana release.

I have made several more changes, but they're mostly hardcoded and need to be done as options.

I will look at your feedback and make changes to the PR.

intermittentnrg avatar Oct 16 '23 15:10 intermittentnrg

Also what about support for old grafana? Should old use-panel-events and similar code be deleted to remove things that don't work or don't do anything in current grafana?

intermittentnrg avatar Oct 16 '23 17:10 intermittentnrg

Hi again,

I think grafana-studio.js is mostly not doing anything after Grafana changed from Angular to React. There is still some support for Angular in Grafana so there are no errors, tho this will be removed in a later Grafana release.

I don't completely understand how grafanimate would work properly if the code in grafana-studio.js does not work. Some of it may be optional, like styling the interface at runtime, but other parts, like opening and navigating to a dashboard and driving the time range, are certainly not optional?

Also what about support for old grafana?

I don't think we need to be backwards-compatible. To be fair, we can run another maintenance release before bringing in breaking changes, so we can build upon that if there is demand.

Should old use-panel-events and similar code be deleted to remove things that don't work or don't do anything in current grafana?

I think it will be good to migrate the event handling to your proposal. Other things that don't work would also need to be modernized.

With kind regards, Andreas.

amotl avatar Oct 16 '23 22:10 amotl

others, like opening and navigating to a dashboard, and driving the time range, is certainly not optional

Ok I glossed over those parts mainly looking at the styling. So I mean just the stuff that refers to elements and angular.

Also I will look at updating grafana-studio.js as suggested, but also it's kinda neat to keep the code mostly in python I think? But I understand your point why it's better in JS.

intermittentnrg avatar Oct 16 '23 23:10 intermittentnrg

I would love to have most code in Python. But it is not a good idea to send JavaScript code from Python each time you want to invoke it, as it needs to be parsed and compiled each and every time. Better to load it into the browser in a regular way, using JavaScript, and invoke it using Python/Marionette.

In other words, grafana-studio.js is a minimal SDK supporting the Python code to be able to just call into it conveniently, nothing more.

amotl avatar Oct 16 '23 23:10 amotl

Regarding grafana-studio.js. All CSS class names are now dynamic and will change when React components are updated. Discussed in grafana/grafana#71662

intermittentnrg avatar Apr 20 '24 22:04 intermittentnrg

Thanks for the heads up, @intermittentnrg. We will need to find a different solution. Do you have any suggestions?

amotl avatar Apr 21 '24 02:04 amotl

I came up with this selector for removing padding around panels:

$(".scrollbar-view > div").css("padding", "0");

Using the :has() pseudo-class and similar trickery can also work, but I'm not sure it works for all cases. My styling needs are simple tho.

Grafana's recommendation seems to be to use plugins, or to add files and rebuild Grafana? I'm not really keen on this approach; I use their official Docker image.

Anyway should we perhaps remove all broken css manipulation from grafana-studio.js? It can live on/die in git history.

The kiosk enabling doesn't work either but appending &kiosk=1 to query string works.
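As a small illustration of the query-string approach mentioned above, here is a sketch that appends `kiosk=1` to a dashboard URL while keeping any existing parameters. The helper name `with_kiosk` and the example URL are hypothetical; only the `kiosk=1` parameter itself comes from the comment above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_kiosk(url):
    """Return `url` with `kiosk=1` in its query string, keeping other parameters."""
    parts = urlsplit(url)
    # Drop any existing kiosk parameter so it is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "kiosk"
    ]
    query.append(("kiosk", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical dashboard URL, for illustration only.
url = with_kiosk("http://localhost:3000/d/abc123/demo?orgId=1&from=now-6h&to=now")
```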

intermittentnrg avatar Apr 21 '24 12:04 intermittentnrg

Hi @intermittentnrg. We will be happy to accept any patches to modernize grafanimate, making it compatible with recent versions of Grafana. We haven't been able to keep up with maintenance, and we are about to migrate it to https://github.com/grafana-toolbox. Let us know if we should add you as a collaborator on this project while we are already on the refactoring, so we can share future maintenance, if you like that idea.

amotl avatar Apr 21 '24 15:04 amotl

Thank you! I will still need your advice. I have used python on and off over the years, but don't usually use it.

I'm now experiencing skipped frames in my videos, not entirely sure why but I suspect waiting for state=Done is not enough.

I've just set up a pipeline in my local Jenkins; it turns out rendering a 1-2 minute video takes over an hour. But it's possibly slower now as it's running on a Raspberry Pi 4b. (Screenshot: grafanimate job in Jenkins.) Some failures come from Marionette/Firefox sometimes not booting correctly in the container; unsure why.

Using scenarios-env.py that I created

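import os
from datetime import timedelta

from dateutil.parser import parse  # assumed: `parse` is dateutil's parser
from dateutil.rrule import HOURLY

# Import paths below are assumed; adjust them to your grafanimate version.
from grafanimate.model import AnimationScenario, AnimationSequence, SequencingMode
from grafanimate.timeutil import RecurrenceInfo


def scenario():
    # All parameters come from environment variables set by the Jenkins job.
    return AnimationScenario(
        grafana_url=os.environ["GRAFANA_URL"],
        dashboard_uid=os.environ["DASHBOARD_UID"],
        sequences=[
            AnimationSequence(
                start=parse(os.environ["START"]),
                stop=parse(os.environ["STOP"]),
                recurrence=RecurrenceInfo(
                    frequency=HOURLY,
                    interval=int(os.environ["STEP_HOURS"]),
                    duration=timedelta(days=int(os.environ["WINDOW_DAYS"])),
                    #every=None
                ),
                #every=None,
                mode=SequencingMode.WINDOW,
            )
        ],
    )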

intermittentnrg avatar Apr 21 '24 16:04 intermittentnrg

I've just setup a pipeline in my local Jenkins, turns out rendering a 1-2 minute video takes over an hour.

I guess it is because synchronization with Grafana is currently utterly broken, never has been stable, and we currently don't have a good solution. Or do we? Sorry if I lost track of any recent improvements from your pen, please educate me if I'm wrong.

Because synchronization is broken, each frame waits for the full timeout before it is captured, which is why rendering takes so long. This makes usage unbearable.
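To make the cost concrete, here is a back-of-the-envelope calculation. All numbers (the 60-day range, hourly step, 24 fps, and a 3-second per-frame timeout) are hypothetical and chosen only to show how a short video can take over an hour when every frame waits out the full timeout.

```python
from datetime import datetime, timedelta

# All values below are hypothetical, for illustration only.
start = datetime(2024, 1, 1)
stop = start + timedelta(days=60)
step = timedelta(hours=1)        # one frame per hour of dashboard time
frames_per_second = 24           # assumed video frame rate
timeout_seconds = 3.0            # assumed per-frame timeout, fully consumed

frames = (stop - start) // step                     # number of frames to render
video_seconds = frames / frames_per_second          # length of the resulting video
render_minutes = frames * timeout_seconds / 60      # wall-clock time spent waiting

print(f"{frames} frames -> {video_seconds:.0f} s of video, "
      f"{render_minutes:.0f} min spent waiting on timeouts")
```

Under these assumptions, the waiting time scales linearly with both the frame count and the timeout, so working synchronization that returns early would cut most of it.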

If it works well on your workstation, but does not on your Raspberry Pi, it is yet another sign that this subsystem would need to be improved significantly. Until it is, please don't run it on a Pi.

amotl avatar Apr 21 '24 21:04 amotl

I've just setup a pipeline in my local Jenkins, turns out rendering a 1-2 minute video takes over an hour.

I guess it is because synchronization with Grafana is currently utterly broken, never has been stable, and we currently don't have a good solution.

Another thing comes to mind: Didn't the code introduce manual exposure times some time ago? Anyway, all of that should probably not be discussed on this draft PR [^1]. a) Should it? b) Maybe GH-16 is more appropriate if it's the same thing you are observing?

[^1]: The thing is, when this PR will be merged or otherwise closed, this discussion, making up a part of improving synchronization matters, will be out-of-band to the other conversation, so things would become more fragmented.

amotl avatar Apr 21 '24 22:04 amotl

I guess it is because synchronization with Grafana is currently utterly broken

Got it. Reading up on the conversation we had on this PR, I see that we may have left off at https://github.com/grafana-toolbox/grafanimate/pull/19#issuecomment-1763488227:

Do you think you may have the capacity to squeeze your code into that box, effectively replacing the ingredients of Grafana Studio's onDashboardRefresh? If you don't, please tell me, so I will pick it up on the next iteration.

amotl avatar Apr 21 '24 22:04 amotl

I have these 2 checks.

// Check 1: every panel reports state "Done".
panels = Object.values(window.wrappedJSObject.grafanaRuntime.getPanelData())
return panels && panels.every(function(o) {return o?.state=='Done'})

// Check 2: no panel is showing a loading bar.
return $('[aria-label="Panel loading bar"]').length == 0

But they may only check whether the data request is complete, not whether the canvas has been redrawn.
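The logic of the first check can be sketched in Python to make one edge case visible: `every` on an empty array returns true in JavaScript, just as `all([])` does in Python, so the check might pass before any panel has registered. The data shape below (a mapping of panel id to an object with a `state` field) is an assumption inferred from the snippet above, and `all_panels_done` is a hypothetical helper name.

```python
def all_panels_done(panel_data):
    """True only if at least one panel exists and every panel reports state 'Done'."""
    panels = list(panel_data.values())
    # Guard against the vacuous case: all() of an empty sequence is True.
    return bool(panels) and all(
        isinstance(panel, dict) and panel.get("state") == "Done"
        for panel in panels
    )


# Hypothetical sample data mirroring the assumed getPanelData() shape.
loading = {"1": {"state": "Done"}, "2": {"state": "Loading"}}
finished = {"1": {"state": "Done"}, "2": {"state": "Done"}}
```

Whether an empty result actually occurs in practice has not been verified here; the guard merely rules it out.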

It mostly works! But I'm noticing some skipped frames when running on Pi, I have a kubernetes cluster with 4x Pi. And it's probably fine on my desktop PC. But if it works consistently on Pi then it must be fully solved right?

Could compare the screenshot image if it's different to previous? But probably not a nice way to do it.

intermittentnrg avatar Apr 21 '24 23:04 intermittentnrg

Also had a lot of startup errors, timeout and JS errors. Just installed vnc server inside the docker image to see what's going on as logs and gecko.log weren't helpful...

Inspired by docker-selenium, which has this super useful VNC feature: https://github.com/SeleniumHQ/docker-selenium?tab=readme-ov-file#using-a-vnc-client

intermittentnrg avatar Apr 22 '24 00:04 intermittentnrg

Just comparing the screenshot contents would be very reliable and simple. Not sure if I like the idea tho.

https://stackoverflow.com/questions/34669068/how-to-verify-that-two-images-are-exactly-identical

https://stackoverflow.com/questions/52736154/how-to-check-similarity-of-two-images-that-have-different-pixelization

https://stackoverflow.com/questions/23982960/fast-and-efficient-way-to-detect-if-two-images-are-visually-identical-in-python
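A minimal sketch of the exact-match variant of this idea, using only the standard library: hash each screenshot's bytes and compare with the previous frame's digest. `FrameChangeDetector` is a hypothetical name, and the byte strings stand in for real PNG data. Note the assumptions: this only detects byte-identical images, and two consecutive frames can legitimately be identical when the data did not change.

```python
import hashlib


class FrameChangeDetector:
    """Remember the previous screenshot's digest and report whether a new one differs."""

    def __init__(self):
        self._last_digest = None

    def changed(self, png_bytes):
        # Compare SHA-256 digests instead of keeping the full previous image around.
        digest = hashlib.sha256(png_bytes).hexdigest()
        is_new = digest != self._last_digest
        self._last_digest = digest
        return is_new


detector = FrameChangeDetector()
results = [
    detector.changed(b"frame-1"),  # first frame: always counts as changed
    detector.changed(b"frame-1"),  # identical bytes: possibly a stale capture
    detector.changed(b"frame-2"),  # different bytes: the canvas was redrawn
]
```

A stale result could be used as a hint to wait a little longer and capture again, rather than as proof that rendering failed.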

intermittentnrg avatar Apr 22 '24 16:04 intermittentnrg