
Poor dashboard performance scaling

Open JustAnotherArchivist opened this issue 5 years ago • 2 comments

Following #367 and #383, the control node can now easily handle a couple hundred parallel jobs. However, the dashboard JS is unable to keep up with that. Earlier today, when we were reaching our current pipeline capacity, the dashboard ate nearly 100% CPU (one core) on a reasonably modern machine of mine. An older machine of mine hasn't been able to run the dashboard at all for months. I suspect that #378 is also a performance issue. The beta dashboard seems to be worse in this respect than the standard one.

The dashboard needs some optimisation to stay usable on slower machines and as we scale the whole system up further in the future.

JustAnotherArchivist, May 31 '19 02:05

@JustAnotherArchivist is this fixed for you?

ivan, Jun 30 '23 04:06

@ivan It's definitely much better thanks to #558, but I wouldn't consider this solved. If we scale up (there are still at least three big pipelines out of operation), we will run into similar problems again. I suspect the only way to really fix it is to reduce the amount of data the dashboard client has to process in the first place. This would require a change in the entire WebSocket communication to introduce a pubsub scheme where the client would subscribe to visible jobs and only receive regular stats updates for the rest.

JustAnotherArchivist, Jun 30 '23 17:06
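
For illustration only, the pubsub idea from the last comment could be sketched roughly as below. This is not ArchiveBot's actual code or message format: `Client`, `Broker`, the `log`/`stats` message types and the per-job line count used as "stats" are all hypothetical names and choices made for the example. The sketch only shows the routing rule: full log lines go to clients subscribed to a job, and everyone else gets a small periodic summary instead.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Client:
    """Hypothetical stand-in for one dashboard WebSocket connection."""
    name: str
    subscribed: set = field(default_factory=set)  # job IDs the client has visible
    outbox: list = field(default_factory=list)    # messages that would be sent


class Broker:
    """Routes full log lines to subscribers and only summaries to the rest."""

    def __init__(self):
        self.clients = []
        # Lines seen per job since the last stats flush (assumed stat).
        self.line_counts = defaultdict(int)

    def connect(self, client):
        self.clients.append(client)

    def subscribe(self, client, job_id):
        client.subscribed.add(job_id)

    def unsubscribe(self, client, job_id):
        client.subscribed.discard(job_id)

    def publish(self, job_id, line):
        """Forward a log line only to clients subscribed to this job."""
        self.line_counts[job_id] += 1
        for client in self.clients:
            if job_id in client.subscribed:
                client.outbox.append({"type": "log", "job": job_id, "line": line})

    def flush_stats(self):
        """Send each client one summary covering the jobs it is not watching."""
        for client in self.clients:
            summary = {
                job: count
                for job, count in self.line_counts.items()
                if job not in client.subscribed
            }
            if summary:
                client.outbox.append({"type": "stats", "lines": summary})
        self.line_counts.clear()
```

With this shape, the per-client message volume depends on the number of visible jobs plus one summary per flush interval, rather than on the total log output of all running jobs. In a real deployment, `flush_stats` would be called from a timer and `outbox` would be replaced by writes to the WebSocket.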