really slow rendering in browser
Original author: [email protected] (June 11, 2012 12:24:49)
What steps will reproduce the problem?
- Load attached xml data (same as in issue #137).
- Even the initial display is really slow.
- Click 'next>>' in the pager.
What is the expected output? What do you see instead? It takes about 25 seconds on 4 GB Xeon E31245 to render 10 records (about 80 rows). A second or two for rendering would be acceptable. See attached screenshot.
What version of Google Refine are you using? Tried with 2.5 and SVN trunk checkout.
What operating system and browser are you using? Win7, Chrome 19 and Opera 11, both browsers go haywire.
In issue #137 @tfmorris noticed that this could be caused by a large number of null cells. Can someone verify this?
Original issue: http://code.google.com/p/google-refine/issues/detail?id=583
From [email protected] on June 11, 2012 12:41:10: Excuse the typo - it takes 15 seconds on the screenshot, but I noticed also times around 30 seconds in similar cases - probably dependent on the real number of rows in each record.
From tfmorris on September 08, 2012 02:58:29: I've tracked this down to the column width resizing code in data-table-view.js. DataTableView._adjustDataTables() is causing an enormous amount of time to be spent in jQuery's cssHooks.get() method, but I'm not sure what the fix is yet.
Quitting for now, so recording for posterity...
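For anyone picking this up: a toy model of why interleaving style reads and writes in a resize loop can get expensive. This is not the actual code of DataTableView._adjustDataTables(); `FakeLayout`, `resizeInterleaved`, `resizeBatched` and `countReflows` are made-up names, and the model assumes (as browsers generally do) that reading a computed dimension after a style write forces a fresh layout pass.

```javascript
// Toy stand-in for a browser layout engine (hypothetical, not a real DOM API).
class FakeLayout {
  constructor() {
    this.dirty = true;  // a freshly built table has not been laid out yet
    this.reflows = 0;   // number of forced layout passes
  }
  // Reading a dimension forces a reflow if styles changed since the last one.
  readWidth(cell) {
    if (this.dirty) {
      this.reflows++;
      this.dirty = false;
    }
    return cell.width;
  }
  // Writing a style only marks the layout as stale.
  writeWidth(cell, width) {
    cell.width = width;
    this.dirty = true;
  }
}

// Read and write column by column: every iteration invalidates the layout
// that the next iteration then has to recompute.
function resizeInterleaved(layout, headers, cells) {
  for (let i = 0; i < cells.length; i++) {
    const w = Math.max(layout.readWidth(headers[i]), layout.readWidth(cells[i]));
    layout.writeWidth(headers[i], w);
    layout.writeWidth(cells[i], w);
  }
}

// Do all the reads first, then all the writes.
function resizeBatched(layout, headers, cells) {
  const widths = cells.map((cell, i) =>
    Math.max(layout.readWidth(headers[i]), layout.readWidth(cell)));
  widths.forEach((w, i) => {
    layout.writeWidth(headers[i], w);
    layout.writeWidth(cells[i], w);
  });
}

// Runs one strategy on n columns and reports how many reflows it forced.
function countReflows(resizeFn, n) {
  const layout = new FakeLayout();
  const headers = Array.from({ length: n }, () => ({ width: 120 }));
  const cells = Array.from({ length: n }, () => ({ width: 90 }));
  resizeFn(layout, headers, cells);
  return layout.reflows;
}

console.log(countReflows(resizeInterleaved, 80)); // 80
console.log(countReflows(resizeBatched, 80));     // 1
```

In this model both versions end up with the same column widths; only the number of forced layout passes differs. Whether the real resizing code follows the interleaved pattern still needs to be confirmed with a profiler.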
XML rendering in the browser is still very, very, very slow, even with an XML file of very average size and complexity. For now I just go through sites like this one to get a CSV that I can then import into Open Refine.

(Open Refine 2.7, Windows 10, Chrome)
File used for the screencast is in attachment.
@ettorerizza Yeah, we know... The Jackson library does fine parsing it... the problem is that the Record row rendering is slow. I'll mark this as higher priority to look into and fix eventually. Thanks, Ettore!
Removing import and XML tags as this is not actually specific to any importer: OpenRefine is just slow when displaying many columns. This is mostly due to the way the DOM is updated: the current code triggers a lot of reflows, which are very expensive. They come from the bad practice of computing the dimensions of many elements in JavaScript. We should instead:
- only use CSS for layout - the browser picks element dimensions itself;
- use culling (do not create DOM elements for columns that are not visible).
This might be best implemented by migrating to a reactive UI framework. It could be done for the data grid alone with Vue, which can be adopted selectively.
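To make the culling idea concrete, here is a minimal framework-free sketch of working out which columns overlap the viewport. The names `visibleColumnRange` and `cullRow`, the pixel widths and the `overscan` margin are all assumptions for illustration; nothing like this exists in OpenRefine today.

```javascript
// Hypothetical helper: returns the first and last column indexes that overlap
// the horizontal viewport, widened by `overscan` columns on each side so that
// small scrolls do not immediately reveal missing cells.
function visibleColumnRange(columnWidths, scrollLeft, viewportWidth, overscan = 1) {
  let first = -1;
  let last = -1;
  let left = 0; // left edge of the current column, in pixels
  for (let i = 0; i < columnWidths.length; i++) {
    const right = left + columnWidths[i];
    const visible = right > scrollLeft && left < scrollLeft + viewportWidth;
    if (visible) {
      if (first === -1) first = i;
      last = i;
    }
    left = right;
  }
  if (first === -1) return null; // nothing overlaps the viewport
  return {
    first: Math.max(0, first - overscan),
    last: Math.min(columnWidths.length - 1, last + overscan),
  };
}

// Keeps only the cells of one row that fall inside the visible range;
// only these would get DOM elements.
function cullRow(rowCells, range) {
  if (range === null) return [];
  return rowCells.slice(range.first, range.last + 1);
}

const widths = new Array(200).fill(100); // 200 columns, 100px each
const range = visibleColumnRange(widths, 1000, 800);
console.log(range); // { first: 9, last: 18 }

const row = widths.map((_, i) => "cell " + i);
console.log(cullRow(row, range).length); // 10
```

With 200 columns and an 800px viewport, only 10 cells per row would be created instead of 200. A real implementation would also need a spacer on each side to keep the scrollbar honest, and column widths that are known without measuring the DOM.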
I put some of my previous research on Data grids into this wiki page: Research on data grids
@thadguidry re data grid libraries - I think we need to just check that we solve the right problem - a lot of the focus for those libraries is providing tools to manipulate the tabulated data in the browser but I don’t think that’s specifically what we need at least in the current setup?
@ostephens There certainly could be a quick fix applied via jQuery and JavaScript by you or someone else on this issue. I'm always looking at the bigger picture; the one that David and I never got to finish.
As we continue to get more and more contributors, it's going to be important that we have better project planning in the form of guides, strategy, and documented use cases that we can handle better.
Slowly throwing one of them together in this OpenRefine UI - High Level Design document
@ostephens @wetneb re "why is Thad always changing the focus on issues sometimes?!" ;-) As engineers, you hate losing focus, I know. I felt the problem myself yesterday, which is why I started the document. Part of the problem is that we don't have integrated project planning in GitHub (I'm used to enterprise Jira). So instead I'm going to do much more planning outside of GitHub issues, add links to issues inside those documents, and use the dev mailing list more. On Schema.org we used a few master issues with Tasks to link to issues (our pseudo-epics), but that was confusing, so we'll avoid that here and keep our issues clean and workable for engineers. Thanks for listening.
@ostephens: I’m curious: is it easy/possible to integrate Jira with GitHub?
Regards, Antoine