Performance degradation: tens of thousands of rows become unusable

Open Boom1597 opened this issue 4 years ago • 6 comments

I ran into the same problem as issue #83 when comparing more than 40,000 rows of data. Is this a limitation of Mergely, or are there ways to work around it?

Boom1597 avatar Jan 28 '21 07:01 Boom1597

@Boom1597, performance can vary depending on the number of lines, the number of changes within the lines, and even the CPU/memory available to the browser. There are a few ways to improve performance, but each degrades the experience a bit. You can check out the docs and search for "performance"; the options are: viewport, sidebar, and lcs. If setting those options still does not give satisfactory performance, then maybe post an example.

wickedest avatar Jan 28 '21 12:01 wickedest
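For readers following along, the three options named above could be collected roughly as below. This is only a sketch: the option names `viewport`, `sidebar`, and `lcs` come from the comment, while the `MergelyPerformanceOptions` type, the `performanceOptions` helper, and the chosen values (viewport on, sidebar and LCS off) are assumptions for illustration. How the object is passed to the editor depends on the Mergely version, so check the docs for yours.

```typescript
// Hypothetical type: only the three performance-related options named in the thread.
interface MergelyPerformanceOptions {
  viewport: boolean;
  sidebar: boolean;
  lcs: boolean;
}

// Hypothetical helper returning settings that trade features for speed.
// The values are assumptions; confirm the meaning of each flag in the Mergely docs.
function performanceOptions(): MergelyPerformanceOptions {
  return {
    viewport: true, // assumed: only mark up changes in the visible area
    sidebar: false, // assumed: skip drawing the sidebar change markers
    lcs: false,     // assumed: skip the within-line (word-level) LCS pass
  };
}

// The returned object would be merged into whatever options are handed to the
// editor at construction time; that call is version-specific and not shown here.
const opts = performanceOptions();
console.log(opts);
```

As the later replies show, these settings only affect rendering and within-line work, so they do not help when the time goes into the line-level diff itself.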

@wickedest, thanks for answering. I tried the options you mentioned, but they had no effect. Here is an example: https://codepen.io/boom1597/pen/gOLOMyK or you can use my file: https://gist.githubusercontent.com/Boom1597/d19a5cb91d11ae5f7c69bbaa5344bad0/raw/0e548bd3a028fc9003cd34ae6c33f97e3b5cc580/gistfile1.txt What puzzles me is that the text appears on the interface before the diff result has actually finished loading. That will confuse my users, who will think it is a bug. Thanks again for your answer.

Boom1597 avatar Jan 29 '21 09:01 Boom1597

@Boom1597, thanks for taking the time to create the gist and the codepen. I can reproduce your issue. Tweaking those options had no effect because most of the time is spent in the diff algorithm itself. The input file is 1.1 MB and is being compared against a file that has nothing in common with it, so the algorithm is doing the best it can to find the longest common subsequence (LCS). I accept that it's not performing well in this instance, especially when GNU diff takes a fraction of a second. It's been a long time (9 years) since I coded the algorithm. I think there are some performance improvements that can be made, but as you can imagine it's fairly technical and difficult to optimize. I'll keep it on my TODO list to research further and find ways to improve the performance.

wickedest avatar Jan 31 '21 11:01 wickedest
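To illustrate why inputs with nothing in common are the hard case, here is a minimal sketch of the greedy forward pass of Myers' O(ND) diff algorithm, which computes the length D of the shortest edit script between two line arrays. It is not Mergely's implementation; the function name `shortestEditDistance` is an assumption for this example. The outer loop runs once per edit, so when the two files share no lines D grows to the combined line count and the work becomes roughly quadratic.

```typescript
// Sketch of Myers' greedy shortest-edit-script search (iterative, forward only).
// Returns D, the minimum number of line insertions plus deletions turning a into b.
function shortestEditDistance(a: string[], b: string[]): number {
  const n = a.length;
  const m = b.length;
  const max = n + m;
  const offset = max; // shifts diagonal k in [-max, max] to a non-negative index
  // v[k + offset] holds the furthest x reached so far on diagonal k (k = x - y).
  const v = new Int32Array(2 * max + 2);

  for (let d = 0; d <= max; d++) {
    for (let k = -d; k <= d; k += 2) {
      let x: number;
      if (k === -d || (k !== d && v[k - 1 + offset] < v[k + 1 + offset])) {
        x = v[k + 1 + offset]; // step down: take a line from b (insertion)
      } else {
        x = v[k - 1 + offset] + 1; // step right: drop a line from a (deletion)
      }
      let y = x - k;
      // Follow the "snake": consume matching lines at no edit cost.
      while (x < n && y < m && a[x] === b[y]) {
        x++;
        y++;
      }
      v[k + offset] = x;
      if (x >= n && y >= m) {
        return d; // reached the end of both inputs with d edits
      }
    }
  }
  return max; // not reached: d = n + m always suffices
}

// Identical inputs finish in the first iteration; disjoint inputs need n + m edits.
console.log(shortestEditDistance(["a", "b"], ["a", "b"]));
console.log(shortestEditDistance(["a", "b", "c"], ["x", "y"]));
```

Production diff tools typically layer heuristics on top of this basic search to cut it short on very dissimilar inputs; none of those are shown here.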

Thanks for your attention.

Boom1597 avatar Feb 01 '21 02:02 Boom1597

Is there any update on the performance optimisation front? I am also looking for a fix.

kmanikandanmca2008 avatar Apr 06 '22 02:04 kmanikandanmca2008

@kmanikandanmca2008, I started looking into it a few weeks ago. The algorithm I'm using for shortest middle snake is recursive and doesn't perform well. I'm researching alternatives.

wickedest avatar Apr 06 '22 06:04 wickedest

:tada: This issue has been resolved in version 5.0.0-alpha.1 :tada:

The release is available on:

Your semantic-release bot :package::rocket:

github-actions[bot] avatar Apr 22 '23 15:04 github-actions[bot]

:tada: This issue has been resolved in version 5.0.0 :tada:

The release is available on:

Your semantic-release bot :package::rocket:

github-actions[bot] avatar Apr 23 '23 15:04 github-actions[bot]