exist icon indicating copy to clipboard operation
exist copied to clipboard

Strange performance difference between xmldiff:compare() and deep-equal()

Open xatapult opened this issue 4 years ago • 7 comments

Describe the bug We have tried to do XML compares using xmldiff:compare() and the XPath equivalent deep-equal().

For comparing two equal documents of ~200Kb xmldiff:compare() on my machine taks 60msecs, deep-equal() over 9 seconds!

Expected behavior The functions vary slightly in interface, but just comparing two documents should roughly take the same amount of time. And if not 9 seconds is a bit too much I think.

To Reproduce Unzip, import in eXist, execute test-equals.xql (I'm doing this from oXygen)

Context (please always complete the following information):

  • OS: W10
  • eXist-db version: 5.3
  • Java Version Amazon Corretto 1.8.0_202

Additional context

  • How is eXist-db installed? Jar
  • Any custom changes: various but none seem relevant (mostly external SQL database things)

xatapult avatar Sep 29 '21 10:09 xatapult

TEST.zip

xatapult avatar Sep 29 '21 10:09 xatapult

AFAIK for xmldiff:compare() two files are serialised and compared with an external tool; I suspect that deep-equal() uses a different mechanism and might compare more?

dizzzz avatar Oct 10 '21 09:10 dizzzz

I have recently played around with fn:deep-equal#2 and am wondering what it actually should do. It does not do well comparing two node-sets from my testing. I ended up serialising both elements and calling deep-equal on that.

line-o avatar Nov 17 '21 22:11 line-o

@dizzzz @adamretter I just heard that there are plans to ditch xmldiff:compare. I'd really hope for this issue to be looked into before this is effectuated.

PieterLamers avatar Feb 01 '22 22:02 PieterLamers

@PieterLamers I don't think we would intentionally remove it if there is not a good replacement. However, please remind me as we prepare eXist 7, so we don't remove it by accident as part of our tech debt cleanup

adamretter avatar Feb 01 '22 23:02 adamretter

@PieterLamers So we definitely are not removing xmldiff:compare - as we just added a new implementation of it here - https://github.com/eXist-db/exist/pull/4554

adamretter avatar Sep 09 '22 12:09 adamretter

@xatapult I reran your tests on my laptop:

  1. With the new implementation (#4554) of xmldiff:compare: 228ms
  2. With the new implementation (#4529) of fn:deep-equals: 12871ms

It looks like we would still need to optimise fn:deep-equals...

adamretter avatar Sep 09 '22 13:09 adamretter