json-diff

Doesn't work with large JSON files

Open colegleason opened this issue 10 years ago • 8 comments

Trying to compare two files of size 701MB results in

FATAL ERROR: CALL_AND_RETRY_0 Allocation failed - process out of memory

colegleason avatar Jul 16 '14 22:07 colegleason

Also, it's really slow for large files.

wanderer avatar Dec 15 '14 22:12 wanderer

Patches are welcome! ;-)

andreyvit avatar Dec 16 '14 02:12 andreyvit

My file is only 854K, but it seems to be running forever. I'm going to leave it overnight to see if it can finish...

miranda-zhang avatar Oct 27 '21 17:10 miranda-zhang

I think it works better in terminal mode than the JS implementation does in the browser, but I don't have larger files to confirm.

mistertest avatar Apr 23 '22 03:04 mistertest

Any news regarding this? I'm facing the same issue right now with 2 files of 20-30MB each (almost 1M lines).

idanElitzur avatar Jun 12 '23 14:06 idanElitzur

I don't get an out-of-memory exception with my 60MB JSON files. It just never finishes, and my job agent kills it after an hour. I left it running longer than that locally, but after most of a day I called it quits. Performance seems to degrade exponentially with file size, since my 7MB files complete "nicely" after about 150 seconds by comparison. It would be nice if this scaled linearly with file size.

philipborg avatar Jul 10 '23 13:07 philipborg

@philipborg I had the same issue. It was working before I made changes to my tsconfig.

It just hangs.

@andreyvit I forget exactly what changes I made, but it was something like changing my tsconfig to target: "esnext".

mrmianbao avatar Aug 31 '23 04:08 mrmianbao

I have found a workaround for my needs that I thought I'd share. It seems that comparing large arrays of objects is the issue. Replacing the array with an object, so that it only compares entries with matching keys, solved it for me. Now it runs decently fast even with 60MiB files. My objects themselves contain arrays of strings, which doesn't seem to cause any problems.

So instead of

[
    {
        "field1": 2,
        "field2": "foo",
        [...]
    },
    {
        "field1": 1337,
        "field2": "bar",
        [...]
    }
]

I used something more akin to

{
    "MyIdentifier1": {
        "field1": 2,
        "field2": "foo",
        [...]
    },
    "MyIdentifier2": {
        "field1": 1337,
        "field2": "bar",
        [...]
    }
}

This only works if you can find a unique identifier in the data to use as the object key. In my case I had to build a composite key and escape my separators. It worsens the comparison and requires you to mutate the JSON before comparing, so it's just a workaround and not a solution to the issue.

Essentially, it restricts the search scope: each object is now only compared with the object that has the matching key.
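
For anyone who wants to try the same thing, here is a minimal sketch of the preprocessing step in Python. It is not part of json-diff; the function names, the `|` separator, and the backslash escape character are all hypothetical choices for illustration, and it assumes every record contains the fields you pick for the key.

```python
import json

# Hypothetical choices for this sketch; any separator/escape pair works
# as long as both input files are converted the same way.
SEP = "|"
ESC = "\\"


def escape_part(value):
    """Escape the escape character first, then the separator, so that
    joined keys stay unambiguous. Note that str() maps 1 and "1" to the
    same text, so mixed-type key fields can still collide."""
    text = str(value)
    return text.replace(ESC, ESC + ESC).replace(SEP, ESC + SEP)


def composite_key(record, key_fields):
    """Build one string key from several fields of a record."""
    return SEP.join(escape_part(record[field]) for field in key_fields)


def array_to_keyed_object(records, key_fields):
    """Turn a list of objects into a dict keyed by the composite key."""
    keyed = {}
    for record in records:
        key = composite_key(record, key_fields)
        if key in keyed:
            # The workaround only holds if the key is truly unique.
            raise ValueError(f"duplicate key {key!r}; choose other key fields")
        keyed[key] = record
    return keyed


if __name__ == "__main__":
    records = [
        {"field1": 2, "field2": "foo"},
        {"field1": 1337, "field2": "bar"},
    ]
    keyed = array_to_keyed_object(records, ["field1", "field2"])
    print(json.dumps(keyed, indent=4))
```

Run the same conversion on both files, write the results out, and compare those instead of the originals. Failing loudly on duplicate keys is deliberate: silently overwriting a record would hide real differences from the diff.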

philipborg avatar Nov 13 '23 15:11 philipborg