nbdime
Memory consumption
I ran nbdiff on a relatively small file (~1.7 MB) and memory usage went up to ~3.5 GB.
That is somewhat surprising, given that the number of cells with differences is not large.
I wonder if it would make sense to do local diffs only for the non-exact cells? PS: I have not read much of the code, so this might not sound very smart 🙈
This library also looks like it could be quite fast at computing diffs:
https://github.com/google/diff-match-patch
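To make the "local diffs for non-exact cells" idea concrete, here is a minimal sketch using only Python's standard `difflib`. It is not how nbdime works internally; the function name `diff_cells` and the representation of a notebook as a plain list of cell source strings are assumptions made for illustration.

```python
import difflib


def diff_cells(old_cells, new_cells):
    """Hypothetical helper: align cells by exact source match,
    then run a line-level diff only on the cells that changed."""
    # Cell-level alignment: identical source strings are matched cheaply.
    matcher = difflib.SequenceMatcher(a=old_cells, b=new_cells, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # exact matches need no further work
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            # Same number of cells on both sides: pair them up and
            # compute a local text diff for each pair.
            for old, new in zip(old_cells[i1:i2], new_cells[j1:j2]):
                lines = list(
                    difflib.unified_diff(
                        old.splitlines(), new.splitlines(), lineterm=""
                    )
                )
                changes.append(("modified", lines))
        else:
            # Pure insertions, deletions, or uneven replacements are
            # reported as whole cells.
            changes.append((tag, old_cells[i1:i2], new_cells[j1:j2]))
    return changes


old = ["import os", "x = 1\ny = 2", "print(x)"]
new = ["import os", "x = 1\ny = 3", "print(x)"]
changes = diff_cells(old, new)
```

Only the middle cell is diffed line by line here; the two unchanged cells are skipped after the cell-level match.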
That sounds like a lot. Is this a notebook you are able to share? It could be very useful for profiling. Depending on where the issue is, likely improvements are:
- Finish the Myers algorithm for diffing instead of brute forcing it (#402).
- Tweak comparison operators for certain output types.
- Ensure no dangling refs prevent GC of old objects.
Note: The actual library used for doing text diffs is unlikely to affect this issue, but that should of course be considered as well.
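For anyone who wants to profile this locally, the following is a rough sketch of measuring peak Python-level allocations around a single call with the standard `tracemalloc` module. The helper name `peak_memory_mib` is hypothetical, and `tracemalloc` only sees allocations made through Python's allocator, so the number will be lower than the process's total memory use.

```python
import tracemalloc


def peak_memory_mib(func, *args, **kwargs):
    """Hypothetical helper: run func and return (result, peak MiB traced)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / (1024 * 1024)


# Stand-in workload so the sketch runs without nbdime installed.
result, peak = peak_memory_mib(lambda: [0] * 1_000_000)
print(f"peak traced memory: {peak:.1f} MiB")

# Assumed usage with nbdime (not executed here; file names are placeholders):
#   import nbformat, nbdime
#   a = nbformat.read("a.ipynb", as_version=4)
#   b = nbformat.read("b.ipynb", as_version=4)
#   diff, peak = peak_memory_mib(nbdime.diff_notebooks, a, b)
```

Comparing the peak for the whole diff against the peak for individual cells or outputs could help narrow down which of the points above matters most.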
Closing due to missing repro.
I have a file where nbdiff's memory use seems to grow without bound. To reproduce:
- Download https://github.com/afeld/python-public-policy/blob/129d5150e1796ecde2c947b4694c3430f388c8a1/lecture_3.ipynb
- Run
nbdiff lecture_3.ipynb lecture_3.ipynb
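To put a number on the growth, here is a small sketch that runs a command as a child process and reports its peak resident set size. It assumes a Unix-like system (the `resource` module is not available on Windows); the helper name `child_peak_rss` and the timeout value are choices made for this illustration.

```python
import resource
import subprocess
import sys


def child_peak_rss(cmd, timeout=60):
    """Hypothetical helper: run cmd and return the peak RSS of child processes.

    The unit of ru_maxrss is platform dependent (kilobytes on Linux,
    bytes on macOS), and the value is the maximum over all children
    this process has waited for so far.
    """
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,  # stop a run that keeps growing
        )
    except subprocess.TimeoutExpired:
        pass  # the child was killed and reaped; its usage is still counted
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


# Stand-in child that allocates about 50 MB, so the sketch runs anywhere.
rss = child_peak_rss([sys.executable, "-c", "x = bytearray(50_000_000)"])
print("peak child RSS:", rss)

# For the repro above, the call would be:
#   child_peak_rss(["nbdiff", "lecture_3.ipynb", "lecture_3.ipynb"])
```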