Memory consumption

Open lc0 opened this issue 6 years ago • 5 comments

I ran nbdiff on a relatively small file (~1.7 MB) and memory usage went up to ~3.5 GB.

That is kind of surprising, given that the number of cells that have differences is not huge.

I wonder if it would make sense to do some local diffs for non-exact cells? PS: I did not read much of the code, so this might not sound very smart 🙈

lc0 avatar Apr 27 '19 21:04 lc0

This library looks like it could be quite fast at computing diffs:

https://github.com/google/diff-match-patch

lc0 avatar Apr 27 '19 21:04 lc0

That sounds like a lot. Is this a notebook you are able to share? It could be very useful for profiling. Depending on where the issue is, likely improvements are:

  • Finish the Myers algorithm for diffing instead of brute forcing it (#402).
  • Tweak comparison operators for certain output types.
  • Ensure no dangling refs prevent GC of old objects.
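
To illustrate why the first point could matter for memory, here is a minimal sketch comparing a full dynamic-programming LCS table with the greedy Myers search for the edit distance. It is not nbdime's implementation: the function names `lcs_table_distance` and `myers_distance` are hypothetical, and treating "brute force" as a full table is an assumption made only for this example.

```python
def lcs_table_distance(a, b):
    """Edit distance (insertions + deletions) via a full LCS table.

    Keeps (len(a) + 1) * (len(b) + 1) integers in memory at once.
    """
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    lcs = table[-1][-1]
    return len(a) + len(b) - 2 * lcs


def myers_distance(a, b):
    """Edit distance (insertions + deletions) via Myers' greedy search.

    Only the furthest-reaching x per diagonal is stored, so memory grows
    with the number of differences rather than with len(a) * len(b).
    """
    n, m = len(a), len(b)
    frontier = {1: 0}  # diagonal k -> furthest x reached on that diagonal
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            # Step down from diagonal k + 1, or right from diagonal k - 1.
            if k == -d or (k != d and frontier[k - 1] < frontier[k + 1]):
                x = frontier[k + 1]
            else:
                x = frontier[k - 1] + 1
            y = x - k
            # Follow matching elements along the diagonal for free.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            frontier[k] = x
            if x >= n and y >= m:
                return d
    return n + m


if __name__ == "__main__":
    old = ["cell %d" % i for i in range(200)]
    new = old[:50] + ["changed"] + old[51:]
    print(lcs_table_distance(old, new), myers_distance(old, new))
```

For two 200-element sequences that differ in one place, the table version allocates about 40,000 entries, while the Myers frontier holds only a handful. This sketch returns only the distance; recovering the actual edit script needs extra bookkeeping.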

vidartf avatar Apr 28 '19 14:04 vidartf

Note: The actual library used for doing text diffs is unlikely to affect this issue, but that should of course be considered as well.

vidartf avatar Apr 28 '19 14:04 vidartf

Closing due to missing repro.

vidartf avatar Sep 28 '20 11:09 vidartf

I have a file where nbdiff's memory use seems to grow without bound. To reproduce:

  1. Download https://github.com/afeld/python-public-policy/blob/129d5150e1796ecde2c947b4694c3430f388c8a1/lecture_3.ipynb
  2. Run nbdiff lecture_3.ipynb lecture_3.ipynb
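
For anyone who wants a number for a case like this, a small helper built on the standard-library `tracemalloc` module can report the peak Python-level allocation of a single call. This is only a sketch: `peak_memory_mib` is a hypothetical helper name, and the demo uses `difflib` as a stand-in rather than nbdime itself.

```python
import difflib
import tracemalloc


def peak_memory_mib(func, *args, **kwargs):
    """Call func and return (result, peak traced memory in MiB).

    Only allocations made through Python's allocator while the call runs
    are counted, so the figure can be lower than the process RSS.
    """
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / 2**20


if __name__ == "__main__":
    # Stand-in workload: diff two lists of lines that differ in one place.
    a = ["line %d\n" % i for i in range(2000)]
    b = a[:1000] + ["changed\n"] + a[1001:]
    diff, peak = peak_memory_mib(lambda: list(difflib.unified_diff(a, b)))
    print("diff lines: %d, peak: %.2f MiB" % (len(diff), peak))
```

To point it at the reproduction above, one could presumably load the notebook with `nbformat.read` and pass nbdime's `diff_notebooks` as `func`; that combination is untested here and is an assumption about how the pieces fit together.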

afeld avatar May 04 '21 04:05 afeld