NoMethodError: undefined method any? in creek/sheet.rb; related to Memory Usage

Open alexhornick-dt opened this issue 8 months ago • 3 comments

Starting with Creek 2.6.2, we began getting the following exception when stitching together large Excel documents:

NoMethodError: undefined method `any?' for nil:NilClass
/usr/local/bundle/ruby/2.7.0/gems/creek-2.6.3/lib/creek/sheet.rb:107:in `block (3 levels) in rows_generator'
	/usr/local/bundle/ruby/2.7.0/gems/nokogiri-1.13.10-x86_64-linux/lib/nokogiri/xml/reader.rb:100:in `each'
	/usr/local/bundle/ruby/2.7.0/gems/creek-2.6.3/lib/creek/sheet.rb:106
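
For context, the message only says that `any?` was called on `nil` where a collection was expected. The following is a generic Ruby illustration of that failure mode and two common guards; it is not Creek's actual code, and `cells` is a hypothetical stand-in variable.

```ruby
# Hypothetical stand-in for a value that was expected to be a collection
# but turned out to be nil (NOT taken from Creek's source).
cells = nil

message = nil
begin
  cells.any?                  # raises NoMethodError because nil has no any?
rescue NoMethodError => e
  message = e.message         # exact wording varies between Ruby versions
end

# Two common ways to guard against a nil collection:
safe_with_array = Array(cells).any?        # Array(nil) is [], so this is false
safe_with_safe_nav = cells&.any? || false  # &. short-circuits to nil, then false
```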

Alongside this exception, we seemed to be hitting our maximum memory allocation, which caused our process to crash. After seeing the resolution of https://github.com/pythonicrubyist/creek/issues/111 in 2.6.3, we upgraded and saw better performance, but we still got this exception on large files. We then reverted to 2.5.3 and haven't seen the issue since.

I ran a quick test comparing the memory usage of 2.5.3 and 2.6.3. I used three Excel files of varying sizes (20MB, 91.6MB, and 80MB), for a combined total of roughly 190MB.

Creek 2.5.3 stitched the files together successfully and seemed to peak at around 1-1.1GB of RAM. Creek 2.6.3 failed to stitch the files together; the highest peak I saw was around 7GB, but it may have crashed after 8GB.
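
A minimal sketch of how a peak-memory comparison like this could be reproduced under each gem version. The helpers `current_rss_kb` and `peak_rss_kb` are hypothetical names, not part of Creek, and the file names in the usage comment are placeholders. The sketch samples the process's resident set size on a background thread, so very short spikes between samples can be missed.

```ruby
# Hypothetical helpers (not part of Creek) for tracking peak resident memory.

# Returns the current resident set size of this process in kilobytes.
def current_rss_kb
  if File.readable?('/proc/self/status')
    # Linux: the VmRSS line is reported in kB.
    File.read('/proc/self/status')[/^VmRSS:\s+(\d+)/, 1].to_i
  else
    # Fallback for macOS/BSD: ps reports RSS in kilobytes.
    `ps -o rss= -p #{Process.pid}`.to_i
  end
end

# Runs the given block while sampling RSS every `interval` seconds,
# and returns the highest value observed, in kilobytes.
def peak_rss_kb(interval: 0.05)
  peak = current_rss_kb
  done = false
  sampler = Thread.new do
    until done
      peak = [peak, current_rss_kb].max
      sleep interval
    end
  end
  begin
    yield
  ensure
    done = true
    sampler.join
  end
  [peak, current_rss_kb].max
end

# Usage sketch (requires the creek gem; file names are placeholders, and
# simply iterating rows is an assumption about what the workload looks like):
#
#   require 'creek'
#   peak = peak_rss_kb do
#     %w[a.xlsx b.xlsx c.xlsx].each do |path|
#       Creek::Book.new(path).sheets.each do |sheet|
#         sheet.simple_rows.each { |_row| }
#       end
#     end
#   end
#   puts "peak RSS: #{peak / 1024} MB"
```

Running the same script once with each Creek version pinned in the Gemfile would give directly comparable peak figures.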

alexhornick-dt, Oct 26 '23 17:10