
OutOfMemoryException on large files

Open svaldetero opened this issue 2 years ago • 2 comments

Describe the bug: Out of memory exceptions with extremely large files

To Reproduce: I have a fixed-width file (that is also comma-separated). The lines are ~1500 characters long, there are ~13 million rows, and it is ~20GB in size. I am using the IFixedLengthTypedReader from an IFixedLengthTypeMapper. I used .CustomMapping().WithReader() to define the widths and destination properties. There are 90+ columns in the file. I use while(IFixedLengthTypedReader.Read()) to loop through the rows. After ~1.5 million rows, I get an out of memory exception in RetryReader. Looking at VS diagnostics, it looks like StringBuilder is the culprit. Here's the stack trace:

   at System.Text.StringBuilder.ExpandByABlock(Int32 minBlockCharCount)
   at System.Text.StringBuilder.Append(Char value, Int32 repeatCount)
   at System.Text.StringBuilder.Append(Char value)
   at FlatFiles.RetryReader.Read()
   at FlatFiles.FixedLengthRecordParser.SeparatorRecordReader.ReadRecord()
   at FlatFiles.FixedLengthReader.ParsePartitions()
   at FlatFiles.FixedLengthReader.Read()

There's no way for me to release that memory. This is not from the objects created.

Expected behavior: I expect to not get an out of memory exception just from looping over the rows in the file. I'm not even doing anything with the mapped object.

Screenshots: n/a

Version: 5.0.1

Additional context: n/a

svaldetero, May 25 '22 21:05

I had concerns when I added that code to capture an entire line. I had to convince myself it wouldn't be a problem because that memory gets cleared between each row. It only hangs out in the reader long enough to broadcast it with events, if necessary. 1500 characters doesn't seem like it would cause a problem.

I'll take a look, just to be sure.
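
To make the concern above concrete, here is a minimal sketch of a raw-line capture buffer. It is a hypothetical illustration only: `RawRecordBuffer` and `PeakBufferLength` are made-up names, this is not the code in FlatFiles' RetryReader, and the thread does not state what the actual fix in 5.0.2 was. It only shows why clearing the buffer between records matters.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical illustration; not taken from the FlatFiles source.
public static class RawRecordBuffer
{
    // Reads '\n'-separated records one character at a time, capturing the raw
    // text of each record in a StringBuilder, and returns the largest number
    // of characters the buffer held at any point.
    public static int PeakBufferLength(TextReader reader, bool clearBetweenRecords)
    {
        var buffer = new StringBuilder();
        int peak = 0;
        int next;
        while ((next = reader.Read()) != -1)
        {
            char c = (char)next;
            if (c == '\n')
            {
                // End of record: this is where the raw text would be handed
                // to any listeners before moving on to the next record.
                peak = Math.Max(peak, buffer.Length);
                if (clearBetweenRecords)
                {
                    buffer.Clear();
                }
            }
            else
            {
                buffer.Append(c);
            }
        }
        // Account for a final record that has no trailing separator.
        return Math.Max(peak, buffer.Length);
    }
}
```

For an input of 1,000 records of 1,500 characters each, the peak is 1,500 characters when the buffer is cleared and 1,500,000 when it is not. By the same arithmetic, a buffer that was never cleared would hold roughly 2.25 billion characters after ~1.5 million rows of ~1500 characters, so a missed clear is one possible explanation worth ruling out.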

jehugaleahsa, May 26 '22 11:05

I figured it out. Check out 5.0.2 once it becomes available from NuGet. Let me know if you encounter any issues. Thanks for the feedback and diagnostic information.

jehugaleahsa, May 27 '22 00:05

Closing due to inactivity.

jehugaleahsa, Oct 03 '22 00:10