complex-upset

Large memory use for very large datasets (10^9 observations)

Open: ghost opened this issue 4 years ago • 1 comment

Hello,

I'm not sure this is actually an issue, in that I don't know whether anything can be done about it, but I tried a combination matrix of 10^9 rows x 2 sets. It didn't go well: 64 GB of RAM is not enough.

ghost avatar Apr 22 '20 18:04 ghost

Sorry about that :( I am aware that it is not as performance-optimized as it could be; the design focused on being correct first and performant second. I will look into it at some point (I am still learning about profiling and optimization in R), but in the meantime I welcome suggestions and contributions. Also, this was already mentioned in #12 (and 10^9 is a large number).

Off the top of my head, if the implementation still uses strings (character vectors) rather than factors, changing this should provide a substantial memory saving.

krassowski avatar Apr 22 '20 19:04 krassowski
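
The strings-versus-factors suggestion can be checked on a small scale with base R. The following is a minimal sketch, not taken from the complex-upset code: the variable names and the three intersection labels are made up for illustration, and the sizes assume a 64-bit R build, where each element of a character vector is an 8-byte pointer into R's shared string cache while each element of a factor is a 4-byte integer code.

```r
# Illustrative data only: one label per row, drawn from three
# hypothetical intersection names (not complex-upset internals).
set.seed(1)
n <- 1e6
labels <- sample(c("A", "B", "A-B"), n, replace = TRUE)

as_character <- labels          # character vector: one pointer per element
as_factor <- factor(labels)     # factor: one integer code per element + levels

size_character <- as.numeric(object.size(as_character))
size_factor <- as.numeric(object.size(as_factor))

print(format(object.size(as_character), units = "MB"))
print(format(object.size(as_factor), units = "MB"))
cat("factor / character size ratio:", size_factor / size_character, "\n")
```

Under those assumptions the factor should come out at roughly half the size of the character vector. Scaled to 10^9 rows, that is about 8 GB versus about 4 GB for a single such column, before counting any intermediate copies made during processing, so the change would help but may not be enough on its own.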