
[DOC] Elaborate dataset size vs hardware requirements

Open galipremsagar opened this issue 1 year ago • 2 comments

Report needed documentation

Hardware requirements for cudf.pandas could be more explicit. Developers should expect to need roughly 3x the size of their dataset in GPU VRAM to take advantage of the accelerated processing. The documentation does not currently elaborate on hardware requirements.

galipremsagar Oct 15 '24 19:10

@galipremsagar Please see #16693 and #16869 for related discussions. It is important for users to know that the choices of algorithms, data types, file formats, and more can affect the hardware requirements. Requiring VRAM that is 3x the data size is a very rough rule of thumb, but that multiple can vary a lot (some algorithms require 5x or more) and is also impacted by cudf.pandas' use of unified memory and prefetching to expand the upper bounds of acceleration.
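As a rough illustration of the arithmetic behind that rule of thumb, here is a minimal sketch in plain Python. The helper names (`estimated_vram_bytes`, `fits_in_vram`) are hypothetical and not part of cuDF, and the 3x and 5x multipliers are only the rough figures mentioned in this thread, not measured values.

```python
# Hypothetical helpers for back-of-the-envelope sizing; NOT part of cuDF.
# The multipliers are rough rules of thumb taken from this discussion.

GIB = 1024 ** 3  # bytes in one gibibyte


def estimated_vram_bytes(dataset_bytes: int, multiplier: float = 3.0) -> int:
    """Return a rough VRAM estimate for a dataset of the given in-memory size."""
    if dataset_bytes < 0:
        raise ValueError("dataset_bytes must be non-negative")
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return int(dataset_bytes * multiplier)


def fits_in_vram(dataset_bytes: int, vram_bytes: int, multiplier: float = 3.0) -> bool:
    """Check whether the rough estimate fits within the available VRAM."""
    return estimated_vram_bytes(dataset_bytes, multiplier) <= vram_bytes


if __name__ == "__main__":
    dataset = 8 * GIB   # example dataset size (assumed)
    vram = 24 * GIB     # example GPU memory size (assumed)
    for m in (3.0, 5.0):
        needed_gib = estimated_vram_bytes(dataset, m) / GIB
        print(f"multiplier {m}: need ~{needed_gib:.0f} GiB, fits: {fits_in_vram(dataset, vram, m)}")
```

With these assumed sizes, the same dataset fits under the 3x estimate but not under a 5x one, which is why the multiplier matters. Because cudf.pandas uses unified memory and prefetching, a workload that fails this check may still run, so treat the result as a planning hint rather than a hard limit.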

bdice Oct 15 '24 19:10

This was a VDR request, and I suspect it is what #16869 was intended to address (@singhmanas1 could confirm). If that is the case, I think that we can close this as being resolved by #16693.

vyasr Oct 16 '24 19:10

I'm going to close this as addressed by the PRs mentioned above.

vyasr Oct 21 '24 19:10