
[FEA] Add examples showing how to use both CPU & GPU modules together

Open ayushdg opened this issue 1 year ago • 2 comments

Is your feature request related to a problem? Please describe.

The codebase has some tutorials/examples showcasing CPU-only or GPU-only modules, but none that combine the two. It would be good to have examples that use both and show how users can convert their dataset when moving between CPU and GPU modules.

Describe the solution you'd like

This came up when trying to combine fuzzy dedup with CPU modules, which led to a TypeError saying that data of type cudf was expected.

Describe alternatives you've considered

Longer term there should be a means of handling the conversion automatically, but for the time being an example showing how users can go between the two would be enough.
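As a rough illustration of the kind of example being asked for, here is a minimal sketch. It assumes the dataset is backed by a Dask DataFrame (for NeMo Curator, the assumption is that this is what a `DocumentDataset` exposes as `.df`) and relies on Dask's `DataFrame.to_backend` method. The helper names `move_to_backend` and `example_round_trip` are hypothetical and not part of the library, and the GPU direction needs `cudf`/`dask_cudf` installed.

```python
def move_to_backend(ddf, use_gpu: bool):
    """Return `ddf` on the cudf backend if `use_gpu` is True, else on pandas.

    `ddf` is assumed to be a Dask DataFrame (anything exposing Dask's
    `to_backend` method). This helper is hypothetical, not a library API.
    """
    backend = "cudf" if use_gpu else "pandas"
    return ddf.to_backend(backend)


def example_round_trip():
    """Sketch of a CPU -> GPU -> CPU round trip; not called automatically.

    Requires dask, pandas, and (for the GPU step) cudf + dask_cudf.
    """
    import dask.dataframe as dd
    import pandas as pd

    pdf = pd.DataFrame({"id": [0, 1], "text": ["first doc", "second doc"]})
    ddf = dd.from_pandas(pdf, npartitions=1)

    # Move to cudf before a GPU-only step such as fuzzy dedup.
    gpu_ddf = move_to_backend(ddf, use_gpu=True)

    # Move back to pandas before handing the data to CPU-only modules.
    cpu_ddf = move_to_backend(gpu_ddf, use_gpu=False)
    return cpu_ddf.compute()
```

With a NeMo Curator dataset, the same idea would presumably amount to converting `dataset.df` and wrapping the result in a new `DocumentDataset`, though that wrapping step is an assumption here and has not been checked against the library.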

ayushdg, May 15 '24 00:05