gukejun1
@tjruwase Thank you for solving the puzzle. I have two questions: 1. Which heterogeneous memory optimizations does DeepSpeed perform, and which of the following APIs is used? 2. Does...
https://nvidia-merlin.github.io/Merlin/main/examples/scaling-criteo/02-ETL-with-NVTabular.html# When I run this code
The code is the same as [this]( https://nvidia-merlin.github.io/Merlin/main/examples/scaling-criteo/02-ETL-with-NVTabular.html# ). I am using the Docker image nvcr.io/nvidia/merlin/merlin-tensorflow 22.12.
@rnyak This is my full code:
```
BASE_DIR = os.environ.get("BASE_DIR", "/raid/data/criteo")
INPUT_DATA_DIR = os.environ.get("INPUT_DATA_DIR", BASE_DIR + "/converted/criteo")
OUTPUT_DATA_DIR = os.environ.get("OUTPUT_DATA_DIR", BASE_DIR + "/test_dask/output")
USE_HUGECTR = bool(os.environ.get("USE_HUGECTR", ""))
print(USE_HUGECTR)
stats_path =...
```
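For reference, a minimal runnable sketch of how these environment-variable defaults resolve (same names and default paths as the snippet above); note one pitfall with the `USE_HUGECTR` flag:

```python
import os

# Each setting falls back to its default when the variable is unset.
BASE_DIR = os.environ.get("BASE_DIR", "/raid/data/criteo")
INPUT_DATA_DIR = os.environ.get("INPUT_DATA_DIR", BASE_DIR + "/converted/criteo")
OUTPUT_DATA_DIR = os.environ.get("OUTPUT_DATA_DIR", BASE_DIR + "/test_dask/output")

# Pitfall: bool() on any non-empty string is True, so USE_HUGECTR=0 or
# USE_HUGECTR=false still enables the flag; only unset/empty disables it.
USE_HUGECTR = bool(os.environ.get("USE_HUGECTR", ""))
print(USE_HUGECTR)
```

So to actually disable HugeCTR mode you must leave `USE_HUGECTR` unset (or set it to the empty string), not set it to `"0"`.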
@rnyak The training data is the first 40 million rows of day_0 in the Criteo dataset, and the validation data is the first 4 million rows of...
@rnyak [day_1_100.parquet.txt](https://github.com/NVIDIA-Merlin/NVTabular/files/10819818/day_1_100.parquet.txt) The data is the first 100 rows of Criteo day_1, converted to a parquet file using the official notebook \Merlin\examples\scaling-criteo\01_download_convert.ipynb.
@rnyak Very strange. 
@rnyak It's the same file; the only difference is the file name extension, because the content is in parquet format. Since GitHub does not allow uploading files with the parquet extension, the...
@rnyak So, this code didn't work. 
@rnyak Because my graphics card supports at most CUDA 11.3, I reinstalled CuPy as cupy-cuda113. Could this be related? Does cupy-cuda113 support filling missing values?
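As far as I know, cupy-cuda113 is just the CuPy build compiled against CUDA 11.3; the array API is the same across builds and mirrors NumPy, so filling missing values should not depend on the CUDA minor version. A CPU-side NumPy sketch of the fill-missing logic (the equivalent CuPy calls `cp.isnan`/`cp.where` exist with the same signatures):

```python
import numpy as np

# Toy array standing in for a column with a missing value.
x = np.array([1.0, np.nan, 3.0])

# Replace NaNs with a fill value (0.0 here), as a fill-missing op would.
filled = np.where(np.isnan(x), 0.0, x)
print(filled.tolist())  # [1.0, 0.0, 3.0]
```

If the same pattern fails only on the GPU, that points at the cuDF/CuPy installation rather than the fill logic itself.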