gukejun1

14 comments by gukejun1

@tjruwase Thank you for solving the puzzle. I have two questions: 1. What are the heterogeneous memory optimizations performed by DeepSpeed? Which of the following APIs is used? 2. Does...

https://nvidia-merlin.github.io/Merlin/main/examples/scaling-criteo/02-ETL-with-NVTabular.html# When I run this code

The code is the same as in [this example](https://nvidia-merlin.github.io/Merlin/main/examples/scaling-criteo/02-ETL-with-NVTabular.html#). I use the Docker image (nvcr.io/nvidia/merlin/merlin-tensorflow 22.12).

@rnyak This is my full code:
```
BASE_DIR = os.environ.get("BASE_DIR", "/raid/data/criteo")
INPUT_DATA_DIR = os.environ.get("INPUT_DATA_DIR", BASE_DIR + "/converted/criteo")
OUTPUT_DATA_DIR = os.environ.get("OUTPUT_DATA_DIR", BASE_DIR + "/test_dask/output")
USE_HUGECTR = bool(os.environ.get("USE_HUGECTR", ""))
print(USE_HUGECTR)
stats_path =...
```
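A side note on the `USE_HUGECTR` line in the snippet above: `bool()` on any non-empty string is `True`, so setting the variable to `"0"` or `"False"` would still enable the flag. The following is only a minimal sketch of a stricter parser; the helper name `env_flag` and the set of accepted spellings are assumptions for illustration, not part of the Merlin example.

```python
import os


def env_flag(name, default=False):
    """Interpret an environment variable as a boolean flag (hypothetical helper)."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    # Only these spellings count as "on"; everything else is treated as "off".
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# bool() on a non-empty string is always True, even for "0".
os.environ["USE_HUGECTR"] = "0"
print(bool(os.environ.get("USE_HUGECTR", "")))  # True
print(env_flag("USE_HUGECTR"))                  # False
```

With the original line, leaving `USE_HUGECTR` unset (or empty) is the only way to get `False`.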

@rnyak The training data is the first 40 million rows of day_0 in the Criteo data set, and the validation data is from the first 4 million rows of...

@rnyak [day_1_100.parquet.txt](https://github.com/NVIDIA-Merlin/NVTabular/files/10819818/day_1_100.parquet.txt) The data is the first 100 rows of Criteo day_1, converted to a parquet file using the official notebook \Merlin\examples\scaling-criteo\01_download_convert.ipynb.

@rnyak Very strange. ![image](https://user-images.githubusercontent.com/33796561/221343625-84b6a632-fcc2-4d45-b716-23c90ecfe3be.png)

@rnyak It's the same. The only difference is that the original file has the .parquet file name extension. Because GitHub does not allow uploading files with the .parquet extension, the...

@rnyak So, this code didn't work. ![image](https://user-images.githubusercontent.com/33796561/221451837-dee5366f-7246-4592-bd31-82ec7d1fbb97.png)

@rnyak My graphics card supports only up to CUDA 11.3, so I reinstalled CuPy as cupy-cuda113. Is the problem related to this? Does cupy-cuda113 support filling in missing values?
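To help answer that question, it may be useful to confirm which CuPy wheel is actually installed in the container after the reinstall. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_cupy_packages` is hypothetical, and it only lists installed packages without checking GPU or driver compatibility.

```python
from importlib import metadata


def installed_cupy_packages():
    """Return sorted (name, version) pairs for installed distributions named 'cupy*'."""
    found = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name and name.lower().startswith("cupy"):
            found.add((name, meta.get("Version", "unknown")))
    return sorted(found)


# Print one line per installed CuPy distribution (prints nothing if none is installed).
for name, version in installed_cupy_packages():
    print(f"{name}=={version}")
```

If more than one `cupy*` distribution shows up, for example the one shipped with the image alongside the reinstalled one, that conflict would be worth ruling out first.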