NVTabular

NVTabular is a feature engineering and preprocessing library for tabular data, designed to quickly and easily manipulate terabyte-scale datasets used to train deep-learning-based recommender systems.

Results: 172 NVTabular issues

I see that HugeCTR recently added multi-hot categorical variable column compatibility from parquet files. In order to use that, I was wondering what data type we should use to save...

question

We should update our movielens example for HugeCTR to incorporate the multihot 'genres' field since HugeCTR supports multihot parquet inputs now. We should be able to use the same preprocessing...

HugeCTR
P1

**What questions are you trying to answer? Please describe.** We have NVTabular + NVTabular dataloaders to train on multiple GPUs. We provide multiple examples of how to scale to multi-GPU. We want...

**What questions are you trying to answer? Please describe.** What are the best practices for running NVTabular workflows and training TensorFlow models with the NVTabular data loader on cloud platforms?

**Is your feature request related to a problem? Please describe.** The PyT and TF dataloaders support padding list (sparse) features to the right, which means that shorter list sequences will...

dataloader
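
To make the padding direction in the issue above concrete, here is a minimal pure-Python sketch of right versus left padding for variable-length list features. The helper `pad_lists` and its parameters are hypothetical names chosen for illustration; they are not part of the NVTabular dataloader API, and the truncation rule (keep the first `max_len` items) is an assumption of this sketch.

```python
from typing import List, Sequence


def pad_lists(
    rows: Sequence[Sequence[int]],
    max_len: int,
    side: str = "right",
    pad_value: int = 0,
) -> List[List[int]]:
    """Pad each row to max_len with pad_value on the chosen side (hypothetical helper)."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    padded = []
    for row in rows:
        items = list(row)[:max_len]  # assumption: keep the first max_len items
        fill = [pad_value] * (max_len - len(items))
        # Right padding appends the fill; left padding prepends it.
        padded.append(items + fill if side == "right" else fill + items)
    return padded


right = pad_lists([[1, 2, 3], [4]], max_len=3, side="right")
left = pad_lists([[1, 2, 3], [4]], max_len=3, side="left")
```

With right padding the short row `[4]` becomes `[4, 0, 0]`; with left padding it becomes `[0, 0, 4]`, so the real values sit at the end of the sequence.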

The current version of bucketize uses fixed boundaries. If the user doesn't know these boundaries, they need to calculate them using cudf. We should support splitting continuous variables into buckets based...

enhancement
good first issue
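
The issue text above is truncated, so the exact proposal is not visible. As one common way to derive boundaries from the data instead of fixing them by hand, the sketch below computes quantile cut points and then assigns bucket indices. It uses only the Python standard library (not cudf or NVTabular), and `quantile_boundaries` and `bucketize` are hypothetical helper names.

```python
import bisect
import statistics
from typing import List, Sequence


def quantile_boundaries(values: Sequence[float], n_buckets: int) -> List[float]:
    """Return n_buckets - 1 cut points that split values into roughly equal-sized buckets."""
    return statistics.quantiles(values, n=n_buckets)


def bucketize(values: Sequence[float], boundaries: Sequence[float]) -> List[int]:
    """Map each value to the index of the bucket it falls into.

    boundaries must be sorted; a value equal to a boundary goes to the
    higher bucket (a choice made for this sketch).
    """
    return [bisect.bisect_right(boundaries, v) for v in values]


data = [1, 2, 3, 4, 5, 6, 7, 8]
cuts = quantile_boundaries(data, n_buckets=4)  # boundaries learned from the data
buckets = bucketize(data, cuts)

# Fixed, user-supplied boundaries work with the same function.
fixed = bucketize([0.5, 1.0, 2.5], [1.0, 2.0])
```

The same `bucketize` step serves both cases; only the source of the boundaries changes.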

**Is your feature request related to a problem? Please describe.** There seems to be a minimum `freq_threshold` for Categorify to remove long-tail elements but I feel there are also some...

ops
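
For readers unfamiliar with frequency thresholding, the following is a rough standard-library sketch of the idea behind a minimum `freq_threshold`: categories seen fewer times than the threshold share a single index. The functions `fit_categories` and `encode` are hypothetical, and reserving index 0 for infrequent or unseen values is an assumption of this sketch rather than a statement about how Categorify lays out its indices.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def fit_categories(values: Sequence[Hashable], freq_threshold: int = 0) -> Dict[Hashable, int]:
    """Build a category -> index mapping, dropping categories below freq_threshold."""
    counts = Counter(values)
    kept = sorted(v for v, c in counts.items() if c >= freq_threshold)
    # Assumption: index 0 is reserved for infrequent or unseen categories.
    return {v: i + 1 for i, v in enumerate(kept)}


def encode(values: Sequence[Hashable], mapping: Dict[Hashable, int]) -> List[int]:
    """Encode values, sending anything not in the mapping to index 0."""
    return [mapping.get(v, 0) for v in values]


raw = ["a", "a", "b", "c", "c", "c"]
mapping = fit_categories(raw, freq_threshold=2)  # "b" appears once and is dropped
encoded = encode(raw, mapping)
```

Here `"b"` falls below the threshold and is encoded as 0, while `"a"` and `"c"` keep their own indices.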

**Issue by [jperez999](https://github.com/jperez999)** _Tuesday Jan 28, 2020 at 05:39 GMT_ _Originally opened as https://github.com/rapidsai/recsys/issues/14_ ---- **Is your feature request related to a problem? Please describe.** Need to ensure that no...

**Issue by [oyilmaz-nvidia](https://github.com/oyilmaz-nvidia)** _Wednesday Feb 26, 2020 at 15:57 GMT_ _Originally opened as https://github.com/rapidsai/recsys/issues/17_ ---- **Is your feature request related to a problem? Please describe.** It's not exactly a problem...

**Issue by [alecgunny](https://github.com/alecgunny)** _Tuesday Mar 24, 2020 at 19:44 GMT_ _Originally opened as https://github.com/rapidsai/recsys/issues/29_ ---- **Is your feature request related to a problem? Please describe.** DLLabelEncoder by default reserves 0...