
The Merlin dataloader lets you rapidly load tabular data for training deep learning models with TensorFlow, PyTorch or JAX.

Results 22 dataloader issues

This PR fixes #163. It helps solve the `isaligned` check issue that occurs with ragged columns in TensorFlow.

bug

I tried the PyTorch loader but it gives me the following error: `BufferError: DLPack only supports signed/unsigned integers, float and complex dtypes.` If I switch to TensorFlow, it works....

### Bug description I am trying to extract embeddings but the following options do not work. Option 1: I tried these scripts but none of them work: ``` model_transformer.query_embeddings(train, index='session_id') or model_transformer.query_embeddings(train,...

bug
P0

**Describe the issue**: The dataloader accumulates GPU memory across batches unless `gc.collect()` is called manually after each batch or every few batches (e.g. every 5th batch). See example below, manually calling garbage...

bug
P1
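The workaround mentioned in the GPU memory report above (calling `gc.collect()` every few batches) can be sketched roughly as follows. This is only an illustration, not the reporter's truncated example: `run_with_periodic_gc`, `step`, and `every` are hypothetical names, and the loop works on any iterable of batches instead of assuming a particular loader API.

```python
import gc
from typing import Callable, Iterable, TypeVar

Batch = TypeVar("Batch")


def run_with_periodic_gc(
    batches: Iterable[Batch],
    step: Callable[[Batch], None],
    every: int = 5,
) -> int:
    """Call `step` on each batch and force a garbage collection after
    every `every`-th batch. Returns how many collections were triggered.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if every < 1:
        raise ValueError("every must be a positive integer")

    collections = 0
    for i, batch in enumerate(batches, start=1):
        step(batch)
        # Explicit collection is the workaround described in the report;
        # whether it actually frees GPU memory depends on the framework.
        if i % every == 0:
            gc.collect()
            collections += 1
    return collections
```

In a training script, `batches` would presumably be the dataloader instance and `step` the per-batch training function. A smaller `every` trades throughput for a lower memory peak.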

### Bug description In data parallel training, we start multiple workers with different initialization of the dataloader and train with horovod. After each batch update, the parameters are synced. Merlin...

bug
P1

Fixes #54. Updates the conda install instructions to specify the correct minimum version of Python (3.8) and separates the conda install from conda environment creation.

chore

Adds a fixture to clean up the dataloader after each test runs. This ensures that if a test using the Merlin Dataloader only partially consumes a dataloader instance (and isn't using it...

chore

Specify merlin dependencies in setup.py to create a release specifier that matches the current release for merlin dependencies. Development: the tag will be something like `23.12.dev0+1.ge73d8ba`, and `merlin_dependency("merlin-core")` returns `merlin-core` (unpinned). Release...

ci

As of 12372f4c6562f296c510f6734e748ef54c375c33, device assignment in the PyTorch dataloader does not work correctly with multiple GPUs. ```python import os import pandas as pd from merlin.dataloader.torch import Loader from merlin.io.dataset import...