bschifferer
If we use NVTabular to process multiple input files, it will generate multiple output files with the same shapes:

```
df = cudf.DataFrame({
    'col1': np.random.randint(0,1,size=100_000_000)
})
df.to_parquet('single2_1.parquet')
df = cudf.DataFrame({...
```
Memory Footprint

```
# Creating the dataset
import cudf
import os
import numpy as np

from merlin.loader.tensorflow import Loader
from merlin.io import Dataset

df = cudf.DataFrame({
    'col1': np.random.random(size=500_000_000)
})...
```
Memory Footprint II

Dataset Creation

```
import cudf
import os
import numpy as np

from merlin.loader.tensorflow import Loader
from merlin.io import Dataset

df = cudf.DataFrame({
    'col1': np.random.random(size=500_000_000)
})
df.to_parquet('/raid/single2_1.parquet')...
```
```
import cudf
import numpy as np

df = cudf.DataFrame({
    'user_id': np.random.randint(0,10,size=10_000_000),
    'item_id': np.random.randint(0,10,size=10_000_000),
    'target': np.random.randint(0,2,size=10_000_000),
})
df.to_parquet('/raid/single2_1.parquet')

df = cudf.DataFrame({
    'user_id': np.random.randint(0,10,size=9_000_000),
    'item_id': np.random.randint(0,10,size=9_000_000),
    'target': np.random.randint(0,2,size=9_000_000),
})
df.to_parquet('/raid/single2_2.parquet')
```
...
I don't have the example available right now. It happens in this line: https://github.com/NVIDIA-Merlin/systems/blob/main/merlin/systems/triton/__init__.py#L55 `I don't think Triton supports nulls in tensors` - This is an issue when we...
@viswa-nvidia why did you retire this bug? The PR is only a short-term workaround. I am running into the same bug right now.
@karlhigley I am sorry, I didn't know that this was missing.

```
import cudf
import tritonclient.grpc as grpcclient

from merlin.systems.triton import convert_df_to_triton_input

df = cudf.DataFrame({
    'col1': [0,1,None,2,3,None],
    'col2': [0.0, 1.0,...
```
@viswa-nvidia - Yes, this is still relevant
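For anyone hitting the same error, here is a minimal sketch of a client-side check that could be run before building Triton inputs. It is plain Python, not part of Merlin or Triton: the helpers `find_null_columns` and `fill_nulls` are hypothetical names, the fill value `-1` is an arbitrary placeholder, and `other_col` is made-up data. Only the `col1` values come from the repro in this thread.

```python
import math


def _is_null(value):
    # Treat both None and float NaN as nulls (assumption: these are the
    # two ways a missing value shows up after converting a column to a list).
    return value is None or (isinstance(value, float) and math.isnan(value))


def find_null_columns(columns):
    """Return {column name: [row indices that hold a null]}.

    `columns` is a plain dict of lists, e.g. the result of converting a
    dataframe column by column. Columns without nulls are left out.
    """
    report = {}
    for name, values in columns.items():
        null_rows = [i for i, v in enumerate(values) if _is_null(v)]
        if null_rows:
            report[name] = null_rows
    return report


def fill_nulls(columns, fill_values):
    """Return a copy of `columns` with nulls replaced per column.

    `fill_values` maps column name -> replacement. Columns that are not
    listed in `fill_values` are copied unchanged.
    """
    filled = {}
    for name, values in columns.items():
        if name in fill_values:
            filled[name] = [fill_values[name] if _is_null(v) else v for v in values]
        else:
            filled[name] = list(values)
    return filled


# `col1` matches the repro above; `other_col` is hypothetical example data.
columns = {
    "col1": [0, 1, None, 2, 3, None],
    "other_col": [0.5, 1.5, 2.5, 3.5, 4.5, 5.5],
}

null_report = find_null_columns(columns)      # which columns would break
clean = fill_nulls(columns, {"col1": -1})     # placeholder fill, an assumption
```

Whether filling is acceptable depends on the model: a placeholder such as `-1` only makes sense if the downstream workflow treats it as "missing". The check itself just makes the failure visible before the request is sent.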