
[BUG] byte_range offset with header not supported

Open marberi opened this issue 2 years ago • 0 comments

Describe the bug

Problem reading in a CSV file. It generates the following error:

BlazingContext ready
[19:37:44.175] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:44.175] [error] |||ERROR in graph::execute. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:44.258] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:44.426] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:44.513] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:44.718] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:44.917] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:45.117] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:45.308] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:45.508] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:45.698] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:45.868] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:46.074] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:46.232] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:46.444] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:46.655] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:46.837] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:47.028] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:47.227] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:47.417] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:47.586] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:47.642] [error] |||ERROR in task::run. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:47.662] [error] |||ERROR in graph::execute. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:47.663] [error] 573370665|1|1|In MergeAggregate kernel for MergeAggregate(group=[{}], EXPR$0=[$SUM0($0)], agg#1=[COUNT($0)]). What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:47.663] [error] |||ERROR in graph::execute. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
[19:37:47.663] [error] |||ERROR in graph::execute. What: Ral failure at: /opt/conda/envs/rapids/conda-bld/blazingsql_1633567369093/work/engine/src/execution_kernels/BatchProcessing.cpp:392: ERROR: Projection::run() first input CacheData was nullptr|||||
[19:37:47.670] [error] 573370665|||In get_execute_graph_results. What: cuDF failure at: ../src/io/csv/reader_impl.cu:212: byte_range offset with header not supported|||||
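The message suggests that the CSV is being read in byte ranges, and that a range with a nonzero offset cannot be combined with header parsing, since only the first range contains the header row. The following is a minimal plain-Python sketch of that idea, not cuDF's or BlazingSQL's actual code; `read_byte_range`, `_next_row_start`, `SAMPLE`, and the column names are hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical stand-in for a small CSV file with a header row.
SAMPLE = b"key,fare_amount\na,4.5\nb,16.9\nc,5.7\nd,7.7\n"


def _next_row_start(data, pos):
    """Return the index of the first row that begins at or after pos."""
    if pos <= 0:
        return 0
    newline = data.find(b"\n", pos - 1)
    return len(data) if newline == -1 else newline + 1


def read_byte_range(data, offset, size, names=None):
    """Parse the rows that begin inside data[offset:offset + size].

    Assumes no quoted field contains a newline. With offset > 0 the
    header row is out of reach, so column names must be given explicitly.
    """
    if offset > 0 and names is None:
        raise ValueError("byte range with nonzero offset needs explicit column names")
    start = _next_row_start(data, offset)
    end = _next_row_start(data, offset + size)
    rows = list(csv.reader(io.StringIO(data[start:end].decode("utf-8"))))
    if offset == 0 and rows:
        # Only the first range sees the header row.
        header, rows = rows[0], rows[1:]
        names = names or header
    return [dict(zip(names, row)) for row in rows]
```

In this sketch, `read_byte_range(SAMPLE, 20, 21)` raises because the second range has no header to infer names from, while passing `names=["key", "fare_amount"]` lets it parse. Whether supplying column names explicitly avoids the reported error in BlazingSQL is untested here.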

Steps/Code to reproduce bug

import os
from pathlib import Path

# Just needed on my computer.
os.environ["CONDA_PREFIX"] = '/data/astro/scratch/eriksen/miniconda3/envs/blazing'

import blazingsql
from blazingsql import BlazingContext

d = Path('/data/astro/scratch/eriksen/kaggle/competitions/new-york-city-taxi-fare-prediction')
bc = BlazingContext()

bc.create_table('train', str(d / 'train.csv'))
gdf = bc.sql('select * from train limit 100')

Expected behavior

Print a number, the average taxi fare. If there is a problem with the input, it should fail gracefully.

Environment overview (please complete the following information)

  • bare-metal
  • installed in a new conda environment.

BlazingSQL version (git hash): 2a4a99cc83c4b8a52078cba2b8d6c80194cb3a78
BlazingSQL branch name: HEAD
BlazingSQL branch tag: v21.10.00
BlazingSQL build id: 0
BlazingSQL compiler version: GNU /usr/local/gcc9/bin/g++ 9.4.0
BlazingSQL cuda flags: -Xcompiler -Wno-parentheses -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_75,code=compute_75 --expt-extended-lambda --expt-relaxed-constexpr -Werror=cross-execution-space-call -Xcompiler -Wall,-Wno-error=deprecated-declarations --default-stream=per-thread -DHT_DEFAULT_ALLOCATOR
BlazingSQL Operating system kernel: Linux-5.8.0-1042-aws
BlazingSQL Operating system architecture: x86_64
BlazingSQL Linux Operating system release: NAME=CentOS Linux|VERSION=7 (Core)|ID=centos|ID_LIKE=rhel fedora|VERSION_ID=7|PRETTY_NAME=CentOS Linux 7 (Core)|ANSI_COLOR=031|CPE_NAME=cpe:/o:centos:centos:7|HOME_URL=https://www.centos.org/|BUG_REPORT_URL=https://bugs.centos.org/||CENTOS_MANTISBT_PROJECT=CentOS-7|CENTOS_MANTISBT_PROJECT_VERSION=7|REDHAT_SUPPORT_PRODUCT=centos|REDHAT_SUPPORT_PRODUCT_VERSION=7|
None

Environment details

Please run and paste the output of the print_env.sh script here, to gather any other relevant environment details

Not sure where I am expected to find this. I include a list of packages in the conda environment:

(blazing) [eriksen@gpu01 bin]$ conda list

packages in environment at /data/astro/scratch/eriksen/miniconda3/envs/blazing:

Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
abseil-cpp 20210324.2 h9c3ff4c_0 conda-forge
argon2-cffi 21.1.0 py38h497a2fe_2 conda-forge
arrow-cpp 5.0.0 py38h327e1ba_4_cuda conda-forge
arrow-cpp-proc 3.0.0 cuda conda-forge
async_generator 1.10 py_0 conda-forge
attrs 21.2.0 pyhd8ed1ab_0 conda-forge
aws-c-cal 0.5.11 h95a6274_0 conda-forge
aws-c-common 0.6.2 h7f98852_0 conda-forge
aws-c-event-stream 0.2.7 h3541f99_13 conda-forge
aws-c-io 0.10.5 hfb6a706_0 conda-forge
aws-checksums 0.1.11 ha31a3da_7 conda-forge
aws-sdk-cpp 1.8.186 hb4091e7_3 conda-forge
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 py_2 conda-forge
backports.functools_lru_cache 1.6.4 pyhd8ed1ab_0 conda-forge
blazingsql 21.10.0 pypi_0 pypi
bleach 4.1.0 pyhd8ed1ab_0 conda-forge
bokeh 2.4.2 py38h578d9bd_0 conda-forge
boost-cpp 1.72.0 h359cf19_6 conda-forge
brotlipy 0.7.0 py38h497a2fe_1003 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.18.1 h7f98852_0 conda-forge
ca-certificates 2021.10.8 ha878542_0 conda-forge
cachetools 4.2.4 pyhd8ed1ab_0 conda-forge
certifi 2021.10.8 py38h578d9bd_1 conda-forge
cffi 1.15.0 py38h3931269_0 conda-forge
charset-normalizer 2.0.9 pyhd8ed1ab_0 conda-forge
click 8.0.3 py38h578d9bd_1 conda-forge
cloudpickle 2.0.0 pyhd8ed1ab_0 conda-forge
colorama 0.4.4 pyh9f0ad1d_0 conda-forge
cryptography 36.0.0 py38h3e25421_0 conda-forge
cudatoolkit 11.4.2 h00f7ccd_9 conda-forge
cudf 21.10.01 cuda_11.4_py38_ga1d2d13a14_0 rapidsai
cupy 9.3.0 py38ha96c4f3_0 rapidsai
cytoolz 0.11.2 py38h497a2fe_1 conda-forge
dask 2021.9.1 pyhd8ed1ab_0 conda-forge
dask-core 2021.9.1 pyhd8ed1ab_0 conda-forge
dask-cuda 21.10.00 py38_0 rapidsai
dask-cudf 21.10.01 py38_ga1d2d13a14_0 rapidsai
debugpy 1.5.1 py38h709712a_0 conda-forge
decorator 5.1.0 pyhd8ed1ab_0 conda-forge
defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge
distributed 2021.9.1 py38h578d9bd_0 conda-forge
dlpack 0.5 h9c3ff4c_0 conda-forge
entrypoints 0.3 pyhd8ed1ab_1003 conda-forge
fastavro 1.4.7 py38h497a2fe_1 conda-forge
fastrlock 0.8 py38h709712a_1 conda-forge
freetype 2.10.4 h0708190_1 conda-forge
fsspec 2021.11.1 pyhd8ed1ab_0 conda-forge
future 0.18.2 py38h578d9bd_4 conda-forge
gflags 2.2.2 he1b5a44_1004 conda-forge
glog 0.5.0 h48cff8f_0 conda-forge
google-cloud-cpp 1.29.0 hb967e95_1 conda-forge
greenlet 1.1.2 py38h709712a_1 conda-forge
grpc-cpp 1.39.1 h850795e_1 conda-forge
heapdict 1.0.1 py_0 conda-forge
icu 69.1 h9c3ff4c_0 conda-forge
idna 3.1 pyhd3deb0d_0 conda-forge
importlib-metadata 4.8.2 py38h578d9bd_0 conda-forge
importlib_resources 5.4.0 pyhd8ed1ab_0 conda-forge
ipykernel 6.6.0 py38he5a9106_0 conda-forge
ipython 7.30.1 py38h578d9bd_0 conda-forge
ipython_genutils 0.2.0 py_1 conda-forge
ipywidgets 7.6.5 pyhd8ed1ab_0 conda-forge
jbig 2.1 h7f98852_2003 conda-forge
jedi 0.18.1 py38h578d9bd_0 conda-forge
jinja2 3.0.3 pyhd8ed1ab_0 conda-forge
jpeg 9d h36c2ea0_0 conda-forge
jpype1 1.3.0 py38h1fd1430_2 conda-forge
jsonschema 4.2.1 pyhd8ed1ab_1 conda-forge
jupyter_client 7.1.0 pyhd8ed1ab_0 conda-forge
jupyter_core 4.9.1 py38h578d9bd_1 conda-forge
jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge
jupyterlab_widgets 1.0.2 pyhd8ed1ab_0 conda-forge
krb5 1.19.2 hcc1bbae_3 conda-forge
lcms2 2.12 hddcbb42_0 conda-forge
ld_impl_linux-64 2.36.1 hea4e1c9_2 conda-forge
lerc 3.0 h9c3ff4c_0 conda-forge
libblas 3.9.0 12_linux64_openblas conda-forge
libbrotlicommon 1.0.9 h7f98852_6 conda-forge
libbrotlidec 1.0.9 h7f98852_6 conda-forge
libbrotlienc 1.0.9 h7f98852_6 conda-forge
libcblas 3.9.0 12_linux64_openblas conda-forge
libcrc32c 1.1.2 h9c3ff4c_0 conda-forge
libcudf 21.10.01 cuda11.4_ga1d2d13a14_0 rapidsai
libcurl 7.80.0 h2574ce0_0 conda-forge
libdeflate 1.8 h7f98852_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libevent 2.1.10 h9b69904_4 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 11.2.0 h1d223b6_11 conda-forge
libgfortran-ng 11.2.0 h69a702a_11 conda-forge
libgfortran5 11.2.0 h5c6108e_11 conda-forge
libgomp 11.2.0 h1d223b6_11 conda-forge
libhwloc 2.3.0 h5e5b7d1_1 conda-forge
libiconv 1.16 h516909a_0 conda-forge
liblapack 3.9.0 12_linux64_openblas conda-forge
libllvm10 10.0.1 he513fc3_3 conda-forge
libnghttp2 1.43.0 h812cca2_1 conda-forge
libnsl 2.0.0 h7f98852_0 conda-forge
libopenblas 0.3.18 pthreads_h8fe5266_0 conda-forge
libpng 1.6.37 h21135ba_2 conda-forge
libpq 13.5 hd57d9b9_1 conda-forge
libprotobuf 3.16.0 h780b84a_0 conda-forge
librmm 21.10.01 cuda11.4_gc54767f_0 rapidsai
libsodium 1.0.18 h36c2ea0_1 conda-forge
libssh2 1.10.0 ha56f1ee_2 conda-forge
libstdcxx-ng 11.2.0 he4da1e4_11 conda-forge
libthrift 0.14.2 he6d91bd_1 conda-forge
libtiff 4.3.0 h6f004c6_2 conda-forge
libutf8proc 2.6.1 h7f98852_0 conda-forge
libwebp-base 1.2.1 h7f98852_0 conda-forge
libxml2 2.9.12 h885dcf4_1 conda-forge
libzlib 1.2.11 h36c2ea0_1013 conda-forge
llvmlite 0.36.0 py38h4630a5e_0 conda-forge
locket 0.2.0 py_2 conda-forge
lz4-c 1.9.3 h9c3ff4c_1 conda-forge
markupsafe 2.0.1 py38h497a2fe_1 conda-forge
matplotlib-inline 0.1.3 pyhd8ed1ab_0 conda-forge
mistune 0.8.4 py38h497a2fe_1005 conda-forge
msgpack-python 1.0.3 py38h1fd1430_0 conda-forge
nbclient 0.5.9 pyhd8ed1ab_0 conda-forge
nbconvert 6.3.0 py38h578d9bd_1 conda-forge
nbformat 5.1.3 pyhd8ed1ab_0 conda-forge
ncurses 6.2 h58526e2_4 conda-forge
nest-asyncio 1.5.4 pyhd8ed1ab_0 conda-forge
netifaces 0.10.9 py38h497a2fe_1004 conda-forge
nlohmann_json 3.9.1 h9c3ff4c_1 conda-forge
notebook 6.4.6 pyha770c72_0 conda-forge
numba 0.53.1 py38h8b71fd7_1 conda-forge
numpy 1.21.4 py38he2449b9_0 conda-forge
nvtx 0.2.3 py38h497a2fe_1 conda-forge
olefile 0.46 pyh9f0ad1d_1 conda-forge
openjdk 8.0.312 h7f98852_0 conda-forge
openjpeg 2.4.0 hb52868f_1 conda-forge
openssl 1.1.1l h7f98852_0 conda-forge
orc 1.6.10 h58a87f1_0 conda-forge
packaging 21.3 pyhd8ed1ab_0 conda-forge
pandas 1.3.4 py38h43a58ef_1 conda-forge
pandoc 2.16.2 h7f98852_0 conda-forge
pandocfilters 1.5.0 pyhd8ed1ab_0 conda-forge
parquet-cpp 1.5.1 2 conda-forge
parso 0.8.3 pyhd8ed1ab_0 conda-forge
partd 1.2.0 pyhd8ed1ab_0 conda-forge
pexpect 4.8.0 pyh9f0ad1d_2 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 8.4.0 py38h8e6f84c_0 conda-forge
pip 21.3.1 pyhd8ed1ab_0 conda-forge
prometheus_client 0.12.0 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.24 pyha770c72_0 conda-forge
protobuf 3.16.0 py38h709712a_0 conda-forge
psutil 5.8.0 py38h497a2fe_2 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pure-sasl 0.6.2 pyhd8ed1ab_0 conda-forge
pyarrow 5.0.0 py38hed47224_4_cuda conda-forge
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pygments 2.10.0 pyhd8ed1ab_0 conda-forge
pyhive 0.6.4 pyhd8ed1ab_0 conda-forge
pynvml 11.4.1 pyhd8ed1ab_0 conda-forge
pyopenssl 21.0.0 pyhd8ed1ab_0 conda-forge
pyparsing 3.0.6 pyhd8ed1ab_0 conda-forge
pyrsistent 0.18.0 py38h497a2fe_0 conda-forge
pysocks 1.7.1 py38h578d9bd_4 conda-forge
python 3.8.12 hb7a2778_2_cpython conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python_abi 3.8 2_cp38 conda-forge
pytz 2021.3 pyhd8ed1ab_0 conda-forge
pyyaml 6.0 py38h497a2fe_3 conda-forge
pyzmq 22.3.0 py38h2035c66_1 conda-forge
re2 2021.09.01 h9c3ff4c_0 conda-forge
readline 8.1 h46c0cb4_0 conda-forge
requests 2.26.0 pyhd8ed1ab_1 conda-forge
rmm 21.10.01 cuda_11.4_py38_gc54767f_0 rapidsai
s2n 1.0.10 h9b69904_0 conda-forge
send2trash 1.8.0 pyhd8ed1ab_0 conda-forge
setuptools 59.4.0 py38h578d9bd_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
snappy 1.1.8 he1b5a44_3 conda-forge
sortedcontainers 2.4.0 pyhd8ed1ab_0 conda-forge
spdlog 1.8.5 h4bd325d_0 conda-forge
sqlalchemy 1.4.28 py38h497a2fe_0 conda-forge
sqlite 3.37.0 h9cd32fc_0 conda-forge
tblib 1.7.0 pyhd8ed1ab_0 conda-forge
terminado 0.12.1 py38h578d9bd_1 conda-forge
testpath 0.5.0 pyhd8ed1ab_0 conda-forge
thrift 0.15.0 py38h709712a_1 conda-forge
thrift_sasl 0.4.3 pyhd8ed1ab_1 conda-forge
tk 8.6.11 h27826a3_1 conda-forge
toolz 0.11.2 pyhd8ed1ab_0 conda-forge
tornado 6.1 py38h497a2fe_2 conda-forge
tqdm 4.62.3 pyhd8ed1ab_0 conda-forge
traitlets 5.1.1 pyhd8ed1ab_0 conda-forge
typing_extensions 4.0.1 pyha770c72_0 conda-forge
ucx 1.11.2+gef2bbcf cuda11.2_0 rapidsai
ucx-proc 1.0.0 gpu rapidsai
ucx-py 0.22.01 py38_gef2bbcf_33 rapidsai
urllib3 1.26.7 pyhd8ed1ab_0 conda-forge
wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge
webencodings 0.5.1 py_1 conda-forge
wheel 0.37.0 pyhd8ed1ab_1 conda-forge
widgetsnbextension 3.5.2 py38h578d9bd_1 conda-forge
xz 5.2.5 h516909a_1 conda-forge
yaml 0.2.5 h516909a_0 conda-forge
zeromq 4.3.4 h9c3ff4c_1 conda-forge
zict 2.0.0 py_0 conda-forge
zipp 3.6.0 pyhd8ed1ab_0 conda-forge
zlib 1.2.11 h36c2ea0_1013 conda-forge
zstd 1.5.0 ha95c52a_0 conda-forge

Additional context

Adding some information on the GPU, since this error is happening in a CUDA file.

(blazing) [eriksen@gpu01 bin]$ nvidia-smi
Fri Dec 10 19:43:08 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.05    Driver Version: 450.51.05    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:3D:00.0 Off |                  N/A |
| 31%   32C    P8    19W / 250W |      3MiB / 11019MiB |      0%      Default |
|                               |                      |                  N/A |
+-----------------------------------------------------------------------------+

----For BlazingSQL Developers----

Suspected source of the issue

Where and what are potential sources of the issue

Other design considerations

What components of the engine could be affected by this?

marberi, Dec 10 '21 18:12