redisgraph-bulk-loader

CSV max size OverflowError on Windows

Open soof-golan opened this issue 3 years ago • 4 comments

I tried to import redisgraph_bulk_loader.bulk_insert, but it failed with the following error:

File ~\dev\redis-graph-poc\venv\lib\site-packages\redisgraph_bulk_loader\entity_file.py:11, in <module>
      8 from enum import Enum
      9 from exceptions import CSVError, SchemaError
---> 11 csv.field_size_limit(sys.maxsize) # Don't limit the size of user input fields.
     14 class Type(Enum):
     15     UNKNOWN = 0

OverflowError: Python int too large to convert to C long

System:

  • x86_64 Windows 10
  • Python 3.9

soof-golan, Apr 27 '22 09:04

@soof-golan How many rows is your spreadsheet? Also - how many columns? A ballpark is fine!

chayim, Apr 27 '22 11:04

I haven't even loaded a CSV yet. Python fails at import time because of how sys.maxsize is passed to the csv module. The plan is to ingest approx. 200M nodes and approx. 2B edges.

soof-golan, Apr 27 '22 11:04

Same issue here. It throws this exception even without any arguments, and it throws the same exception with arguments, even for a modest 12 MB file.

hypdeb, Aug 08 '22 18:08

【environment】

  • 64-bit operating system, x64-based processor
  • Windows 10 Home
  • Python 3.9.7

【Conclusion】 It worked after I either commented out csv.field_size_limit(sys.maxsize) or changed it to csv.field_size_limit(2147483647).

Regarding sys.maxsize: on 32-bit Linux, sys.maxsize = 2**31-1, which fits in a C long, so I think it will work there. On 64-bit Windows, however, sys.maxsize = 2**63-1 while a C long is still only 32 bits wide, so csv.field_size_limit(sys.maxsize) fails with OverflowError: Python int too large to convert to C long. As far as I can tell, the largest value that csv.field_size_limit() currently accepts on Windows is therefore 2**31-1 (2147483647).
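A portable way to express this might look like the sketch below. It is not the fix adopted by redisgraph-bulk-loader. The helper names c_long_max and raise_field_size_limit are hypothetical, and it assumes the limit only has to fit in the platform's C long, which is what the error message suggests.

```python
import csv
import ctypes
import sys


def c_long_max() -> int:
    """Largest value a signed C long can hold on this platform (hypothetical helper)."""
    bits = 8 * ctypes.sizeof(ctypes.c_long)
    return (1 << (bits - 1)) - 1


def raise_field_size_limit() -> int:
    """Set the csv field size limit as high as the platform allows (hypothetical helper)."""
    # On 64-bit Windows a C long is 32 bits, so this picks 2**31-1;
    # on 64-bit Linux it is 64 bits, so sys.maxsize is used unchanged.
    limit = min(sys.maxsize, c_long_max())
    csv.field_size_limit(limit)
    return limit


if __name__ == "__main__":
    print(raise_field_size_limit())
```

Calling raise_field_size_limit() once at import time, in place of csv.field_size_limit(sys.maxsize), should avoid the OverflowError without hard-coding 2147483647.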

Thank you

H16C3009, Dec 18 '22 16:12