[Python] Timestamp - out of bounds for nanoseconds
Describe the bug, including details regarding any error messages, version, and platform.
Environment
OS: Windows/Linux
Python: 3.11.2
PyArrow: 17.0.0
pandas: 2.2.2
Description
When converting a datetime object earlier than the pandas minimum timestamp, 1677-09-21 00:12:43.145224193, into a pyarrow table with a timestamp(unit="ns") column, the conversion fails with an "out of bounds for nanoseconds" exception.
I have found issues that might be related, but none of them solved the problem here.
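For context, the minimum quoted above matches what a signed 64-bit count of nanoseconds since 1970-01-01 can represent. The following is only a rough stdlib sketch of that arithmetic, not pyarrow's actual conversion code; the helper names to_epoch_ns and fits_in_ns are made up for illustration, and it assumes naive datetimes measured from the Unix epoch.

```python
import datetime

# Assumption: nanosecond timestamps are stored as a signed 64-bit
# count of nanoseconds since the Unix epoch.
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
EPOCH = datetime.datetime(1970, 1, 1)


def to_epoch_ns(ts):
    """Exact integer nanoseconds between the epoch and a naive datetime."""
    delta = ts - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


def fits_in_ns(ts):
    """True if the datetime can be stored as int64 nanoseconds."""
    return INT64_MIN <= to_epoch_ns(ts) <= INT64_MAX


print(fits_in_ns(datetime.datetime(1677, 9, 21, 1)))  # inside the range
print(fits_in_ns(datetime.datetime(1, 1, 1, 1)))      # far below the range
```

Under this assumption, the second value from the example code is roughly seven times further from the epoch than an int64 nanosecond count can reach.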
Example Code
import pyarrow as pa
import datetime
schema = pa.schema([])
schema = schema.append(pa.field("CreateAt", pa.timestamp(unit="ns")))
ts = datetime.datetime(1677, 9, 21, 1) # OK
arrays = [[ts]]
print(arrays)
table = pa.Table.from_arrays(arrays, schema=schema)
print(table)
ts = datetime.datetime(1, 1, 1, 1) # not OK: out of bounds for nanoseconds
arrays = [[ts]]
print(arrays)
table = pa.Table.from_arrays(arrays, schema=schema)
print(table)
Use Case
I am reading data from a database in which one column holds timestamps with ns precision. Instead of null values, the column uses 0001-01-01 00:00:00.0000000. The goal is to store the result of the database read, which is an array of datetime objects, in a pyarrow table and then write it out as parquet. This works well until I hit a timestamp that is too large or too small for the range pandas supports at nanosecond precision.
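Since the out-of-range value in this use case is a placeholder for null, one possible workaround is to map it to None before building the table. This is a minimal sketch and not a fix for the underlying behaviour; SENTINEL and sentinel_to_none are hypothetical names, and it assumes the database driver returns exactly datetime.datetime(1, 1, 1) for the placeholder.

```python
import datetime

# Assumption: the database's stand-in for NULL arrives as this exact value.
SENTINEL = datetime.datetime(1, 1, 1)


def sentinel_to_none(values, sentinel=SENTINEL):
    """Replace the placeholder timestamp with None, leaving other values as-is."""
    return [None if value == sentinel else value for value in values]


rows = [datetime.datetime(1, 1, 1), datetime.datetime(2024, 5, 17, 12, 30)]
cleaned = sentinel_to_none(rows)
print(cleaned)
```

The cleaned list could then be used in place of [ts] in the example code, so the placeholder becomes a null entry in the column. If microsecond precision were acceptable, a coarser unit such as pa.timestamp(unit="us") might be another option, since its int64 range is far wider, but that would change the column's precision.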
Component(s)
Python