DateRange Timestamp conversion on windows fails with timestamps close to EPOCH
Expected Behavior
When you use DateRange with a start date before 1970 (the Unix epoch) on Windows, it raises OSError.
OSError: [Errno 22] Invalid argument
Linking the bug ticket from Python
https://bugs.python.org/issue37527
Current Behavior
Currently it works as intended on any non-Windows OS. The workaround is to pass datetime objects with the timezone set to UTC.
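A minimal sketch of why the timezone matters, using only the standard library. It assumes the failure comes from `datetime.timestamp()` on a naive value, which has to go through the platform's local-time conversion, whereas an aware value is converted by plain subtraction from the epoch. The variable names are illustrative and not part of dbldatagen.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

naive_start = datetime.fromisoformat("1910-10-01T00:00:00")
aware_start = naive_start.replace(tzinfo=timezone.utc)

# Aware datetime: the timestamp is the difference from the epoch,
# so no operating-system time function is involved.
aware_ts = aware_start.timestamp()

# Naive datetime: interpreted as local time, which relies on the
# platform's conversion. On Windows this may raise OSError for
# pre-1970 values; on Linux and macOS it normally succeeds.
try:
    naive_ts = naive_start.timestamp()
except (OSError, OverflowError, ValueError) as exc:
    naive_ts = None
    print(f"naive conversion failed: {exc!r}")

print(aware_ts, naive_ts)
```

The aware value is negative because it lies before the epoch, which is the range the linked Python issue describes as problematic on Windows.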
Steps to Reproduce (for bugs)
This code fails and raises OSError on Windows.
import dbldatagen as dg

# Assumes an existing SparkSession named `spark`.
testDataSpec = (
    dg.DataGenerator(spark, name="test_data_set1", rows=1000, partitions=4)
    .withColumn(
        "purchase_date",
        "date",
        data_range=dg.DateRange("1910-10-01 00:00:00", "1950-10-06 11:55:00", "days=3"),
        random=True,
    )
)
Context
Your Environment
- dbldatagen version used:
- Databricks Runtime version:
- Cloud environment used:
Can you provide more details of the workaround? If there is a valid workaround, we will document it. However, the intended runtime environment is the Databricks cloud environment, and the library is tested there and on local Linux or similar environments, so we cannot validate it on Windows.
While we don't block it running on other environments, the intent is to support it running on a Databricks cloud environment or developing locally in preparation for use on a Databricks cloud environment.
The following example shows the use of datetime instances to define the range:
import dbldatagen as dg
from datetime import datetime, timezone

startingTime = datetime.fromisoformat("1910-10-01T00:00:00").replace(tzinfo=timezone.utc)
endingTime = datetime.fromisoformat("1950-10-06T11:55:00").replace(tzinfo=timezone.utc)

testDataSpec = (
    dg.DataGenerator(spark, name="test_data_set1", rows=1000, partitions=4)
    .withColumn(
        "purchase_date",
        "date",
        data_range=dg.DateRange(startingTime, endingTime, "days=3"),
        random=True,
    )
)

display(testDataSpec.build())
Sorry for the late answer. Yes, that is what I use as a workaround. This one also works:
from datetime import datetime, timezone
now = datetime.now()
utc_now = now.astimezone(timezone.utc)
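For fixed range boundaries such as the ones in this issue, the two approaches are not interchangeable, so here is a hedged sketch of a small helper. The name `utc_datetime` is hypothetical and not part of dbldatagen. It attaches UTC with `replace(tzinfo=...)`, which only labels the value and never consults the local zone. By contrast, `astimezone()` on a naive value first has to interpret it as local time, which may go through the same platform conversion for pre-1970 dates on Windows.

```python
from datetime import datetime, timezone


def utc_datetime(iso_text: str) -> datetime:
    """Parse an ISO 8601 string and label it as UTC (hypothetical helper).

    replace() keeps the wall-clock fields unchanged and only sets tzinfo,
    so no local-time lookup is needed.
    """
    parsed = datetime.fromisoformat(iso_text)
    if parsed.tzinfo is not None:
        # Already aware: converting between fixed offsets is pure arithmetic.
        return parsed.astimezone(timezone.utc)
    return parsed.replace(tzinfo=timezone.utc)


start = utc_datetime("1910-10-01T00:00:00")
end = utc_datetime("1950-10-06T11:55:00")
print(start.isoformat(), end.isoformat())
```

The resulting `start` and `end` values can be passed to `dg.DateRange` in the same way as `startingTime` and `endingTime` in the earlier example.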