PynamoDB
Getting a ValueError when deserializing a valid UTCDateTimeAttribute
Hello,
I'm stuck on a ValueError when I use the get() method on the following model:
class Job(Model):
    class Meta:
        table_name = environ.get('TABLE_NAME')
        aws_access_key_id = environ.get('AWS_ACCESS_KEY_ID')
        aws_secret_access_key = environ.get('AWS_SECRET_ACCESS_KEY')
        region = 'eu-west-3'

    id = NumberAttribute(hash_key=True)
    name = UnicodeAttribute()
    tenant = UnicodeAttribute()
    query = UnicodeAttribute()
    schedule = UnicodeAttribute()
    active = BooleanAttribute()
    created_by = UnicodeAttribute()
    modified_by = UnicodeAttribute()
    creation_date = UTCDateTimeAttribute()
    modification_date = UTCDateTimeAttribute()
The item I'm trying to retrieve has the following values:
{
  "id": {
    "N": "1"
  },
  "tenant": {
    "S": "TRN"
  },
  "active": {
    "BOOL": true
  },
  "schedule": {
    "S": "every monday"
  },
  "query": {
    "S": "select * from *"
  },
  "created_by": {
    "S": "Lucas"
  },
  "modification_date": {
    "S": "2021-09-06 17:04:21.899683"
  },
  "modified_by": {
    "S": "Lucas"
  },
  "name": {
    "S": "Test Job 1"
  },
  "creation_date": {
    "S": "2021-09-06 17:04:21.899683"
  }
}
But I get the following stack trace:
File "/mnt/c/Users/l.pierru/Documents/Focal Microservices/infor-data-lake-to-s3/test_scan.py", line 55, in <module>
job = Job.get(2)
File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/models.py", line 542, in get
return cls.from_raw_data(item_data)
File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/models.py", line 556, in from_raw_data
return cls._instantiate(data)
File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/attributes.py", line 400, in _instantiate
AttributeContainer._container_deserialize(instance, attribute_values)
File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/attributes.py", line 380, in _container_deserialize
value = attr.deserialize(attr.get_value(attribute_value))
File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/attributes.py", line 697, in deserialize
return self._fast_parse_utc_date_string(value)
File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/attributes.py", line 717, in _fast_parse_utc_date_string
raise ValueError("Datetime string '{}' does not match format '{}'".format(date_string, DATETIME_FORMAT))
ValueError: Datetime string '000002021-09-06 17:31:31.429277' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'
I already tried to create the item multiple times, checking for hidden characters and such. I don't know where the five zeroes before the year come from. I can't see those when using the AWS CLI or the website editor so I think it's coming from the pynamodb module.
Any help on this would be appreciated.
It doesn't have a T in the middle, nor a timezone. Not sure what encoded this value, but perhaps it's best if you just map it as a UnicodeAttribute?
Well spotted! I didn't see the T was missing in the format :)
The issue is still the same though, updated Traceback:
Traceback (most recent call last):
File "/mnt/c/Users/l.pierru/Documents/Focal Microservices/infor-data-lake-to-s3/test_scan.py", line 41, in <module>
for job in Job.batch_get(ids):
File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/models.py", line 367, in batch_get
yield cls.from_raw_data(batch_item)
File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/models.py", line 556, in from_raw_data
return cls._instantiate(data)
File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/attributes.py", line 400, in _instantiate
AttributeContainer._container_deserialize(instance, attribute_values)
File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/attributes.py", line 380, in _container_deserialize
value = attr.deserialize(attr.get_value(attribute_value))
File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/attributes.py", line 697, in deserialize
return self._fast_parse_utc_date_string(value)
File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/attributes.py", line 717, in _fast_parse_utc_date_string
raise ValueError("Datetime string '{}' does not match format '{}'".format(date_string, DATETIME_FORMAT))
ValueError: Datetime string '000002021-09-07T10:04:58.728100' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'
Still no timezone. Also, the year is padded with zeroes. What are you serializing this with? Why do you want to deserialize it with UTCDateTimeAttribute? You can also implement your own attribute to parse whatever format you serialized.
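For reference, here is a minimal sketch of the difference between a naive timestamp and one that matches the format quoted in the error message. It uses only the standard library; the DATETIME_FORMAT constant below is copied from the error text and the sample value from the traceback, not taken from PynamoDB itself.

```python
from datetime import datetime, timezone

# Format string copied from the ValueError message above.
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"

# A naive datetime (no tzinfo), like the value in the second traceback.
naive = datetime(2021, 9, 7, 10, 4, 58, 728100)
naive_string = naive.isoformat()

# The same instant, explicitly marked as UTC and written with the expected format.
aware = naive.replace(tzinfo=timezone.utc)
aware_string = aware.strftime(DATETIME_FORMAT)

print(naive_string)  # 2021-09-07T10:04:58.728100
print(aware_string)  # 2021-09-07T10:04:58.728100+0000
```

The second string carries the "+0000" offset that the first one lacks, which is what "no timezone" refers to.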
Getting the same error with UTCDateTimeAttribute:
ValueError: Datetime string '00000002022-02-26T18:21:33.034Z' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'
The DynamoDB table has these values stored as strings in the created_at_utc field:
2022-02-26T18:21:33.034Z
The formatting generated by the Python script is exactly the same as the modified_at_utc column, which contains a timestamp generated by Step Functions with $$.Execution.StartTime:
# $$.Execution.StartTime timestamp
2022-02-26T18:21:57.318Z
For some reason the extra zeros are being added.
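One possible explanation, as a guess rather than something confirmed against the PynamoDB source: the zeros look like what you get if the parser left-pads the stored string to 31 characters (the length of a timestamp in the expected format) before validating it. The sketch below reproduces the padding with plain str.zfill on the values quoted in this thread.

```python
# Values as stored in DynamoDB, taken from the posts above.
samples = [
    "2021-09-06 17:04:21.899683",  # 26 characters
    "2022-02-26T18:21:33.034Z",    # 24 characters
]

# Assumption: the parser pads to 31 characters, the length of
# a string such as "2021-09-06T17:04:21.899683+0000".
padded = [s.zfill(31) for s in samples]

for original, result in zip(samples, padded):
    print(len(original), result)
# 26 000002021-09-06 17:04:21.899683
# 24 00000002022-02-26T18:21:33.034Z
```

If that is what happens, the zeros are only a side effect shown in the error message; the stored value is untouched and the real mismatch is the missing "+0000" offset (and, in the first case, the missing T).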
I just had this issue as well while migrating similar items into an existing pynamodb model. Not an ideal fix but I was able to do the following to get around it.
import dateutil.parser
import json
# item is a dict representing keys and values that match up to the attributes in my pynamodb model
new_timestamp = dateutil.parser.parse(item['timestamp'])
item.pop('timestamp')
# this line used to error because of the timestamp
# now that I've popped timestamp off it doesn't
thing = Thing(**item)
# then I can put timestamp back on after from_json runs without error
thing.timestamp = new_timestamp
thing.save()
I have a similar problem:
ValueError: Datetime string '2022-05-09T12:04:56.479663+00:00' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'
So I'm getting this error with a valid ISO 8601 format 🤯
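A likely cause, assuming the check is stricter than strptime: the offset here is written as "+00:00", while strftime with %z writes "+0000" with no colon. A small standard-library sketch of rewriting the value (the variable names are just for illustration):

```python
from datetime import datetime

# Format string copied from the error message.
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"

stored = "2022-05-09T12:04:56.479663+00:00"

# fromisoformat() accepts the colon in the UTC offset.
parsed = datetime.fromisoformat(stored)

# strftime() with %z writes the offset without the colon.
rewritten = parsed.strftime(DATETIME_FORMAT)

print(rewritten)                     # 2022-05-09T12:04:56.479663+0000
print(len(stored), len(rewritten))   # 32 31
```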
I am getting the same error when using from_raw_data: extra zeros are added when the date is converted into the PynamoDB model.
In each case it seems that the data has been populated in DynamoDB through some means other than PynamoDB's UTCDateTimeAttribute. We probably don't want to make a flexible parser to handle what is effectively data corruption.
You might try overriding deserialize on your model and, once all data has been converted to the "good" state, removing the override from your code:
def deserialize(self, values: Any) -> Any:
    if values is not None:
        if has_bad_format(values['my_time_attr']['S']):
            values['my_time_attr']['S'] = convert_bad_to_good(values['my_time_attr']['S'])
    return super().deserialize(values)
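The two helpers are left undefined above. Here is one possible sketch of them using only the standard library. It assumes the "good" state is the 31-character layout from the error message, and that timestamps stored without an offset were written in UTC; both are assumptions to check against your own data.

```python
import re
from datetime import datetime, timezone

# Format string copied from the error message.
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"

# Assumed "good" layout: e.g. 2021-09-06T17:04:21.899683+0000
GOOD_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{6}\+0000")


def has_bad_format(value: str) -> bool:
    """Return True if the string is not in the assumed 'good' layout."""
    return GOOD_PATTERN.fullmatch(value) is None


def convert_bad_to_good(value: str) -> str:
    """Rewrite an ISO-like timestamp into the assumed 'good' layout."""
    # fromisoformat() rejects a trailing "Z" before Python 3.11.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: naive timestamps were written in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime(DATETIME_FORMAT)


print(convert_bad_to_good("2021-09-06 17:04:21.899683"))
# 2021-09-06T17:04:21.899683+0000
print(convert_bad_to_good("2022-02-26T18:21:33.034Z"))
# 2022-02-26T18:21:33.034000+0000
```

Strings that are not ISO-like at all will still raise ValueError from fromisoformat(), which is probably what you want during a migration.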
I'm just trying out pynamodb, but this immediately caught me off guard, as I use a 2021-10-07T17:01:04.016Z type format for all my datetime fields. I ended up writing my own custom attribute, which was surprisingly easy to do. It's not the most efficient, but works well for me:
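from datetime import datetime, timezone
from typing import Union

from dateutil.parser import parse
from pynamodb.attributes import Attribute
from pynamodb.constants import STRING


class AWSISODateTimeAttribute(Attribute[datetime]):
    attr_type = STRING

    def serialize(self, value: Union[datetime, str]) -> str:
        # Accept either a datetime or an ISO 8601 string.
        t = parse(value) if isinstance(value, str) else value
        # Treat naive datetimes as UTC.
        if t.tzinfo is None:
            t = t.replace(tzinfo=timezone.utc)
        # Produces e.g. 2021-10-07T17:01:04.016Z
        return t.astimezone(timezone.utc).isoformat(timespec='milliseconds').replace("+00:00", "Z")

    def deserialize(self, value: str) -> datetime:
        # Parse the stored string so reads return a datetime, matching Attribute[datetime].
        return parse(value)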