Wrong local file timestamp when using Europe/Dublin host timezone
Confirm by changing [ ] to [x] below to ensure that it's a bug:
- [x] I've gone through the User Guide and the API reference
- [x] I've searched for previous similar issues and didn't find any solution
Describe the bug Files copied from S3 to the local machine (with either cp or sync) get the wrong local timestamp (one hour ahead) on a system with the Europe/Dublin timezone set. The issue appears only when daylight saving time is not in effect.
Platform/OS/Hardware/Device
- aws-cli/1.18.114 Python/2.7.12 Linux/4.15.0-142-generic botocore/1.17.37 (Ubuntu 16.04)
- aws-cli/1.21.9 Python/3.8.10 Linux/5.4.72-microsoft-standard-WSL2 botocore/1.22.9 (WSL2 Ubuntu 20.04)
To Reproduce (observed behavior) Set local timezone to Europe/Dublin and copy or sync file(s) from S3 to local.
Expected behavior The timestamp of copied files should match that of the objects stored on S3, because the Dublin timezone currently uses GMT, which has no offset from UTC.
Logs/output
When using sync, relevant file contents from S3:
<Contents><Key>my_file</Key><LastModified>2021-12-02T17:42:01.000Z</LastModified>
Sync command trying to sync the file again, due to different timestamp detected:
2021-12-03 13:24:01,986 - MainThread - awscli.customizations.s3.syncstrategy.base - DEBUG - syncing: my_bucket/my_file -> /home/aaversa/Documents/test_s3_sync/my_file, size: 3847 -> 3847, modified time: 2021-12-02 18:42:01+01:00 -> 2021-12-02 19:42:01+01:00
The created local file has timestamp 18:42:01 instead of the expected 17:42:01.
Additional context When using Europe/London timezone the issue doesn't appear.
I think Ireland's (Eire's) reversed definition of "standard time" is what confuses several timezone libraries. In the tz database, Irish "standard time" is the summer time at UT +01 and winter GMT is expressed as a negative SAVE offset, instead of the usual arrangement where winter is "standard time" and summer is DST at +01. Libraries that don't expect a negative SAVE offset may get this wrong, see:
- https://github.com/eggert/tz/blob/main/europe#L341-L344
- https://github.com/eggert/tz/blob/main/europe#L512-L557
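To make the negative SAVE idea concrete, here is a minimal Python sketch. It is not code from aws-cli, botocore, or any timezone library: the two constants are assumed values modelled on the tz database entry, and both function names are hypothetical. It only shows how ignoring a negative SAVE value would produce a +01:00 offset for a winter date in Dublin.

```python
from datetime import datetime, timedelta, timezone

# Assumed values modelled on the tz database entry for Europe/Dublin:
# standard time is IST (UTC+1); winter GMT is expressed as SAVE = -1:00.
STDOFF = timedelta(hours=1)
WINTER_SAVE = timedelta(hours=-1)


def total_offset(stdoff, save):
    """UTC offset when the SAVE value is applied as given."""
    return stdoff + save


def offset_ignoring_negative_save(stdoff, save):
    """Hypothetical buggy variant: treats a negative SAVE as zero."""
    return stdoff + max(save, timedelta(0))


def render(utc_dt, offset):
    """Show a UTC instant as wall-clock time at the given offset."""
    return utc_dt.astimezone(timezone(offset)).isoformat(sep=" ")


# LastModified value quoted in the report, as an aware UTC datetime.
last_modified = datetime(2021, 12, 2, 17, 42, 1, tzinfo=timezone.utc)

good = render(last_modified, total_offset(STDOFF, WINTER_SAVE))
bad = render(last_modified, offset_ignoring_negative_save(STDOFF, WINTER_SAVE))
print(good)
print(bad)
```

The second printed value carries the same +01:00 offset that appears in the debug output, though that resemblance alone does not prove this is the actual cause.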
Thanks for the post, @avemar, and sorry to hear you're having trouble.
Also, thanks for the info @jnerin!
I'm having trouble reproducing. I'm doing the following, and copy and sync set the correct time on an Amazon Linux 2 EC2 as well as my Mac:
> export TZ=Europe/Dublin
> aws s3 sync s3://my-bucket/testobjects/ /tmp/
I'm wondering about this statement:
Issue presents itself only when daylight saving time is observed.
Does that mean when you performed the sync now, or when you initially put the object in S3? Dublin is not currently under DST, correct?
Thank you for taking the time, @kdaily.
Issue presents itself only when daylight saving time is observed.
I meant exactly the opposite, my bad. I updated the original comment to reflect this. The issue started as soon as Ireland moved from DST to regular (winter) GMT. So the object was created during GMT (with the correct timestamp stored in S3 as UTC), but sync and cp act up only if the local machine has the Europe/Dublin timezone set and it's not DST. You're correct, Dublin is not on DST currently.
Also, not sure if it's relevant, but on Ubuntu I used timedatectl in order to perform tests:
timedatectl set-timezone Europe/Dublin
Thanks for clarifying! I switched to using timedatectl but on an Amazon Linux 2 instance for now. Here's what mine looks like right now:
[ec2-user@ip-172-31-13-0 ~]$ sudo timedatectl set-timezone Europe/Dublin
[ec2-user@ip-172-31-13-0 ~]$ timedatectl
Local time: Thu 2021-12-09 00:34:07 GMT
Universal time: Thu 2021-12-09 00:34:07 UTC
RTC time: Thu 2021-12-09 00:34:07
Time zone: Europe/Dublin (GMT, +0000)
NTP enabled: yes
NTP synchronized: yes
RTC in local TZ: no
DST active: no
Last DST change: DST ended at
Sun 2021-10-31 01:59:59 IST
Sun 2021-10-31 01:00:00 GMT
Next DST change: DST begins (the clock jumps one hour forward) at
Sun 2022-03-27 00:59:59 GMT
Sun 2022-03-27 02:00:00 IST
Can you show me what yours is like?
I think the debug line hints at the issue: the dates are shown there with an offset of +1:00, and this is what the modified time is getting set to:
2021-12-03 13:24:01,986 - MainThread - awscli.customizations.s3.syncstrategy.base - DEBUG - syncing: my_bucket/my_file -> /home/aaversa/Documents/test_s3_sync/my_file, size: 3847 -> 3847, modified time: 2021-12-02 18:42:01+01:00 -> 2021-12-02 19:42:01+01:00
This shows that the LastModified time of the S3 object in your system's timezone is 2021-12-02 18:42:01+01:00, and it's going to sync it because it's older than the modified file time of the local file, which is 2021-12-02 19:42:01+01:00.
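For anyone who wants to check the arithmetic, a small Python sketch follows. It uses only the standard library and the three timestamps quoted in this thread; the variable names are chosen for illustration.

```python
from datetime import datetime, timezone

# LastModified as returned by S3 (UTC, trailing "Z").
s3_last_modified = datetime.strptime(
    "2021-12-02T17:42:01.000Z", "%Y-%m-%dT%H:%M:%S.%fZ"
).replace(tzinfo=timezone.utc)

# Source and destination times copied from the syncstrategy debug line.
log_format = "%Y-%m-%d %H:%M:%S%z"
source_time = datetime.strptime("2021-12-02 18:42:01+01:00", log_format)
dest_time = datetime.strptime("2021-12-02 19:42:01+01:00", log_format)

# Aware datetimes compare by instant, regardless of the offset displayed.
source_matches_s3 = source_time == s3_last_modified
drift = dest_time - source_time

print(source_matches_s3)
print(drift)
```

Under these inputs the source side is the same instant as the S3 LastModified, only displayed with a +01:00 offset, while the local file's modified time is one hour later.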
What happens if you did this with a completely different timezone with either a bigger positive offset (like +3:00, Africa/Nairobi for example), or a negative offset (like -8:00, America/Los_Angeles for example)?
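A quick way to compare zones without changing the system configuration might be the Python sketch below. It assumes a Unix-like system (time.tzset and tm_gmtoff are not available everywhere) with the tz database installed. The helper name utc_offset_seconds is made up for illustration, and it only reports what the C library tells Python, not what aws-cli computes.

```python
import os
import time
from datetime import datetime, timezone


def utc_offset_seconds(tz_name, epoch):
    """Return the local UTC offset, in seconds, for epoch under TZ=tz_name."""
    previous = os.environ.get("TZ")
    os.environ["TZ"] = tz_name
    time.tzset()  # Unix only: makes the C library re-read TZ
    try:
        return time.localtime(epoch).tm_gmtoff
    finally:
        # Restore the caller's timezone setting.
        if previous is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = previous
        time.tzset()


# Instant taken from the LastModified value quoted in this thread.
epoch = datetime(2021, 12, 2, 17, 42, 1, tzinfo=timezone.utc).timestamp()

for zone in ("UTC", "Europe/Dublin", "Europe/London",
             "Africa/Nairobi", "America/Los_Angeles"):
    print(zone, utc_offset_seconds(zone, epoch))
```

Comparing these offsets with the ones shown in the aws-cli debug line for each zone would show whether only Europe/Dublin disagrees.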
Hi,
Sorry for the late reply; I took some time to run tests on an EC2 instance as well. But first things first.
The output of timedatectl on Amazon Linux 2 is exactly like yours, while on Ubuntu and Debian 11 it looks like this:
aaversa@debian:~/test_aws$ timedatectl
Local time: Fri 2021-12-10 15:48:31 GMT
Universal time: Fri 2021-12-10 15:48:31 UTC
RTC time: Fri 2021-12-10 15:48:31
Time zone: Europe/Dublin (GMT, +0000)
System clock synchronized: no
NTP service: inactive
RTC in local TZ: no
Now, to the interesting part :) The issue is not replicable on Amazon Linux 2. I tried all possible combinations of:
- files created in different timezones and then copied to S3;
- aws-cli v1 installed in three different ways (via package manager, via pip3 -- as outlined here https://docs.aws.amazon.com/cli/v1/userguide/install-linux-al2017.html -- and also with the install script contained in https://s3.amazonaws.com/aws-cli/awscli-bundle.zip).
I tried the same combinations on both my native Ubuntu box (16.04) and on WSL2 (20.04): the issue is always there, no matter how aws-cli v1 is deployed. Furthermore: I was always using the same bucket / files for all tests across all different machines.
At this point I wanted to exclude a possible issue related to the distro, so I tried with a fresh Debian 11 instance (11.1.0).
I installed aws-cli via pip3:
aws-cli/1.22.23 Python/3.9.2 Linux/5.10.0-9-amd64 botocore/1.23.23
In this case the issue is present straight out of the box.
The Test
timedatectl set-timezone Europe/Dublin
date > test_file
aws s3 cp test_file s3://<my_bucket>/test_file
aws s3 cp s3://<my_bucket>/test_file ./test_file_from_s3
stat test_file*
stat shows that the Modify entry is one hour ahead for test_file_from_s3.
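The same check can be scripted so that the comparison does not depend on how stat renders local time. This is a minimal Python sketch, assuming you supply the LastModified value reported by S3 yourself; the function name mtime_drift_seconds and the temporary demo file are hypothetical.

```python
import os
import tempfile
from datetime import datetime, timezone


def mtime_drift_seconds(path, last_modified_utc):
    """Seconds by which the file's mtime is ahead of the S3 LastModified."""
    # st_mtime is seconds since the epoch, so no timezone is involved here.
    return os.stat(path).st_mtime - last_modified_utc.timestamp()


# Demo with a temporary file whose mtime is set one hour ahead on purpose,
# mimicking the symptom described above.
last_modified = datetime(2021, 12, 2, 17, 42, 1, tzinfo=timezone.utc)
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name
ahead = last_modified.timestamp() + 3600
os.utime(demo_path, (ahead, ahead))

drift = mtime_drift_seconds(demo_path, last_modified)
print(drift)
os.remove(demo_path)
```

A result of 0 would mean the timestamps agree; a result of 3600 corresponds to the one-hour difference reported here.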
I hope this helps you pinpoint the issue. Have a great weekend.
Thanks for the detailed post, @avemar. I'll switch to another distribution and see if I can reproduce!
Hi @avemar,
Just an update that I was able to reproduce using an AWS EC2 AMI for Ubuntu 18.04. Doing some more investigation.
Just to confirm what I see, I could reproduce this without any round trip. Doing an aws s3 ls also shows the incorrect time.
timedatectl set-timezone Europe/Dublin
date > test_file
aws s3 cp test_file s3://<my_bucket>/test_file
aws s3 ls s3://<my_bucket>/test_file
When I copied it to S3, the response I got back was:
2022-01-20 00:49:27,023 - ThreadPoolExecutor-0_0 - botocore.parsers - DEBUG - Response headers: {'x-amz-id-2': '9OCinnfOPHyX9eSz8sMZQJ8mqRUMgQl8fxFvEXZG/8NKB6PSRf8225eSG5d4+SxNJe1AEsAPwRY=', 'x-amz-request-id': 'WE49F4TJVC5AE9YY', 'Date': 'Thu, 20 Jan 2022 00:49:27 GMT', 'ETag': '"75bf222df4050f5f80bcc4af16be2b10"', 'Server': 'AmazonS3', 'Content-Length': '0'}
So the date there should be Thu, 20 Jan 2022 00:49:27 GMT.
When I ls the remote file, I get:
2022-01-20 01:49:27 29 test_file
Which shows one hour added, but the ls response body shows:
b'<?xml version="1.0" encoding="UTF-8"?>\n<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>my-bucket</Name><Prefix>test_file</Prefix><KeyCount>1</KeyCount><MaxKeys>1000</MaxKeys><Delimiter>/</Delimiter><EncodingType>url</EncodingType><IsTruncated>false</IsTruncated><Contents><Key>test_file</Key><LastModified>2022-01-20T00:49:27.000Z</LastModified><ETag>"75bf222df4050f5f80bcc4af16be2b10"</ETag><Size>29</Size><StorageClass>STANDARD</StorageClass></Contents></ListBucketResult>'
The LastModified is:
<LastModified>2022-01-20T00:49:27.000Z</LastModified>
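To rule out a parsing problem on the wire, the LastModified element can be pulled out of that body and rendered at two fixed offsets. This is only a sketch using the standard library, not how aws-cli itself formats the listing; the XML is a trimmed copy of the response above and the helper name last_modified_utc is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}

# Trimmed copy of the ListBucketResult body shown above.
body = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    b"<Contents><Key>test_file</Key>"
    b"<LastModified>2022-01-20T00:49:27.000Z</LastModified>"
    b"<Size>29</Size></Contents></ListBucketResult>"
)


def last_modified_utc(xml_bytes):
    """Extract the first LastModified value as an aware UTC datetime."""
    root = ET.fromstring(xml_bytes)
    text = root.find("s3:Contents/s3:LastModified", S3_NS).text
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


display_format = "%Y-%m-%d %H:%M:%S"
stamp = last_modified_utc(body)

# Wall-clock rendering at GMT (+00:00) and at a +01:00 offset.
as_gmt = stamp.astimezone(timezone.utc).strftime(display_format)
as_plus_one = stamp.astimezone(timezone(timedelta(hours=1))).strftime(display_format)
print(as_gmt)
print(as_plus_one)
```

The +01:00 rendering is the one that matches the time printed by aws s3 ls, which suggests the response is parsed correctly and the extra hour comes from the local timezone conversion.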
Hi @kdaily ,
Thanks for the detailed update.