Wrong local file timestamp when using Europe/Dublin host timezone

Open avemar opened this issue 4 years ago • 9 comments

Describe the bug

Files copied from S3 to a local path (with either cp or sync) get the wrong local timestamp (one hour ahead) on a system with the Europe/Dublin timezone set. The issue appears only when daylight saving time is not in effect.

Platform/OS/Hardware/Device

  • aws-cli/1.18.114 Python/2.7.12 Linux/4.15.0-142-generic botocore/1.17.37 (Ubuntu 16.04)
  • aws-cli/1.21.9 Python/3.8.10 Linux/5.4.72-microsoft-standard-WSL2 botocore/1.22.9 (WSL2, Ubuntu 20.04)

To Reproduce (observed behavior)

Set the local timezone to Europe/Dublin and copy or sync file(s) from S3 to local.

Expected behavior

Timestamps of copied files should match those of the objects stored on S3, because the Dublin timezone currently uses GMT, which has no offset from UTC.
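The expected offset can be checked independently of aws-cli with Python's standard zoneinfo module (Python 3.9+). This is only a minimal sketch: the helper name dublin_offset_hours is hypothetical, and it returns None when the system has no tz database installed.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def dublin_offset_hours(utc_iso):
    """Return the Europe/Dublin UTC offset, in hours, at an S3-style UTC timestamp.

    Returns None if the tz database is not available on this system.
    """
    try:
        zone = ZoneInfo("Europe/Dublin")
    except ZoneInfoNotFoundError:
        return None
    # S3 reports LastModified as e.g. 2021-12-02T17:42:01.000Z (always UTC).
    instant = datetime.strptime(utc_iso, "%Y-%m-%dT%H:%M:%S.%fZ")
    instant = instant.replace(tzinfo=timezone.utc)
    return instant.astimezone(zone).utcoffset().total_seconds() / 3600


# Timestamp taken from the LastModified value in this report.
print(dublin_offset_hours("2021-12-02T17:42:01.000Z"))
```

With a tz database present, a winter timestamp should give an offset of zero hours, which is the behavior this report expects from cp and sync.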

Logs/output

When using sync, the relevant part of the listing returned by S3:

<Contents><Key>my_file</Key><LastModified>2021-12-02T17:42:01.000Z</LastModified>

The sync command tries to sync the file again because it detects a different timestamp:

2021-12-03 13:24:01,986 - MainThread - awscli.customizations.s3.syncstrategy.base - DEBUG - syncing: my_bucket/my_file -> /home/aaversa/Documents/test_s3_sync/my_file, size: 3847 -> 3847, modified time: 2021-12-02 18:42:01+01:00 -> 2021-12-02 19:42:01+01:00

The created local file has timestamp 18:42:01 instead of the expected 17:42:01.

Additional context

When using the Europe/London timezone, the issue doesn't appear.

avemar avatar Dec 06 '21 08:12 avemar

I think Ireland's (Eire) reversed definition of "standard time" confuses several timezone libraries. In the tz database, Irish standard time is the summer time at UT +01, and winter GMT is modeled as DST with a negative SAVE offset; the usual arrangement is winter as standard time and summer as DST with a positive offset. Libraries that don't expect a negative SAVE offset can get this wrong, see:

  1. https://github.com/eggert/tz/blob/main/europe#L341-L344
  2. https://github.com/eggert/tz/blob/main/europe#L512-L557
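One way to see whether a given machine's tz data encodes the negative SAVE is to ask the C library directly through Python's time module. The sketch below is a diagnostic only, under stated assumptions: the helper name probe_localtime is hypothetical, time.tzset is Unix-only, and the idea that affected systems report tm_isdst=1 in winter for Europe/Dublin is a hypothesis that this thread does not confirm.

```python
import calendar
import os
import time


def probe_localtime(tz_name, utc_tuple):
    """Report what the C library says about tz_name at a given UTC instant."""
    previous = os.environ.get("TZ")
    os.environ["TZ"] = tz_name
    time.tzset()  # Unix only: re-read TZ from the environment
    try:
        epoch = calendar.timegm(utc_tuple)
        lt = time.localtime(epoch)
        return {
            "tm_isdst": lt.tm_isdst,        # DST flag as the tz data defines it
            "tm_gmtoff": lt.tm_gmtoff,      # actual offset from UTC, in seconds
            "tm_zone": lt.tm_zone,          # abbreviation, e.g. GMT or IST
            "time.timezone": time.timezone, # seconds west of UTC, "standard"
            "time.altzone": time.altzone,   # seconds west of UTC, "alternate"
        }
    finally:
        # Restore the caller's TZ so the probe has no lasting side effect.
        if previous is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = previous
        time.tzset()


# The instant of the LastModified value from the original report.
print(probe_localtime("Europe/Dublin", (2021, 12, 2, 17, 42, 1)))
```

If tm_gmtoff is 0 while tm_isdst is 1, any code that picks the alternate offset whenever the DST flag is set would add an hour in winter. Comparing this output between a working and a failing distribution may help narrow things down.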

jnerin avatar Dec 07 '21 08:12 jnerin

Thanks for the post, @avemar, and sorry to hear you're having trouble.

Also, thanks for the info @jnerin!

I'm having trouble reproducing this. I'm doing the following, and both copy and sync set the correct time on an Amazon Linux 2 EC2 instance as well as on my Mac:

> export TZ=Europe/Dublin
> aws s3 sync s3://my-bucket/testobjects/ /tmp/

I'm wondering about this statement:

Issue presents itself only when daylight saving time is observed.

Does that mean when you performed the sync now, or when you initially put the object in S3? Dublin is not currently under DST, correct?

kdaily avatar Dec 07 '21 20:12 kdaily

Thank you for taking the time @kdaily .

Issue presents itself only when daylight saving time is observed.

I meant exactly the opposite, my bad. I updated the original comment to reflect this. The issue started as soon as Ireland moved from DST to regular (winter) GMT. So the object was created during GMT (with the correct timestamp stored in S3 as UTC), but sync and cp misbehave only if the local machine has the Europe/Dublin timezone set and DST is not in effect. You're correct, Dublin is not on DST currently.

Also, not sure if it's relevant, but on Ubuntu I used timedatectl in order to perform tests: timedatectl set-timezone Europe/Dublin.

avemar avatar Dec 08 '21 07:12 avemar

Thanks for clarifying! I switched to using timedatectl but on an Amazon Linux 2 instance for now. Here's what mine looks like right now:

[ec2-user@ip-172-31-13-0 ~]$ sudo timedatectl set-timezone Europe/Dublin
[ec2-user@ip-172-31-13-0 ~]$ timedatectl
      Local time: Thu 2021-12-09 00:34:07 GMT
  Universal time: Thu 2021-12-09 00:34:07 UTC
        RTC time: Thu 2021-12-09 00:34:07
       Time zone: Europe/Dublin (GMT, +0000)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: no
 Last DST change: DST ended at
                  Sun 2021-10-31 01:59:59 IST
                  Sun 2021-10-31 01:00:00 GMT
 Next DST change: DST begins (the clock jumps one hour forward) at
                  Sun 2022-03-27 00:59:59 GMT
                  Sun 2022-03-27 02:00:00 IST

Can you show me what yours is like?

I think the debug line hints at the issue: the dates are shown there with an offset of +1:00, and this is what the modified time is getting set to:

2021-12-03 13:24:01,986 - MainThread - awscli.customizations.s3.syncstrategy.base - DEBUG - syncing: my_bucket/my_file -> /home/aaversa/Documents/test_s3_sync/my_file, size: 3847 -> 3847, modified time: 2021-12-02 18:42:01+01:00 -> 2021-12-02 19:42:01+01:00

This shows that the LastModified time of the S3 object in your system's timezone is 2021-12-02 18:42:01+01:00, and it's going to sync it because it's older than the modified file time of the local file, which is 2021-12-02 19:42:01+01:00.
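The arithmetic behind that debug line can be checked with a few lines of Python. This sketch only parses the two timestamps quoted in the log; it does not reproduce aws-cli's sync logic.

```python
from datetime import datetime, timezone

# The two "modified time" values from the DEBUG line above.
s3_side = datetime.fromisoformat("2021-12-02 18:42:01+01:00")
local_side = datetime.fromisoformat("2021-12-02 19:42:01+01:00")


def as_utc(dt):
    """Convert an offset-aware datetime to UTC."""
    return dt.astimezone(timezone.utc)


gap_seconds = (local_side - s3_side).total_seconds()

print(as_utc(s3_side).isoformat())     # 2021-12-02T17:42:01+00:00
print(as_utc(local_side).isoformat())  # 2021-12-02T18:42:01+00:00
print(gap_seconds)                     # 3600.0
```

The S3 side is the correct instant (17:42:01 UTC, matching LastModified) rendered with a +01:00 offset, while the local file's modified time is a full hour later as an instant, so the two never compare equal.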

What happens if you do this with a completely different timezone, with either a bigger positive offset (like +3:00, Africa/Nairobi for example) or a negative offset (like -8:00, America/Los_Angeles for example)?

kdaily avatar Dec 09 '21 00:12 kdaily

Hi, sorry for my late reply; I took some time to run tests on an EC2 instance as well. But first things first: the output of timedatectl on Amazon Linux 2 is exactly like yours, while on Ubuntu and Debian 11 it looks like this:

aaversa@debian:~/test_aws$ timedatectl
               Local time: Fri 2021-12-10 15:48:31 GMT
           Universal time: Fri 2021-12-10 15:48:31 UTC
                 RTC time: Fri 2021-12-10 15:48:31
                Time zone: Europe/Dublin (GMT, +0000)
System clock synchronized: no
              NTP service: inactive
          RTC in local TZ: no

Now, to the interesting part :) The issue is not reproducible on Amazon Linux 2. I tried all possible combinations of:

  • files created on different timezones and then copied to S3;
  • aws-cli v1 installed in three different ways (via package manager, via pip3 -- as outlined here https://docs.aws.amazon.com/cli/v1/userguide/install-linux-al2017.html -- and also with the install script contained in https://s3.amazonaws.com/aws-cli/awscli-bundle.zip).

I tried the same combinations on both my native Ubuntu box (16.04) and on WSL2 (20.04): the issue is always there, no matter how aws-cli v1 is deployed. Furthermore, I always used the same bucket and files for all tests across all machines.

At this point I wanted to exclude a possible issue related to the distro, so I tried with a fresh Debian 11 instance (11.1.0). I installed aws-cli via pip3: aws-cli/1.22.23 Python/3.9.2 Linux/5.10.0-9-amd64 botocore/1.23.23

In this case the issue is present straight out of the box.

The Test

timedatectl set-timezone Europe/Dublin
date > test_file
aws s3 cp test_file s3://<my_bucket>/test_file
aws s3 cp s3://<my_bucket>/test_file ./test_file_from_s3
stat test_file*

stat shows that the Modify entry of test_file_from_s3 is one hour ahead.

I hope this helps you pinpoint the issue. Have a great weekend.
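Because stat renders times in the local timezone, it can help to compare raw epoch seconds instead, which do not depend on any timezone rendering. A minimal sketch, assuming the object's LastModified value is already known (for example from the debug log); the helper name mtime_skew_seconds is hypothetical, and the temporary file only simulates a download whose mtime is one hour late.

```python
import os
import tempfile
from datetime import datetime, timezone


def mtime_skew_seconds(path, last_modified_iso):
    """Seconds by which a file's mtime is ahead of an S3 LastModified value."""
    expected = datetime.strptime(last_modified_iso, "%Y-%m-%dT%H:%M:%S.%fZ")
    expected = expected.replace(tzinfo=timezone.utc)
    # st_mtime is seconds since the epoch, independent of the local timezone.
    return os.stat(path).st_mtime - expected.timestamp()


# Simulate the symptom: a file whose mtime is one hour after LastModified.
last_modified = "2021-12-02T17:42:01.000Z"
expected_epoch = datetime(2021, 12, 2, 17, 42, 1, tzinfo=timezone.utc).timestamp()
with tempfile.NamedTemporaryFile(delete=False) as handle:
    sample_path = handle.name
os.utime(sample_path, (expected_epoch + 3600, expected_epoch + 3600))

skew = mtime_skew_seconds(sample_path, last_modified)
print(skew)  # 3600.0
os.remove(sample_path)
```

A skew of 0 means the file carries the right instant and only the display is off; a skew of 3600 means the wrong instant was written to disk.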

avemar avatar Dec 10 '21 16:12 avemar

Thanks for the detailed post, @avemar. I'll switch to another distribution and see if I can reproduce!

kdaily avatar Dec 10 '21 16:12 kdaily

Hi @avemar,

Just an update that I was able to reproduce using an AWS EC2 AMI for Ubuntu 18.04. Doing some more investigation.

kdaily avatar Jan 20 '22 00:01 kdaily

Just to confirm what I see, I could reproduce this without any round trip. Doing an aws s3 ls also shows the incorrect time.

timedatectl set-timezone Europe/Dublin
date > test_file
aws s3 cp test_file s3://<my_bucket>/test_file
aws s3 ls s3://<my_bucket>/test_file

When I copied it to S3, the response back I got was:

2022-01-20 00:49:27,023 - ThreadPoolExecutor-0_0 - botocore.parsers - DEBUG - Response headers: {'x-amz-id-2': '9OCinnfOPHyX9eSz8sMZQJ8mqRUMgQl8fxFvEXZG/8NKB6PSRf8225eSG5d4+SxNJe1AEsAPwRY=', 'x-amz-request-id': 'WE49F4TJVC5AE9YY', 'Date': 'Thu, 20 Jan 2022 00:49:27 GMT', 'ETag': '"75bf222df4050f5f80bcc4af16be2b10"', 'Server': 'AmazonS3', 'Content-Length': '0'}

So the date there should be Thu, 20 Jan 2022 00:49:27 GMT.

When I ls the remote file, I get:

2022-01-20 01:49:27         29 test_file

Which shows one hour added, but the ls response body shows:

b'<?xml version="1.0" encoding="UTF-8"?>\n<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>my-bucket</Name><Prefix>test_file</Prefix><KeyCount>1</KeyCount><MaxKeys>1000</MaxKeys><Delimiter>/</Delimiter><EncodingType>url</EncodingType><IsTruncated>false</IsTruncated><Contents><Key>test_file</Key><LastModified>2022-01-20T00:49:27.000Z</LastModified><ETag>&quot;75bf222df4050f5f80bcc4af16be2b10&quot;</ETag><Size>29</Size><StorageClass>STANDARD</StorageClass></Contents></ListBucketResult>'

The LastModified is:

<LastModified>2022-01-20T00:49:27.000Z</LastModified>
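The three values above can be cross-checked with the Python standard library alone. This sketch uses a trimmed copy of the response body shown above (only the elements needed here), the Date header, and the time printed by ls; it does not call S3 or aws-cli.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}

# Trimmed copy of the ListBucketResult body from the debug output.
body = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    b"<Contents><Key>test_file</Key>"
    b"<LastModified>2022-01-20T00:49:27.000Z</LastModified>"
    b"</Contents></ListBucketResult>"
)

root = ET.fromstring(body)
text = root.find("s3:Contents/s3:LastModified", NS).text
last_modified = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
last_modified = last_modified.replace(tzinfo=timezone.utc)

# Date header returned when the file was uploaded.
header_date = parsedate_to_datetime("Thu, 20 Jan 2022 00:49:27 GMT")

# Local time printed by `aws s3 ls` (no offset shown, so parsed as naive).
shown = datetime.strptime("2022-01-20 01:49:27", "%Y-%m-%d %H:%M:%S")
implied_offset = shown - last_modified.replace(tzinfo=None)

print(header_date == last_modified)  # True
print(implied_offset)                # 1:00:00
```

The service-side values agree with each other, so the extra hour is introduced on the client when the UTC time is converted to local time for display.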

kdaily avatar Jan 20 '22 00:01 kdaily

Hi @kdaily ,

Thanks for the detailed update.

avemar avatar Jan 25 '22 15:01 avemar