AWS S3 Synchronization

Open Linus-007 opened this issue 2 years ago • 4 comments

Does Publii synchronize files on S3 that changed or overwrite everything? I ask because a simple change to one page causes an extended upload of 200+ items. To save time on responses, I do not have versioned storage turned on in AWS S3.

I made one change and looked at my S3 bucket after pushing it. Every item in the bucket has a new date and time. Why upload everything rather than maintain a list of what has changed and upload only the changes? This can increase egress charges and PUT request charges. In five days, I have accumulated over 20,000 PUT requests for simple edits.

Linus-007 avatar Dec 07 '21 05:12 Linus-007

In general, Publii only syncs files that are new or have changed. But it all depends on the changes you have made. For example, changing a post's title, date, etc. also changes the content of other pages, e.g. related posts. So a small change may have to be reflected in most of the posts (files).

bobmitro avatar Dec 07 '21 06:12 bobmitro

It looks like this is the same as issue #377. files.publii.json is not being uploaded by Publii despite the fact that the IAM user has AmazonS3FullAccess. In the log viewer, deployment-process.log, I can see that the last file with checksums is not uploaded.

[Tue, 07 Dec 2021 06:46:21 GMT] UPL 404.html -> 404.html
[Tue, 07 Dec 2021 06:46:22 GMT] -> files.publii.json
[Tue, 07 Dec 2021 06:46:22 GMT] S3 ERROR: Access Denied

Regarding the S3 settings in the server configuration, none of the ACL settings allows files.publii.json to upload.

My S3 bucket is not public. I am using CloudFront and Route 53. My last successful files.publii.json upload was when I had the S3 bucket publicly exposed. I am opposed to this for security reasons. Users on the web should only be able to see the content through my website and not discover it in the bucket.

Linus-007 avatar Dec 07 '21 06:12 Linus-007

Oh yes, if the files.publii.json file is not uploaded, all files are sent to the server again.
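The behaviour described here can be pictured with a small sketch. This is not Publii's actual code: the function names `checksum` and `planSync`, the manifest shape (path mapped to checksum), and the choice of MD5 are all assumptions made for illustration. It only shows why a missing remote manifest forces every file to be uploaded again.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: hash a file's content (the algorithm is an assumption).
function checksum(content) {
    return crypto.createHash('md5').update(content).digest('hex');
}

const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// localFiles: { path: content }
// remoteManifest: { path: checksum }, or null when no manifest exists remotely
function planSync(localFiles, remoteManifest) {
    const manifest = remoteManifest || {};
    const toUpload = [];
    const newManifest = {};

    for (const [path, content] of Object.entries(localFiles)) {
        const sum = checksum(content);
        newManifest[path] = sum;
        // Upload when the file is new or its checksum differs from the manifest
        if (!has(manifest, path) || manifest[path] !== sum) {
            toUpload.push(path);
        }
    }

    // Files listed remotely that no longer exist locally
    const toDelete = Object.keys(manifest).filter((path) => !has(localFiles, path));

    return { toUpload, toDelete, newManifest };
}

// Usage: first deploy, then a one-page edit, then the same edit with no manifest
const site = { 'index.html': '<h1>Home</h1>', '404.html': '<h1>Not found</h1>' };
const first = planSync(site, null);
const edited = { ...site, 'index.html': '<h1>Home, updated</h1>' };
const withManifest = planSync(edited, first.newManifest);
const withoutManifest = planSync(edited, null);
console.log(withManifest.toUpload, withoutManifest.toUpload);
```

With a manifest from the previous deploy, only the edited file is queued; without one, every local file is treated as new.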

bobmitro avatar Dec 07 '21 06:12 bobmitro

I do not think that #377 satisfactorily resolved the issue. I read the entire thread and examined the other issues. I think "authenticated-read" (#62) contributes to the issue. If, as a test, I set Publii to use the authenticated-read ACL for a bucket that is not public, the result is 100% failure. How about a different method to check files? Perhaps load the file to a different bucket, or allow it to be uploaded but restrict access with a bucket policy for that single file?

I can see where the ACL is set in s3.js. See file app/back-end/modules/deploy/s3.js, lines 195-201:

        let params = {
            ACL: 'authenticated-read',
            Body: fileContent,
            Bucket: this.bucket,
            Key: fileName,
            ContentType: mime.getType('json')
        };

Linus-007 avatar Dec 07 '21 07:12 Linus-007

The problem will probably be solved in v.0.41 (planned for mid-October). From this version on, Publii will use the same ACL for all files, without a predefined ACL for the files.publii.json file. Most probably the current issues are caused by the fact that the ACLs used are incompatible with some bucket policies.
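As a rough sketch of that idea (not the actual v.0.41 code): the helper name `buildUploadParams` and its `acl` argument are hypothetical, and omitting the `ACL` key entirely when no ACL is configured is an assumption. The point is only that the manifest goes through the same code path as every other file instead of getting a hard-coded `authenticated-read`.

```javascript
// Hypothetical helper: build S3 upload parameters for any file,
// including files.publii.json, using one configured ACL.
function buildUploadParams({ bucket, key, body, contentType, acl }) {
    const params = {
        Body: body,
        Bucket: bucket,
        Key: key,
        ContentType: contentType
    };

    // Assumption: when no ACL is configured, send none and let the
    // bucket policy decide who can read the object.
    if (acl) {
        params.ACL = acl;
    }

    return params;
}

// Usage: the manifest and a regular page get the same ACL treatment
const configuredAcl = 'private'; // whatever is chosen in the server configuration
const pageParams = buildUploadParams({
    bucket: 'example-bucket',
    key: '404.html',
    body: '<h1>Not found</h1>',
    contentType: 'text/html',
    acl: configuredAcl
});
const manifestParams = buildUploadParams({
    bucket: 'example-bucket',
    key: 'files.publii.json',
    body: '{}',
    contentType: 'application/json',
    acl: configuredAcl
});
const noAclParams = buildUploadParams({
    bucket: 'example-bucket',
    key: 'files.publii.json',
    body: '{}',
    contentType: 'application/json'
});
```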

dziudek avatar Sep 29 '22 15:09 dziudek