wagtail-storages
wagtail-storages copied to clipboard
Use AWS S3 with private documents in Wagtail
.. image:: https://github.com/torchbox/wagtail-storages/raw/main/logo.png
wagtail-storages
.. image:: https://img.shields.io/pypi/v/wagtail-storages.svg :target: https://pypi.org/project/wagtail-storages/ .. image:: https://img.shields.io/pypi/dm/wagtail-storages.svg :target: https://pypi.org/project/wagtail-storages/ .. image:: https://travis-ci.org/torchbox/wagtail-storages.svg?branch=master :target: https://travis-ci.org/torchbox/wagtail-storages
This package fills the missing gap in using AWS S3 together with Wagtail. This package will be useful if you want to:
-
Use AWS S3 bucket for hosting Wagtail documents.
-
Put the bucket behind the CDN so that the bucket is not called directly each time.
-
Allow editors to use privacy controls on documents, whilst using CDN.
-
Avoid time-outs because of downloads being proxied through Wagtail views.
Note: you cannot use the document
redirect view__ if you want your documents to be truly private.
.. WagtailRedirectView: https://docs.wagtail.io/en/stable/advanced_topics/settings.html#wagtaildocs-serve-method __ WagtailRedirectView
What does it do?
The package is a collection of signal handlers and Wagtail hooks.
- Sets per-object ACLs on S3 whenever privacy settings change on a Wagtail document.
- Replaces the current document view with a redirect. Either to a signed S3 bucket URL for private documents or public custom domain URL for public ones.
- Purges CDN for documents that have changed.
Requirements
django-storageswith theS3Boto3Storagestorage backend configured in a Wagtail project.- CDN supported by Wagtail front-end cache invalidator.
Management commands
django-admin fix_document_acls
The package provider a management command that sets all the documents' ACLs
according to the their collection permissions. This must be called if there had
been documents in a bucket before the package was used to make sure the ACLs in
the bucket are correct.
Settings
--------
WAGTAIL_STORAGES_DOCUMENTS_FRONTENDCACHE
Using the same format as Wagtail's WAGTAILFRONTENDCACHE setting, but to be
only used by the wagtail-storages to purge the documents. If not set, the purge
won't happen. Read more on how to format it in the Wagtail docs <https://docs.wagtail.io/en/stable/reference/contrib/frontendcache.html>_,
e.g.
.. code:: python
WAGTAIL_STORAGES_DOCUMENTS_FRONTENDCACHE = { 'cloudfront': { 'BACKEND': 'wagtail.contrib.frontend_cache.backends.CloudfrontBackend', 'DISTRIBUTION_ID': 'your-distribution-id', }, }
WAGTAIL_STORAGES_DOCUMENT_HOOK_ORDER
Set a custom order for the document hook order. It's set to 100 by default.
It's important that it runs after any of your hooks since it returns a
response, e.g.
.. code:: python
WAGTAIL_STORAGES_DOCUMENT_HOOK_ORDER = 900
Recommended S3 setup with Wagtail
---------------------------------
The following guide explains the recommended setup for using S3 with Wagtail.
This guide assumes that:
* You serve your main website at ``llamasavers.com`` (replace
``llamasavers.com`` with your actual domain name).
* Your S3 bucket is called ``media.llamasavers.com`` and you host it from that
domain name.
* You are using CDN on that domain, this guide will assume Cloudflare.
Set up S3 bucket
~~~~~~~~~~~~~~~~
First, set up your S3 bucket. It must be configured to:
- Have a name that matches the domain name, e.g. ``media.llamasavers.com``.
- Allow the user to perform the following actions on the bucket:
- ``s3:ListBucket``
- ``s3:GetBucketLocation``
- ``s3:ListBucketMultipartUploads``
- ``s3:ListBucketVersions``
- Allow the user to perform all the actions (``s3:*``) on the objects within the
bucket.
- Allow the internet traffic to access Wagtail image renditions (``images/*``).
- Allow use of public ACLs, by disabling:
- "Block public access to buckets and objects granted through new access control lists (ACLs)"
- "Block public access to buckets and objects granted through any access control lists (ACLs)"
The user permissions can be set in the IAM or via a bucket policy. See example
of all of those points being achieved in the bucket policy below.
.. code:: json
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "PublicGetObject",
"Effect": "Allow",
"Principal": "*",
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::[BUCKET NAME]/images/*"
},
{
"Sid": "AllowUserManageBucket",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::[USER ARN]"
},
"Action": [
"s3:ListBucket",
"s3:GetBucketLocation",
"s3:ListBucketMultipartUploads",
"s3:ListBucketVersions"
],
"Resource": "arn:aws:s3:::[BUCKET NAME]"
},
{
"Sid": "AllowUserManageBucketObjects",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::[USER ARN]"
},
"Action": "s3:*",
"Resource": "arn:aws:s3:::[BUCKET NAME]/*"
}
]
}
After the S3 bucket is set up on AWS, you can configure the Wagtail project to
use it.
Set up django-storages
~~~~~~~~~~~~~~~~~~~~~~
Install ``django-storages`` and ``boto3``.
.. code:: sh
pip install django-storages[boto3]
Set up your S3 bucket with ``django-storages``. The following code allows
configuration via environment variables.
.. code:: python
# settings.py
import os
if "AWS_STORAGE_BUCKET_NAME" in os.environ:
# Add django-storages to the installed apps
INSTALLED_APPS = INSTALLED_APPS + ["storages"]
# https://docs.djangoproject.com/en/stable/ref/settings/#default-file-storage
DEFAULT_FILE_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"
AWS_STORAGE_BUCKET_NAME = os.environ["AWS_STORAGE_BUCKET_NAME"]
# Disables signing of the S3 objects' URLs. When set to True it
# will append authorization querystring to each URL.
AWS_QUERYSTRING_AUTH = False
# Do not allow overriding files on S3 as per Wagtail docs recommendation:
# https://docs.wagtail.io/en/stable/advanced_topics/deploying.html#cloud-storage
# Not having this setting may have consequences such as losing files.
AWS_S3_FILE_OVERWRITE = False
# Default ACL for new files should be "private" - not accessible to the
# public. Images should be made available to public via the bucket policy,
# where the documents should use wagtail-storages.
AWS_DEFAULT_ACL = "private"
# We generally use this setting in production to put the S3 bucket
# behind a CDN using a custom domain, e.g. media.llamasavers.com.
# https://django-storages.readthedocs.io/en/latest/backends/amazon-S3.html#cloudfront
if "AWS_S3_CUSTOM_DOMAIN" in os.environ:
AWS_S3_CUSTOM_DOMAIN = os.environ["AWS_S3_CUSTOM_DOMAIN"]
# When signing URLs is enabled, the region must be set.
# The global S3 endpoint does not seem to support signed URLS.
# Set this only if you will be using signed URLs.
if "AWS_S3_REGION_NAME" in os.environ:
AWS_S3_REGION_NAME = os.environ["AWS_S3_REGION_NAME"]
# This settings lets you force using http or https protocol when generating
# the URLs to the files. Set https as default.
# https://github.com/jschneier/django-storages/blob/10d1929de5e0318dbd63d715db4bebc9a42257b5/storages/backends/s3boto3.py#L217
AWS_S3_URL_PROTOCOL = os.environ.get("AWS_S3_URL_PROTOCOL", "https:")
If you use the above snippet, you can set the following environment variables:
* ``AWS_STORAGE_BUCKET_NAME`` - set to ``media.llamasavers.com``.
* ``AWS_S3_CUSTOM_DOMAIN`` - set to ``media.llamasavers.com``.
* ``AWS_S3_REGION_NAME`` - set to your AWS region name, e.g. ``eu-west-2``.
You can use one of the methods to provide `boto3 with credentials`__. We
suggest you stick with the environment variables. To do that, you need to set
the following variables:
* ``AWS_ACCESS_KEY_ID``
* ``AWS_SECRET_ACCESS_KEY``
.. _Boto3Credentials: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html
__ Boto3Credentials_
Now the storage should be configured and working. Editors should be able to
upload images and documents in Wagtail admin.
Set up ``wagtail-storages``
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Install ``wagtail-storages`` itself.
.. code:: sh
pip install wagtail-storages
Add ``wagtail_storages`` to your ``INSTALLED_APPS`` in your settings file.
.. code:: python
# settings.py
INSTALLED_APPS = [
# ... Other apps
"wagtail_storages.apps.WagtailStoragesConfig",
# ... Other apps
]
With that, ACLs should be updated if documents are moved to
private collections.
If you already have files in your S3 bucket, run ``django-admin
fix_document_acls`` to make sure all documents have the right ACLs set up.
Set up front-end cache invalidation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If edge cache is set up on the custom domain (``media.llamasavers.com``) you
should set up the CDN purging to avoid having outdated or private documents
available to users via the CDN endpoint. For example, for Cloudflare you want
to use a configuration similar to the one below:
.. code:: python
# settings.py
import os
if "S3_CACHE_CLOUDFLARE_TOKEN" in os.environ:
WAGTAIL_STORAGES_DOCUMENTS_FRONTENDCACHE = {
"default": {
"BACKEND": "wagtail.contrib.frontend_cache.backends.CloudflareBackend",
"EMAIL": os.environ["S3_CACHE_CLOUDFLARE_EMAIL"],
"TOKEN": os.environ["S3_CACHE_CLOUDFLARE_TOKEN"],
"ZONEID": os.environ["S3_CACHE_CLOUDFLARE_ZONEID"],
},
}
Then set the following environment variables:
* ``S3_CACHE_CLOUDFLARE_EMAIL``
* ``S3_CACHE_CLOUDFLARE_TOKEN``
* ``S3_CACHE_CLOUDFLARE_ZONEID``
Once set up, the documents will be purged from cache when they are
modified or their privacy settings have changed.
The setting follows configuration format of the front-end cache invalidator
configuration in Wagtail. See the details `here`__. The only difference is
the setting name, which for wagtail-storages is
``WAGTAIL_STORAGES_DOCUMENTS_FRONTENDCACHE``.
.. _WagtailFrontEndCache: https://docs.wagtail.io/en/stable/reference/contrib/frontendcache.html
__ WagtailFrontEndCache_
All done!