alaveteli
Make decoded but unredacted version of incoming email available via admin interface
Sometimes administrators need to access the unredacted plain text of an incoming email body, for example to provide users with access to email addresses or phone numbers which have been removed by Alaveteli.
Raw emails, which are available via the admin interface, sometimes contain the body of the email in Base64 rather than plain text.
I can't imagine why admins would need "Cached main body text folded" or "Cached main body text unfolded" in relation to incoming messages. Perhaps one of these could be replaced with "main body text unredacted"?
See also #10 which is related.
This is something we have to do pretty often for various reasons; predominantly to release contact details at the user's request, but also for other reasons. It is a bar to new administrators being able to undertake the work needed and so it would be great if this could be made easier.
Context
Admins often want to view the contents of an incoming message without any masking or censor rules applied.
In some cases you can inspect the raw email, but it's often base64 encoded, which means needing to download the email and open it in a mail client. Even when the email is not base64 encoded, it's often difficult to mentally parse the information out when there are lots of headers, attachment sections and HTML body content.
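To illustrate the manual decoding step described above, here is a minimal Ruby sketch. It assumes a simple single-part message with a UTF-8 body. `decode_body` is a hypothetical helper, not an Alaveteli method, and real emails (multipart, other charsets, attachments) need a proper MIME parser.

```ruby
# Hypothetical helper, not part of Alaveteli: decode the body of a simple,
# single-part raw email. Assumes the decoded body is UTF-8 text.
def decode_body(raw_email)
  # Headers and body are separated by the first blank line.
  headers, body = raw_email.split(/\r?\n\r?\n/, 2)
  body ||= ''

  if headers =~ /^Content-Transfer-Encoding:\s*base64\s*$/i
    # 'm' is the Base64 directive for String#unpack1.
    body.unpack1('m').force_encoding('UTF-8')
  else
    body
  end
end

# Build an example message with a Base64-encoded body.
encoded = ['Contact: officer@example.org'].pack('m')

raw = <<~EMAIL
  From: foi@example.org
  Subject: Response to your request
  Content-Type: text/plain; charset=UTF-8
  Content-Transfer-Encoding: base64

  #{encoded}
EMAIL

puts decode_body(raw)
```

Even this toy version shows why reading the raw email directly is awkward: the contact details only become legible after the transfer encoding has been undone.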
To fix this we'll render the main body text of an incoming message into the admin interface, but without any redactions applied.
UI
We can just render the unredacted body out under the attachments listing on the incoming message admin page. Ideally we'll allow folding/unfolding of quoted sections, but not essential if it adds a load of complexity.
Mechanics
We'll want to use cached content to avoid fetching the raw email from Active Storage, but there's a huge gotcha in that we currently cache the main body text with masks.
Given we apply censor rules on the fly, I don't see why we don't also do that with the text masks (which are effectively hard-coded censor rules anyway).
Of course, we'd then need to ensure that the masks get applied where we do show text publicly. It looks like we do call the cached_main_body_text_* methods in a few places, so they'll need to be updated to call an alternative method.
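The idea of caching unmasked text and applying masks at read time could look roughly like the sketch below. Every name here (`IncomingMessageSketch`, `unredacted_body_text`, `public_body_text`, `EMAIL_MASK`) is hypothetical and only illustrates the shape of the change. It is not Alaveteli's actual API, and the email pattern is a simplified stand-in for the real text masks.

```ruby
# Hypothetical sketch: none of these names exist in Alaveteli.
class IncomingMessageSketch
  # Simplified stand-in for the hard-coded text masks.
  EMAIL_MASK = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/

  # censor_rules is assumed to be a list of [pattern, replacement] pairs.
  def initialize(cached_unmasked_text, censor_rules: [])
    @cached_unmasked_text = cached_unmasked_text
    @censor_rules = censor_rules
  end

  # Admin-only view: the cached text exactly as parsed.
  def unredacted_body_text
    @cached_unmasked_text
  end

  # Public view: masks first, then censor rules, both applied on the fly.
  def public_body_text
    masked = @cached_unmasked_text.gsub(EMAIL_MASK, '[email address]')
    @censor_rules.reduce(masked) do |text, (pattern, replacement)|
      text.gsub(pattern, replacement)
    end
  end
end

message = IncomingMessageSketch.new(
  'Call 01234 567890 or email jane@example.org',
  censor_rules: [[/01234 567890/, '[phone number]']]
)

puts message.public_body_text
puts message.unredacted_body_text
```

With this split, public callers would switch from the cached methods to something like `public_body_text`, while the admin page reads the unmasked cache directly.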
Other considerations
Re-caching all incoming messages doesn't seem like a useful idea, so I think we can take the position that this will only become possible for new messages, or admins can ask for developer assistance to re-parse old messages on request. Running apply_masks to cached masked content shouldn't be problematic.
I don't think this poses any data retention policy issues, since we already hold and render the unredacted data in the raw email. We will want to make this change clear in the upgrade notes though.
base64 encoded, which means needing to download the email and open it in a mail client.
Not important, but just to note that mySociety maintains a Base64 decoder service for admins to use for manually decoding emails: https://wdtkwiki.mysociety.org/wiki/Support_mailbox#Respond_to_requests_for_email_addresses_in_responses
We'll want to use cached content to avoid fetching the raw email from Active Storage, but there's a huge gotcha in that we currently cache the main body text with masks.
We now don't store unredacted/unmasked attachments, so the only way to retrieve this would be to download the raw email from ActiveStorage. We have FoiAttachment#unmasked_body, which does this for us, but we would need to look at possibly re-caching the raw email locally (possibly only temporarily) if we're expecting admins to view the content of incoming messages from the admin UI more regularly.
Currently, for the Pro cases I deal with, I download the raw email to disk and open it in Apple's Mail.app.
In trying to figure out how to describe a version of this that doesn't result in frequent pulls from S3 I ended up having a brainwave that we could do https://github.com/mysociety/alaveteli/pull/7999.