Add anonymous email relay

Open nicholaschiang opened this issue 5 years ago • 3 comments

Is your feature request related to a problem? Please describe.

Students and tutors have no way to communicate directly with each other. We have yet to build any sort of in-app chat, so the only option available to them right now is the Bramble link that we send out once the request has received parental approval.

Depending 100% on Bramble (and forcing all of our matches to use the one platform) is far from ideal. I'm guessing that, in reality, the Bramble room would only be used for one purpose: to exchange direct contact information in order to set up a meeting using better-suited video conferencing software (e.g. Zoom, Google Meet, or whatever works best for them).

Describe the solution you'd like

Instead of forcing our users to work around our system (and thus relinquish the privacy they get from not exchanging direct contact information), we should implement a system like Craigslist's or eBay's buyer-to-seller email back-and-forth.

It'll be completely anonymous and each attendee in a request and/or appointment will have their own unique anonymous TB email address (e.g. [email protected]). The tutor and student (and parents) can then email back-and-forth while not divulging any direct contact information.

And we'll be able to intercept all of those email communications with our SMTP server; those emails can then be viewed by organization admins and/or the users themselves in a chat history of sorts.

Describe alternatives you've considered

We could build our own in-app chat (which is something we'll probably want to look into in the future), but for now, email is well-known and widely used (and provides a good archive).

We'll definitely want to build our own in-app video conferencing solution in the future as well, so this might be disabled or restricted (so as to force users to use the in-app video conferencing that we build). But for now, this seems like the best and easiest-to-implement solution. We'll be the Craigslist of tutoring/mentoring!

Additional context

I've been looking into pre-built solutions like:

  • Mailcare (https://gitlab.com/mailcare/mailcare)
  • ForwardEmail (https://github.com/forwardemail/free-email-forwarding)
  • AnonAddy (https://github.com/anonaddy/anonaddy)

But, I really want to go serverless (and I definitely don't want to have to manage my own email servers), so we'll probably want to follow this tutorial and use the AWS SES + Lambda setup they describe:

Diagram of the AWS email relay setup

  1. The company's mail server will perform an MX record lookup for the project.com domain.
  2. The company's mail server will transmit the verification email to the Amazon SES SMTP server specified in the MX record.
  3. Amazon SES will store the email in MIME format in an S3 bucket.
  4. SES will invoke a Lambda function.
  5. The Lambda function will begin executing and read in the MIME email from the S3 bucket.
  6. After processing the email the Lambda function will pass the email to SES to send.
  7. SES will perform an MX record lookup for yourdomain.com and then transmit the email to the specified mail server. At this point you can now retrieve the email from the [email protected] mailbox using your preferred mail client.
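Steps 4 and 5 above (SES invokes the Lambda function, which then reads the stored MIME email from S3) could start out something like the sketch below. It only covers turning the SES event into S3 object keys; the actual S3 read and SES send are left out. The `inboundEmails` name, the `keyPrefix` parameter, and the assumption that the receipt rule stores each message under `<prefix><messageId>` are mine, not from the tutorial.

```typescript
// Minimal slice of the SES receipt event that Lambda receives; only the
// fields used below are declared here.
interface SesRecord {
  ses: {
    mail: { messageId: string; source: string };
    receipt: { recipients: string[] };
  };
}

interface SesEvent {
  Records: SesRecord[];
}

// Hypothetical shape describing where to find each stored email.
interface InboundEmail {
  s3Key: string; // assumed to be `<prefix><messageId>`
  source: string; // envelope sender
  recipients: string[]; // envelope recipients
}

// Maps each SES record to the S3 object that should hold its raw MIME email.
function inboundEmails(event: SesEvent, keyPrefix = ''): InboundEmail[] {
  return event.Records.map(({ ses }) => ({
    s3Key: `${keyPrefix}${ses.mail.messageId}`,
    source: ses.mail.source,
    recipients: ses.receipt.recipients,
  }));
}
```

Keeping this step free of AWS SDK calls means it can be tested with a plain object instead of a real SES event.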

nicholaschiang avatar Jul 06 '20 22:07 nicholaschiang

Ok, our current setup (implemented in aws/src/index.ts):

  1. Intercepts incoming emails (to @mail.tutorbook.org addresses) with an AWS Lambda function.
  2. Replaces the From and Reply-To headers with the user's anonymous email address (their <uid>@mail.tutorbook.org address). Note that this fails if the sender doesn't have an account on Tutorbook.
  3. Replaces the To and the email's recipients with their actual email addresses. Note that this fails if the email recipients don't have accounts on Tutorbook (i.e. the <uid> in the anonymized email doesn't exist).
  4. Forwards the now-transformed email to its intended recipients using SES.sendRawEmail.
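As a rough illustration of steps 2 and 3 (not the actual code in aws/src/index.ts), the address swap can be written as a pure function. The `User` and `Envelope` types, the `relayEnvelope` name, and the in-memory `users` array (standing in for a database lookup) are all hypothetical.

```typescript
// Hypothetical types; the real user records live in our database.
interface User {
  id: string; // Tutorbook uid
  email: string; // real email address
}

interface Envelope {
  from: string;
  replyTo: string;
  to: string[];
}

const RELAY_DOMAIN = 'mail.tutorbook.org';

// Swaps the sender's real address for their anonymous one and swaps each
// anonymous recipient for their real address. Throws when either side does
// not map to a Tutorbook account (the failure cases noted in steps 2 and 3).
function relayEnvelope(sender: string, recipients: string[], users: User[]): Envelope {
  const from = users.find((u) => u.email.toLowerCase() === sender.toLowerCase());
  if (!from) throw new Error(`Sender ${sender} has no Tutorbook account`);
  const anonymous = `${from.id}@${RELAY_DOMAIN}`;

  const to = recipients.map((address) => {
    const [uid, domain] = address.split('@');
    const isRelay = domain !== undefined && domain.toLowerCase() === RELAY_DOMAIN;
    const user = isRelay ? users.find((u) => u.id === uid) : undefined;
    if (!user) throw new Error(`Recipient ${address} does not map to a Tutorbook account`);
    return user.email;
  });

  return { from: anonymous, replyTo: anonymous, to };
}
```

For example, with a tutor `u1` and a student `u2`, an email from the tutor's real address to `u2@mail.tutorbook.org` comes out with `u1@mail.tutorbook.org` in From and Reply-To and the student's real address in To.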

But, there are some key limitations to that strategy:

  • This fails to anonymize anyone who doesn't already have an account on Tutorbook.
  • This does not associate an email thread with a request or an appointment. Ideally, it would associate the email thread with a request so that we could show a "communications history" for each request.
  • Those emails are no longer 100% anonymous (as you're able to associate one user with multiple lesson requests because their "anonymous" email address never changes; Craigslist addresses this by creating a new anonymous email for every sale).
  • This system doesn't account for all the possible email header combinations (see the table below).

Incoming emails:

| Header   | Possibilities   |
| -------- | --------------- |
| From     | real            |
| Reply-To | real            |
| To       | real, anonymous |
| BCC      | real, anonymous |
| CC       | real, anonymous |

To address those limitations, I propose this setup:

  1. Anonymous email addresses are formatted like <uid>-<appt/request id>@mail.tutorbook.org (i.e. each address is unique to a request or an appointment). Note that this still does not completely address the anonymity issue (but I don't think that's a big concern as we're not trying to be 100% anonymous anyway; we just don't want to share direct contact information).
  2. Convert all real addresses (that could be included in any header; see table above) to anonymous addresses (creating new users and adding them as attendees to the appointment as necessary).
  3. For each anonymous destination/recipient (e.g. To, BCC, CC), convert the anonymous address back to the original (real) address and forward the email to it.
  4. Add the email data to the appointment's emails subcollection (to track communications).
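The address format from step 1 might be handled with a pair of helpers like the ones below. This is only a sketch: the function names are hypothetical, and it assumes that neither the uid nor the request/appointment ID contains a hyphen (so the first hyphen always separates the two) and that both are purely alphanumeric.

```typescript
const RELAY_DOMAIN = 'mail.tutorbook.org';

// Hypothetical shape for a decoded anonymous address.
interface RelayAddress {
  uid: string;
  contextId: string; // request or appointment ID
}

// Builds `<uid>-<appt/request id>@mail.tutorbook.org`.
function formatRelayAddress({ uid, contextId }: RelayAddress): string {
  return `${uid}-${contextId}@${RELAY_DOMAIN}`;
}

// Returns null for anything that is not one of our anonymous addresses
// (e.g. a real address that still needs to be anonymized).
function parseRelayAddress(address: string): RelayAddress | null {
  const match = /^([A-Za-z0-9]+)-([A-Za-z0-9]+)@([^@\s]+)$/.exec(address.trim());
  if (!match) return null;
  const uid = match[1];
  const contextId = match[2];
  const domain = match[3];
  if (!uid || !contextId || !domain) return null;
  // Only the domain is compared case-insensitively; the IDs are kept as-is
  // because they may be case-sensitive.
  if (domain.toLowerCase() !== RELAY_DOMAIN) return null;
  return { uid, contextId };
}
```

A `null` result from `parseRelayAddress` is a convenient way to tell the "real" and "anonymous" cases from the header table apart in step 2.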

nicholaschiang avatar Jul 09 '20 17:07 nicholaschiang

Another issue I forgot to mention above is that AWS Lambda functions can be invoked multiple times; the queue is eventually consistent (see their docs for more info):

Even if your function doesn't return an error, it's possible for it to receive the same event from Lambda multiple times because the queue itself is eventually consistent. If the function can't keep up with incoming events, events might also be deleted from the queue without being sent to the function. Ensure that your function code gracefully handles duplicate events, and that you have enough concurrency available to handle all invocations.

Thus, we have to store the emails in some database somewhere (i.e. the appointment's Firestore emails collection) so as to prevent sending duplicate emails.
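A minimal sketch of that duplicate check, keyed by the email's message ID. The `SentLog` interface and `relayOnce` function are hypothetical, and the in-memory `Set` only stands in for the Firestore emails collection; a real store would be asynchronous and would need the "claim" to be atomic (e.g. a create that fails if the document already exists). Everything is synchronous here for brevity.

```typescript
// Hypothetical record of which emails have already been relayed.
interface SentLog {
  // Records the ID and returns true only the first time it is seen.
  claim(messageId: string): boolean;
  // Forgets the ID so that a later retry is allowed to resend.
  release(messageId: string): void;
}

// In-memory stand-in for a database-backed log.
class InMemorySentLog implements SentLog {
  private readonly seen = new Set<string>();

  claim(messageId: string): boolean {
    if (this.seen.has(messageId)) return false;
    this.seen.add(messageId);
    return true;
  }

  release(messageId: string): void {
    this.seen.delete(messageId);
  }
}

// Sends the email at most once per message ID. Returns false when the
// invocation was a duplicate and nothing was sent.
function relayOnce(messageId: string, log: SentLog, send: () => void): boolean {
  if (!log.claim(messageId)) return false;
  try {
    send();
    return true;
  } catch (error) {
    log.release(messageId); // sending failed, so let a retry try again
    throw error;
  }
}
```

Claiming before sending means a crash between the two steps could drop an email; claiming after sending risks a duplicate instead. Which trade-off is better for us is an open question.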

nicholaschiang avatar Jul 09 '20 17:07 nicholaschiang

Reopening this (it was temporarily taken out of production due to email relay flakiness) because I want to:

  • [ ] Refactor aws/src/index.ts such that each function is unit-testable (e.g. separate email parsing from re-sending).
  • [ ] Add unit tests to our Cypress test suite (probably under the cypress/tests/api directory, but that could be subject to change).
  • [ ] Add some sort of integration tests using an AWS local emulator and stubbed network requests/responses (this might not be worth the effort, but if there's some easy way to test SMTP signals, this'll be worth a shot).
  • [ ] Add a continuous monitor on our production deployment (similar to the website monitor badge currently displayed on the README) to ensure that the email relay is always working smoothly.
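To give an idea of what "separate email parsing from re-sending" could look like, here is a sketch of a pure header-parsing function that needs no AWS access to test. The `parseHeaders` name is hypothetical, and the parsing is deliberately simplified (it unfolds continuation lines into a single space and does not decode encoded words), so a proper MIME parser would likely be the better choice in production.

```typescript
// Parses the header block of a raw MIME email into a map of lowercased
// header names to their values (a header may appear more than once).
function parseHeaders(raw: string): Map<string, string[]> {
  // The header block ends at the first blank line.
  const headerBlock = raw.split(/\r?\n\r?\n/, 1)[0] ?? '';
  // Folded headers continue on lines that start with a space or a tab.
  const unfolded = headerBlock.replace(/\r?\n[ \t]+/g, ' ');
  const headers = new Map<string, string[]>();
  for (const line of unfolded.split(/\r?\n/)) {
    const colon = line.indexOf(':');
    if (colon <= 0) continue; // not a `Name: value` line
    const name = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim();
    headers.set(name, [...(headers.get(name) ?? []), value]);
  }
  return headers;
}
```

A unit test can then feed in a raw string and assert on the parsed From, To, CC and BCC values without ever touching S3 or SES.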

All of this is just aimed at making our email relay system 100% reliable (i.e. like Craigslist's).

nicholaschiang avatar Sep 16 '20 22:09 nicholaschiang