letters icon indicating copy to clipboard operation
letters copied to clipboard

Letters, or how to parse emails in Go

Letters, or how to parse emails in Go

Test Go Report Card

Letters is a minimalistic Golang library for parsing plaintext and MIME emails.

It correctly handles text and MIME mime-types, Base64 and Quoted-Printable Content-Transfer-Encoding, as well as any text encoding that Golang standard library is capable of handling. Letters will parse an email into a simple struct with standard headers and text, enriched text, and HTML content, and decode inline and attached files.

Quickstart

Install

go get github.com/mnako/[email protected]

Parse a raw email from a Reader:

email, err := letters.ParseEmail(r)
if err != nil {
    log.Fatal(err)
}

and you can access the common headers:

email.Headers.Sender
// mail.Address{Name: "Alice Sender", Address: "[email protected]"}

email.Headers.From
// []mail.Address{
//  {Name: "Alice Sender", Address: "[email protected]"}, 
//  {Name: "Alice Sender", Address: "[email protected]"},
// }

email.Headers.Subject
// "???ァ Test English Pangrams"

email.Headers.To
// []mail.Address{
//  {Name: "Bob Recipient", Address: "[email protected]"}, 
//  {Name: "Carol Recipient", Address: "[email protected]"},
// }

email.Headers.Cc
// []mail.Address{
//  {Name: "Dan Recipient", Address: "[email protected]"}, 
//  {Name: "Eve Recipient", Address: "[email protected]"},
// }

email.Headers.Bcc
// []mail.Address{
//  {Name: "Frank Recipient", Address: "[email protected]"}, 
//  {Name: "Grace Recipient", Address: "[email protected]"},
// }

get custom headers:

email.Headers.ExtraHeaders
// map[string][]string{
//    "X-Clacks-Overhead": {"GNU Terry Pratchett"},
// }

get decoded bodies:

email.Text
// "The quick brown fox jumps over a lazy dog..."

email.HTML
// "<html><div dir="ltr"><p>The quick brown fox jumps over a lazy dog..."

inline files:

email.InlineFiles
// []InlineFile{
//    {
//        ContentType: ContentTypeHeader{
//            ContentType: "image/jpeg",
//            Params: map[string]string{
//                "name": "inline-jpg-image-without-disposition.jpg",
//            },
//        },
//        ContentDisposition: ContentDispositionHeader{
//            ContentDisposition: "",
//            Params:             map[string]string(nil),
//        },
//        Data: []byte{255, ...},
//    },
//    {
//        ContentID: "[email protected]",
//        ContentType: ContentTypeHeader{
//            ContentType: "image/jpeg",
//            Params: map[string]string{
//                "name": "inline-jpg-image-name.jpg",
//            },
//        },
//        ContentDisposition: ContentDispositionHeader{
//            ContentDisposition: inline,
//            Params: map[string]string{
//                "filename": "inline-jpg-image-filename.jpg",
//            },
//        },
//        Data: []byte{255, ...},
//    },
// }

and attached files:

email.AttachedFiles
// []AttachedFile{
//    {
//        ContentType: ContentTypeHeader{
//            ContentType: "application/pdf",
//            Params: map[string]string{
//                "name": "attached-pdf-name.pdf",
//            },
//        },
//        ContentDisposition: ContentDispositionHeader{
//            ContentDisposition: attachment,
//            Params: map[string]string{
//                "filename": "attached-pdf-filename.pdf",
//            },
//        },
//        Data: []byte{37, ...},
//    },
//    {
//        ContentType: ContentTypeHeader{
//            ContentType: "application/pdf",
//            Params: map[string]string{
//                "name": "attached-pdf-without-disposition.pdf",
//            },
//        },
//        ContentDisposition: ContentDispositionHeader{
//            ContentDisposition: "",
//            Params:             map[string]string(nil),
//        },
//        Data: []byte{37, ...},
// }

The same parser and methods will work for other languages, text encodings, and transfer-encodings:

r := strings.NewReader(```Subject: =?ISO-2022-JP?Q?=1B=24=42=24=24=24=6D=24=4F=32=4E=1B=28=42?=
Content-Type: text/plain; charset=ISO-2022-JP


=1B$B?'$OFw$($I=1B(B
=1B$B;6$j$L$k$r=1B(B```)

email, _ := letters.ParseEmail(r)

email.Headers.Subject
// "????????ッ?ュ?"

email.Text
// "??イ??ッ????????ゥ??」?????ャ??????..."

Current Scope and Features

  • Parsing plaintext emails and recursively traversing multipart (multipart/alternative, multipart/mixed, multipart/parallel, multipart/related, multipart/signed) emails
  • Unfolding headers
  • Decoding non-US-ASCII email headers according to RFC 2047
  • Decoding Base64 and Quoted-Printable Content-Transfer-Encodings
  • Decoding any text encoding (e.g. UTF-8, Chinese GB18030 or GBK, Finnish ISO-8859-15, Icelandic ISO-8859-1, Japanese ISO-2022-JP, Korean EUC-KR, Polish ISO-8859-2) in combination with any Transfer Encoding (e.g. ASCII-over-7bit, UTF-8-over-Base64, Japanese ISO-2022-JP-over-7bit, Polish ISO-8859-2-over-Quoted-Printable, etc.)
  • Easy access to text, enriched text and HTML content of the email
  • Easy access to inline attachments
  • Easy access to attached files

All of that and more in a minimal Golang library with realistic email examples and thorough test coverage.

Current Limitations

  • S/MIME multipart/signed email are limited to clear-signed messages
  • The decryption and signature verification and any other cryptography-related tasks need to be performed outside of letters.

Current Status

Feature-complete and tests passing.

Currently, gathering feedback and refactoring code before releasing v1.0.0. Fields and API are still subject to change.