Sending Email with Oracle, Part 6: MIME Content

Sending Email with Oracle

At this point in the series you should be able to send simple, plain text email with UTL_SMTP. Often though, we’ll want to send emails with special formatting – different fonts, highlighting, tables, images, etc. We’ll want to send attachments too.

It surprises many developers new to the UTL_SMTP package to find there is no API for formatting or attachments. It comes as an even greater surprise to them to find that SMTP itself, the underlying protocol for email doesn’t really support these features either!

Obviously though, it is possible to format email and attach files. We’ve all sent and received such emails. Furthermore, we’ve seen both features having limited support in the prior articles about APEX_MAIL and UTL_MAIL. So, what is the secret sauce? The answer is MIME (Multipurpose Internet Mail Extensions.)

We get some hints of the MIME usage by looking at the APEX_MAIL and UTL_MAIL routines. We can see they both have parameters for defining mime type for attachments. UTL_MAIL will also let you define mime types for the email body. APEX_MAIL does not let you directly specify the mime type for the body but it allows you to send bodies of text, html, or both. Thus allowing your email to be rendered with alternate types depending on the the client of the recipient. We’ll explore the option for sending alternative content in a later article.

MIME Blocks

The core of using MIME in email is the construction of blocks. Each block consists of a header describing the content of the block and then the content itself. The header defines the content type. Depending on the type, there may be additional fields such as character sets for text, file names for attachments, or encoding for multi-byte characters or binary content. If the content has multiple parts (such as text and attached images) then each block is delimited, with the boundary delimiter defined as part of the header.

The boundary delimiters have few rules but they must be unique within the context of an email body. They should be composed of 7-bit ASCII characters, not including control characters. While it is not strictly required, they are often limited to letters, digits, dashes (-) and equals signs (=). The dashes in particular are recommended because they are guaranteed to not be part of Base64-encoded values. Each time you begin a block you prefix the delimiter with two dashes (--). When you finish the last block within a particular scope, you suffix the delimiter with two dashes as well. The general pattern looking like this:

--boundary-delimter
1st mime block
--boundary-delimter
2nd mime block
--boundary-delimter
...
--boundary-delimter
last mime block
--boundary-delimter--

Examples of these delimiter usages are illustrated below in the sections on multipart content types. It is important to note the MIME boundary and header lines, like the SMTP fields must be terminated with CRLF, that is carriage-return and line-feed characters (ASCII characters 13 and 10 respectively.)

I generally keep the MIME construction separate from content of the mime blocks. That is, I usually pre-build the content within the blocks prior to sending the email; but write the MIME structure itself as I send the email. This is entirely a personal preference though as a mnemonic for myself. If you want to construct the entire email body as a CLOB value with both the MIME structure as well as the content, that is entirely permissible and may be a more beneficial to your way of thinking and designing your code.

Declaring MIME content

If you are going to use MIME you need to write that into the email header. First you declare the intention to use MIME by writing the version (which will always be 1.0) and then the type of content you will be including in the email body. For example…

MIME-Version: 1.0
Content-Type: text/html

Which you would create with…

UTL_SMTP.write_data(v_connection, 'MIME-Version: 1.0' || UTL_TCP.crlf);
UTL_SMTP.write_data(v_connection, 'Content-Type: text/html' || UTL_TCP.crlf);

Here are a few of the most common MIME content declarations:

Content-Type: text/plain

This is the default content type and as the name implies, is used for simple text-only messages. While it is the default, it is considered better practice to declare the content type anyway.

MIME-Version: 1.0
Content-Type: text/plain;

or, to be even more explicit, include the character set as well

MIME-Version: 1.0
Content-Type: text/plain;charset=us-ascii

If no character set is defined, us-ascii is the default value. For other character sets, in particular multi-byte character sets, you will need to specify the charset value and then encode the content. The most common method is base64 which encodes multiple bytes into values that will be supported by any valid SMTP version. For example:

MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64

my sample text

Content-Type: text/html

This type means the block’s contents are in HTML. That may be a complete document or (more commonly) an HTML fragment. This is how email messages formatted with different fonts, alignment, tables, etc. are commonly sent. A fuller explanation of this will be explored in a later article.

MIME-Version: 1.0
Content-Type: text/html;

<html>    
    <body>
        <h1>my sample html header</h1>
    </body>
</html>

Content-Type: image/jpeg (image/png, image/gif, etc.)

These image/* types indicate the content of the block is an image of some kind. The image may be rendered and visible as part of the email content or it might be an attachment the recipient will need to extract in order to view. In order to ensure the file integrity the bytes of the file need to be encoded for SMTP transmission. As with multi-byte characters, the most common encoding method is base64. Such a MIME block might look something like this:

MIME-Version: 1.0
Content-Type: image/jpeg
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="smiley.jpg"

/9j/4AAQSkZJRgABAQEAYABgAAD/2wBDAAIBAQIBAQICAgICAgICAwUDAwMDAwYE
.
.  {content abbreviated}
.
baoX3hbS9T1W3v7nTbC4vrTHkXEtujyw4zjaxGVxk9D3NX6AP//Z

Also note, images are not the only media forms supported by MIME, video, audio, and application files are also sent using the same construction methods. This block type also introduces the Content-Disposition field to the block header. The choices are inline or attachment. As the values imply, the inline value tells the receiving client to render the content along with the rest of the email body. The exact form this may take is up to the client however. For images that should appear in specific places it is recommended to send them as attachments and use HTML formatting to place them as needed.

Examples, with PL/SQL code, of sending images will be in later articles in this series.

Content-Type: multipart/mixed

This type means there will be more than one MIME block in the body and the types of each can vary. For example, you would use this when sending a text message (plain or html-formatted) with a file attachment. One part would be the message and the second part would be the attachment. As mentioned above, when sending a multipart message a boundary must be defined and each block is delimited by that boundary. The header and body text would look something like this:

MIME-Version: 1.0
Content-Type:multipart/mixed; boundary="YQ2ATZ6TJHAYDLSB9YH8"

--YQ2ATZ6TJHAYDLSB9YH8
Content-type: text/plain; 

This is my plain text content.
Attached you should find a simple smiley face image.
--YQ2ATZ6TJHAYDLSB9YH8
Content-Type: image/jpeg
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="smiley.jpg"

/9j/4AAQSkZJRgABAQEAYABgAAD/2wBDAAIBAQIBAQICAgICAgICAwUDAwMDAwYE
.
.  {content abbreviated}
.
baoX3hbS9T1W3v7nTbC4vrTHkXEtujyw4zjaxGVxk9D3NX6AP//Z
--YQ2ATZ6TJHAYDLSB9YH8--

Note each of the sub-blocks is constructed just as they would be if they were the entire content. The only difference is they are delimited by the boundary strings.

Content-Type: multipart/related

The last type we’ll look at is multipart/related content. That is, declaring multiple blocks that can reference each other. The most common usage for this type is using HTML which can then reference embedded images in a later block to position them within text, table, or other formatted output. Since it is a multipart region you must define a boundary. By default, the first block of related content is called the root. The root is rendered by the client and references the other blocks as needed to fill in. Each other than the root must have a Content-ID field with a value enclosed in angle-brackets (< >). If you want a block other than the first block to be the root you must give that block a Content-ID as well and then explicitly inform the client the block is the root with the start field in the main multipart/related header. In the example below I have the root as the first block but will make it explicit anyway.

Within each block, you can reference the content of another block through its content id using cid:xxxxxx syntax, where xxxxx is the Content-ID of the referenced block. Below is an outline of related content where an HTML block references two other blocks each of which contains an image.

MIME-Version: 1.0
Content-type:multipart/related; boundary="related-boundary"
start="<related-root>";

--related-boundary
Content-type: text/html;
Content-ID: <related-root>

...
image 1 <img src="cid:0123456789">
<br>
image 2 <img src="cid:987654321">
...
--related-boundary
Content-Type: image/jpeg
Content-Transfer-Encoding: base64
Content-ID: <0123456789>
Content-Location: image1.jpg

...
--related-boundary
Content-Type: image/jpeg
Content-Transfer-Encoding: base64
Content-ID: <987654321>
Content-Location: image2.jpg
...
--related-boundary--

There are other MIME types and subtypes but the ones above are the most common and will be the ones used in the articles and code samples in the remainder of this series.

Base64 Encoding

A few places above I mention base64 encoding as a means of transforming data into a form more easily processed by SMTP. For short RAW values, you can use UTL_ENCODE.base64_encode. That will return another RAW value with original bytes converted 3 at a time into their 4 byte base64 form. This RAW value can then be written directly into the SMTP server with UTL_SMTP.WRITE_RAW_DATA, or, my recommended method – converted to a VARCHAR2 value which can be appended to the email body before initiating the transmission. This will be discussed in the article on attachments.

UTL_ENCODE package only accepts and returns RAW values, which have a limit of 32767 bytes. Since base64 encoding increases the size by about a third, the maximum input value is limited to a little over 23K. For larger values, we would use a BLOB type for the input and we would expect a CLOB type as the output for use in constructing the email body before sending the email.

To encode a large BLOB value we will iterate through it extracting smaller RAW values, encode them, and then cast those to VARCHAR2 and append to a clob value. Since base64 transforms 3 input bytes at a time. Each chunk must have a maximum length that is a multiple of 3 to ensure we don’t split an input triplet. Also, the input must be small enough that the output of UTL_ENCODE.base64_encode will still fit inside a 32K RAW value. That value after being cast as a VARCHAR2 must also fit inside a 32K VARCHAR2 value in order to be appended to the output CLOB.

CREATE OR REPLACE FUNCTION encode_blob_as_base64_clob(p_blob IN BLOB)
    RETURN CLOB
IS
    -- The chunk size should be a large multiple of 3
    -- to reduce the number of iterations needed.
    c_chunk_size   CONSTANT PLS_INTEGER := 21000;
    v_len                   PLS_INTEGER;
    v_index                 PLS_INTEGER;
    v_chunk                 RAW(32767);
    v_buffer                VARCHAR2(32767);
    v_clob                  CLOB;
BEGIN
    v_len := DBMS_LOB.getlength(p_blob);
    v_index := 1;

    DBMS_LOB.createtemporary(v_clob, TRUE);

    WHILE v_index <= v_len
    LOOP
        -- The algorithm encodes 3 input bytes as 4 single-byte characters
        -- Thus we'll loop in chunks with a max size that is a multiple of 3.
        -- We want it to be large to reduce the number of calls
        -- but small enough that UTL_ENCODE.base64_encode can process them
        -- and the output can then be converted to a varchar2
        v_chunk := DBMS_LOB.SUBSTR(p_blob, c_chunk_size, v_index);
        v_index := v_index + c_chunk_size;

        v_buffer := UTL_RAW.cast_to_varchar2(UTL_ENCODE.base64_encode(v_chunk));

        DBMS_LOB.writeappend(v_clob, LENGTH(v_buffer), v_buffer);
    END LOOP;

    RETURN v_clob;
END;

The next two articles, discussing attachments and formatted content will make extensive use of MIME.

Thank you for reading. Questions and comments, as always, are welcome.