!"#$@[\]^`{|}~
according to rule #1. See Appendix B for more information.
Because quoted-printable data is generally assumed to be line-
oriented, it is to be expected that the representation of the breaks
between the lines of quoted-printable data may be altered in
transport, in the same manner that plain text mail has always been
altered in Internet mail when passing between systems with differing
newline conventions. If such alterations are likely to constitute a
corruption of the data, it is probably more sensible to use the
base64 encoding rather than the quoted-printable encoding.
RFC 1521 MIME September 1993
WARNING TO IMPLEMENTORS: If binary data are encoded in quoted-
printable, care must be taken to encode CR and LF characters as "=0D"
and "=0A", respectively. In particular, a CRLF sequence in binary
data should be encoded as "=0D=0A". Otherwise, if CRLF were
represented as a hard line break, it might be incorrectly decoded on
platforms with different line break conventions.
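The warning above can be illustrated with a minimal Python sketch. It is not a reference encoder: the function name qp_encode_binary and its max_len parameter are hypothetical, and the sketch conservatively encodes SPACE and TAB as well, which is permitted but not required. The only line breaks it emits are soft line breaks ("=" followed by CRLF), so every CR and LF octet in the input appears as "=0D" or "=0A".

```python
def qp_encode_binary(data: bytes, max_len: int = 76) -> str:
    """Encode arbitrary octets as quoted-printable without hard line breaks.

    Hypothetical helper for illustration only.
    """
    lines = []
    line = ""
    for byte in data:
        # Printable ASCII other than "=" is passed through literally.
        # Everything else, including CR (0x0D), LF (0x0A), SPACE and TAB,
        # is written as "=" followed by two uppercase hex digits.
        if 33 <= byte <= 126 and byte != 0x3D:
            token = chr(byte)
        else:
            token = "=%02X" % byte
        # Keep one column free for the "=" of a soft line break.
        if len(line) + len(token) > max_len - 1:
            lines.append(line + "=")
            line = ""
        line += token
    lines.append(line)
    # CRLF appears in the output only after a soft line break marker.
    return "\r\n".join(lines)
```

For example, qp_encode_binary(b"a\r\nb") yields "a=0D=0Ab", so the CRLF in the data survives any newline conversion applied in transport.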
For formalists, the syntax of quoted-printable data is described by
the following grammar:
   quoted-printable := ([*(ptext / SPACE / TAB) ptext] ["="] CRLF)
        ; Maximum line length of 76 characters excluding CRLF

   ptext := octet / <any ASCII character except "=", SPACE, or TAB>
        ; characters not listed as "mail-safe" in Appendix B
        ; are also not recommended.

   octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F")
        ; octet must be used for characters > 127, =, SPACE, or TAB,
        ; and is recommended for any characters not listed in
        ; Appendix B as "mail-safe".
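As an informal illustration, the grammar above can be transcribed into a regular expression that checks a single line with its CRLF already removed. This is a sketch under stated assumptions: the names OCTET, PTEXT, QP_LINE and is_valid_qp_line are hypothetical, and "any ASCII character" is narrowed here to the printable range 33 through 126, which is stricter than the grammar's wording.

```python
import re

# octet := "=" followed by two uppercase hexadecimal digits.
OCTET = r"=[0-9A-F]{2}"
# ptext := octet, or a printable ASCII character other than "=", SPACE, TAB.
# Assumption: only codes 33-60 and 62-126 are accepted as literals.
PTEXT = rf"(?:{OCTET}|[!-<>-~])"
# A line is optional content that ends in ptext, then an optional "=".
QP_LINE = re.compile(rf"(?:(?:{PTEXT}|[ \t])*{PTEXT})?=?")


def is_valid_qp_line(line: str) -> bool:
    """Check one line (CRLF removed) against the transcribed grammar."""
    return len(line) <= 76 and QP_LINE.fullmatch(line) is not None
```

Under this transcription, "caf=E9" is accepted, while "caf=e9" (lowercase hex digits) and a line ending in a bare SPACE are rejected.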
5.2. Base64 Content-Transfer-Encoding
The Base64 Content-Transfer-Encoding is designed to represent
arbitrary sequences of octets in a form that need not be humanly
readable. The encoding and decoding algorithms are simple, but the
encoded data are consistently only about 33 percent larger than the
unencoded data. This encoding is virtually identical to the one used
in Privacy Enhanced Mail (PEM) applications, as defined in RFC 1421.
The base64 encoding is adapted from RFC 1421, with one change: base64
eliminates the "*" mechanism for embedded clear text.
A 65-character subset of US-ASCII is used, enabling 6 bits to be
represented per printable character. (The extra 65th character, "=",
is used to signify a special processing function.)
NOTE: This subset has the important property that it is
represented identically in all versions of ISO 646, including
US-ASCII, and all characters in the subset are also represented
identically in all versions of EBCDIC. Other popular encodings,
such as the encoding used by the uuencode utility and the base85
encoding specified as part of Level 2 PostScript, do not share
these properties, and thus do not fulfill the portability
requirements a binary transport encoding for mail must meet.
The encoding process represents 24-bit groups of input bits as output
strings of 4 encoded characters. Proceeding from left to right, a
24-bit input group is formed by concatenating 3 8-bit input groups.
These 24 bits are then treated as 4 concatenated 6-bit groups, each
of which is translated into a single digit in the base64 alphabet.
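The grouping just described can be sketched in Python for one complete 24-bit group. The names ALPHABET and encode_group are hypothetical, the alphabet is assumed to be the usual A-Z, a-z, 0-9, "+", "/" sequence listed in Table 1, and input that is not a whole 3-octet group is deliberately not handled here.

```python
import string

# Assumption: the 64-character alphabet of Table 1, in index order.
ALPHABET = (string.ascii_uppercase + string.ascii_lowercase
            + string.digits + "+/")


def encode_group(group: bytes) -> str:
    """Encode exactly three octets as four base64 characters."""
    if len(group) != 3:
        raise ValueError("this sketch handles only complete 3-octet groups")
    # Concatenate the three 8-bit groups into one 24-bit value; the first
    # octet occupies the high-order bits.
    n = (group[0] << 16) | (group[1] << 8) | group[2]
    # Treat the 24 bits as four 6-bit groups, taken from left to right.
    sextets = [(n >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    # Each 6-bit value selects one character of the alphabet.
    return "".join(ALPHABET[s] for s in sextets)
```

For instance, the octets of "Man" (0x4D 0x61 0x6E) split into the 6-bit values 19, 22, 5, 46, which encode_group maps to "TWFu".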
When encoding a bit stream via the base64 encoding, the bit stream
must be presumed to be ordered with the most-significant-bit first.
That is, the first bit in the stream will be the high-order bit in
the first byte, and the eighth bit will be the low-order bit in the
first byte, and so on.
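The bit ordering can be made concrete with a small Python sketch. The helper names bit_at and sextet_at are hypothetical, and positions are counted from zero, so the "first bit" of the text is position 0 and the "eighth bit" is position 7.

```python
def bit_at(data: bytes, position: int) -> int:
    """Return the bit at a 0-based stream position, most significant first."""
    octet = data[position // 8]
    # Position 0 within an octet is its high-order bit, position 7 its
    # low-order bit.
    return (octet >> (7 - position % 8)) & 1


def sextet_at(data: bytes, index: int) -> int:
    """Assemble the 6-bit group number `index` from consecutive stream bits."""
    value = 0
    for position in range(6 * index, 6 * index + 6):
        value = (value << 1) | bit_at(data, position)
    return value
```

With this ordering, bit_at(b"\x80", 0) is 1 and bit_at(b"\x80", 7) is 0, and the four 6-bit groups of the octets "Man" come out as 19, 22, 5, 46. Reading each octet low-order bit first would produce different groups, which is why the ordering has to be fixed.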
Each 6-bit group is used as an index into an array of 64 printable
characters. The character referenced by the index is placed in the
output string. These characters, identified in Table 1, below, are
selected so as to be universally representable, and the set excludes
characters with particular significance to SMTP (e.g., ".", CR, LF)
and to the encapsulation boundaries defined in this document (e.g.,
"-").
                  Table 1: The Base64 Alphabet

   Value Encoding  Value Encoding  Value Encoding  Value Encoding
       0 A            17 R            34 i            51 z
       1 B            18 S            35 j            52 0
       2 C            19 T            36 k            53 1