Rule #3 (White Space): Octets with values of 9 and 32 MAY be
represented as ASCII TAB (HT) and SPACE characters, respectively,
but MUST NOT be so represented at the end of an encoded line. Any
TAB (HT) or SPACE characters on an encoded line MUST thus be
followed on that line by a printable character. In particular, an
RFC 1521 MIME September 1993
"=" at the end of an encoded line, indicating a soft line break
(see Rule #5), may follow one or more TAB (HT) or SPACE characters.
It follows that an octet with value 9 or 32 appearing at the end
of an encoded line must be represented according to Rule #1. This
rule is necessary because some MTAs (Message Transport Agents,
programs which transport messages from one user to another, or
perform a part of such transfers) are known to pad lines of text
with SPACEs, and others are known to remove "white space"
characters from the end of a line. Therefore, when decoding a
Quoted-Printable body, any trailing white space on a line must be
deleted, as it will necessarily have been added by intermediate
transport agents.
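
The two halves of Rule #3 can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of this specification. The sketch assumes that each line is handled separately, without its CRLF, and that only the final TAB or SPACE needs the Rule #1 form, since any earlier white space is then followed by a printable character.

```python
def protect_trailing_whitespace(line: str) -> str:
    """Encoder side: quote a TAB or SPACE that would end the encoded line."""
    if line and line[-1] in " \t":
        # Rule #1 form: "=" followed by two uppercase hexadecimal digits.
        return line[:-1] + "=%02X" % ord(line[-1])
    return line


def strip_transport_padding(encoded_line: str) -> str:
    """Decoder side: delete trailing white space added by transport agents."""
    return encoded_line.rstrip(" \t")
```

Because a conforming encoder never leaves literal white space at the end of a line, the decoder-side deletion cannot discard original data.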
Rule #4 (Line Breaks): A line break in a text body, independent of
what its representation is following the canonical representation
of the data being encoded, must be represented by a (RFC 822) line
break, which is a CRLF sequence, in the Quoted-Printable encoding.
Since the canonical representation of types other than text does
not generally include the representation of line breaks, no hard
line breaks (i.e., line breaks that are intended to be meaningful
and to be displayed to the user) should occur in the
quoted-printable encoding of such types. Of course, occurrences of
"=0D", "=0A", "=0A=0D" and "=0D=0A" will eventually be encountered.
In general,
however, base64 is preferred over quoted-printable for binary
data.
Note that many implementations may elect to encode the local
representation of various content types directly, as described in
Appendix G. In particular, this may apply to plain text material
on systems that use newline conventions other than CRLF
delimiters. Such an implementation is permissible, but the
generation of line breaks must be generalized to account for the
case where alternate representations of newline sequences are
used.
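
As a hedged illustration of Rule #4, the following Python sketch separates the two cases. The function names are hypothetical. The first function assumes text whose local line breaks are CRLF, a lone CR, or a lone LF, and maps them all to CRLF. The second assumes non-text data, where CR and LF octets are ordinary data and are therefore quoted according to Rule #1; for simplicity it also quotes SPACE and TAB, which is permitted, and it does not insert soft line breaks.

```python
import re


def canonical_line_breaks(text: str) -> str:
    """Represent every line break in a text body as a CRLF sequence."""
    return re.sub(r"\r\n|\r|\n", "\r\n", text)


def quote_binary(data: bytes) -> str:
    """Quote non-text data so that no hard line break appears in the output."""
    out = []
    for octet in data:
        if 33 <= octet <= 126 and octet != 61:
            out.append(chr(octet))        # literal representation
        else:
            out.append("=%02X" % octet)   # includes =0D and =0A
    return "".join(out)
```

For example, quote_binary(b"a\r\nb") yields "a=0D=0Ab", which is one of the sequences mentioned above.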
Rule #5 (Soft Line Breaks): The Quoted-Printable encoding REQUIRES
that encoded lines be no more than 76 characters long. If longer
lines are to be encoded with the Quoted-Printable encoding, 'soft'
line breaks must be used. An equal sign as the last character on an
encoded line indicates such a non-significant ('soft') line break
in the encoded text. Thus if the "raw" form of the line is a
single unencoded line that says:

     Now's the time for all folk to come to the aid of their country.

This can be represented, in the Quoted-Printable encoding, as

     Now's the time =
     for all folk to come=
      to the aid of their country.

This provides a mechanism with which long lines are encoded in
such a way as to be restored by the user agent. The 76 character
limit does not count the trailing CRLF, but counts all other
characters, including any equal signs.
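
The soft line break mechanism can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the input to add_soft_breaks is assumed to be a single line already encoded under Rules #1 through #3 (so every "=" begins a three-character "=XX" sequence), and remove_soft_breaks only rejoins lines without decoding "=XX" sequences.

```python
MAX_LINE = 76  # limit from Rule #5; the trailing CRLF is not counted


def add_soft_breaks(encoded: str, limit: int = MAX_LINE) -> list:
    """Split one encoded line into lines of at most `limit` characters."""
    lines = []
    while len(encoded) > limit:
        cut = limit - 1  # leave room for the "=" that marks the soft break
        # Never split an "=XX" sequence across two lines.
        if encoded[cut - 1] == "=":
            cut -= 1
        elif encoded[cut - 2] == "=":
            cut -= 2
        lines.append(encoded[:cut] + "=")
        encoded = encoded[cut:]
    lines.append(encoded)
    return lines


def remove_soft_breaks(lines: list) -> str:
    """Rejoin encoded lines; a trailing "=" means the break is not data."""
    out = []
    for line in lines:
        line = line.rstrip(" \t")      # transport padding, see Rule #3
        if line.endswith("="):
            out.append(line[:-1])      # soft break: drop "=" and the CRLF
        else:
            out.append(line + "\r\n")  # hard break: keep as CRLF
    return "".join(out)
```

Applied to the three encoded lines of the example, where the third line begins with a SPACE, remove_soft_breaks restores the single raw line followed by one CRLF.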
Since the hyphen character ("-") is represented as itself in the
Quoted-Printable encoding, care must be taken, when encapsulating a
quoted-printable encoded body in a multipart entity, to ensure that
the encapsulation boundary does not appear anywhere in the encoded
body. (A good strategy is to choose a boundary that includes a
character sequence such as "=_" which can never appear in a quoted-
printable body. See the definition of multipart messages later in
this document.)
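
The boundary strategy suggested above can be sketched in Python. The function names and the choice of a random hexadecimal suffix are assumptions made for illustration only. The sequence "=_" is safe because, in a quoted-printable body, "=" is always followed either by two hexadecimal digits or by the end of the line, and never by "_".

```python
import secrets


def make_boundary() -> str:
    """Return a boundary string that begins with "=_" (hypothetical scheme)."""
    return "=_" + secrets.token_hex(16)


def boundary_is_safe(boundary: str, encoded_body: str) -> bool:
    """Conservative check: the delimiter must not occur anywhere in the body."""
    return ("--" + boundary) not in encoded_body
```

The explicit check remains useful when the same boundary is used around parts that are not quoted-printable encoded.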
NOTE: The quoted-printable encoding represents something of a
compromise between readability and reliability in transport.
Bodies encoded with the quoted-printable encoding will work
reliably over most mail gateways, but may not work perfectly over
a few gateways, notably those involving translation into EBCDIC.
(In theory, an EBCDIC gateway could decode a quoted-printable body
and re-encode it using base64, but such gateways do not yet
exist.) A higher level of confidence is offered by the base64
Content-Transfer-Encoding. A way to get reasonably reliable
transport through EBCDIC gateways is to also quote the ASCII
characters
     !"#$@[\]^`{|}~