


   regarding line breaks detailed in the previous section must also be
   observed -- a character set whose definition does not conform to
   these rules cannot be used in a MIME "text" subtype.

   An initial list of predefined character set names can be found at the
   end of this section.  Additional character sets may be registered
   with IANA.

   Media types other than subtypes of "text" might choose to employ the
   charset parameter as defined here, but with the CRLF/line break
   restriction removed.  Therefore, all character sets that conform to
   the general definition of "character set" in RFC 2045 can be
   registered for MIME use.

   Note that if the specified character set includes 8-bit characters
   and such characters are used in the body, a Content-Transfer-Encoding
   header field and a corresponding encoding on the data are required in
   order to transmit the body via some mail transfer protocols, such as
   SMTP [RFC-821].
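
   As a rough illustration of the note above, the following Python
   sketch checks a body for octets above 127 and, if any are present,
   picks a transfer encoding and applies it.  The helper names and the
   one-in-six threshold for preferring quoted-printable over base64 are
   assumptions made for this example, not rules from this document.
   The sketch also ignores the line-length limits that apply to
   unencoded data.

```python
import base64
import quopri
from typing import Tuple


def choose_transfer_encoding(body: bytes) -> str:
    """Pick a Content-Transfer-Encoding label (hypothetical helper)."""
    if all(octet < 128 for octet in body):
        return "7bit"
    # Assumption: mostly 7-bit text stays readable as quoted-printable;
    # bodies dense in 8-bit octets are more compact as base64.
    high = sum(1 for octet in body if octet >= 128)
    return "quoted-printable" if high * 6 < len(body) else "base64"


def encode_body(body: bytes) -> Tuple[str, bytes]:
    """Return the chosen encoding label and the encoded body."""
    cte = choose_transfer_encoding(body)
    if cte == "quoted-printable":
        return cte, quopri.encodestring(body)
    if cte == "base64":
        return cte, base64.encodebytes(body)
    return cte, body
```

   For example, encode_body(b"caf\xe9 au lait") selects
   "quoted-printable", while a body made only of 7-bit octets is
   returned unchanged with the label "7bit".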

   The default character set, US-ASCII, has been the subject of some
   confusion and ambiguity in the past.  Not only were there some
   ambiguities in the definition, there have been wide variations in
   practice.  In order to eliminate such ambiguity and variations in the
   future, it is strongly recommended that new user agents explicitly
   specify a character set as a media type parameter in the Content-Type
   header field. "US-ASCII" does not indicate an arbitrary 7-bit
   character set, but specifies that all octets in the body must be
   interpreted as characters according to the US-ASCII character set.
   National and application-oriented versions of ISO 646 [ISO-646] are
   usually NOT identical to US-ASCII, and in that case their use in
   Internet mail is explicitly discouraged.  The omission of the ISO 646
   character set from this document is deliberate in this regard.  The
   character set name of "US-ASCII" explicitly refers to the character
   set defined in ANSI X3.4-1986 [US-ASCII].  The new international
   reference version (IRV) of the 1991 edition of ISO 646 is identical
   to US-ASCII.  The character set name "ASCII" is reserved and must not
   be used for any purpose.
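
   The labelling rules in this paragraph can be sketched in Python as
   follows.  The function name content_type_for_text is hypothetical.
   The sketch builds a Content-Type value with an explicit charset
   parameter, rejects the reserved name "ASCII", and relies on Python's
   strict "ascii" codec to confirm that a body labelled US-ASCII holds
   no octets above 127.

```python
def content_type_for_text(body: bytes, charset: str,
                          subtype: str = "plain") -> str:
    """Build a Content-Type value with an explicit charset parameter."""
    name = charset.upper()
    if name == "ASCII":
        # The bare name "ASCII" is reserved and must not be used.
        raise ValueError('charset name "ASCII" is reserved; use "US-ASCII"')
    if name == "US-ASCII":
        # Strict check: raises UnicodeDecodeError on any octet above 127.
        body.decode("ascii")
    return f"text/{subtype}; charset={charset}"
```

   Under these assumptions, content_type_for_text(b"hi", "US-ASCII")
   yields "text/plain; charset=US-ASCII", while a body such as
   b"caf\xe9" labelled US-ASCII raises an error instead of being
   mislabelled.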

   NOTE: RFC 821 explicitly specifies "ASCII", and references an earlier
   version of the American Standard.  Insofar as one of the purposes of
   specifying a media type and character set is to permit the receiver
   to unambiguously determine how the sender intended the coded message
   to be interpreted, assuming anything other than "strict ASCII" as the
   default would risk unintentional and incompatible changes to the
   semantics of messages now being transmitted.  This also implies that




 
RFC 2046                      Media Types                  November 1996


   messages containing characters coded according to versions of ISO 646
   other than US-ASCII and the 1991 IRV, or using code-switching
   procedures (e.g., those of ISO 2022), as well as 8-bit or
   multiple-octet character encodings, MUST use an appropriate character
   set specification to be consistent with MIME.

   The complete US-ASCII character set is listed in ANSI X3.4-1986.
   Note that the control characters including DEL (0-31, 127) have no
   defined meaning apart from the combination CRLF (US-ASCII values
   13 and 10) indicating a new line.  Two of the characters have de
   facto meanings in wide use: FF (12) often means "start subsequent
   text on the beginning of a new page"; and TAB or HT (9) often (though
   not always) means "move the cursor to the next available column after
   the current position where the column number is a multiple of 8
   (counting the first column as column 0)."  Aside from these
   conventions, any use of the control characters or DEL in a body must
   either occur

    (1)   because a subtype of text other than "plain"
          specifically assigns some additional meaning, or

    (2)   within the context of a private agreement between the
          sender and recipient. Such private agreements are
          discouraged and should be replaced by the other
          capabilities of this document.
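
   The conventions above can be sketched in Python.  The helper names
   stray_controls and expand_tabs are hypothetical.  The first reports
   every control octet (0-31 or 127) that is not part of a CRLF pair
   and is not FF or HT; the second applies the common tab convention
   using the standard str.expandtabs method, which counts the first
   column as column 0.

```python
ALLOWED_ALONE = {9, 12}  # HT and FF, which have de facto meanings


def stray_controls(body: bytes) -> list:
    """Return (offset, octet) pairs for controls with no conventional meaning."""
    found = []
    i = 0
    while i < len(body):
        octet = body[i]
        if octet == 13 and body[i + 1:i + 2] == b"\n":
            i += 2  # a CR immediately followed by LF marks a new line
            continue
        if (octet < 32 or octet == 127) and octet not in ALLOWED_ALONE:
            found.append((i, octet))
        i += 1
    return found


def expand_tabs(line: str) -> str:
    """Advance each HT to the next column that is a multiple of 8."""
    return line.expandtabs(8)
```

   A bare CR or a bare LF is reported as stray, since only the CRLF
   combination indicates a new line.  Whether a reported octet is
   actually acceptable still depends on the subtype or on a private
   agreement, which this sketch does not model.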

   NOTE: An enormous proliferation of character sets exists beyond US-
   ASCII.  A large number of partially or totally overlapping character
   sets is NOT a good thing.  A SINGLE character set that can be used
   universally for representing all of the world's languages in Internet
   mail would be preferable.  Unfortunately, existing practice in
   several communities seems to point to the continued use of multiple
   character sets in the near future.  A small number of standard
   character sets are, therefore, defined for Internet use in this
   document.

   The defined charset values are:

    (1)   US-ASCII -- as defined in ANSI X3.4-1986 [US-ASCII].

    (2)   ISO-8859-X -- where "X" is to be replaced, as
          necessary, for the parts of ISO-8859 [ISO-8859].  Note
          that the ISO 646 character sets have deliberately been
          omitted in favor of their 8859 replacements, which are
          the designated character sets for Internet mail.  As of
          the publication of this document, the legitimate values