Content-type: application/xml; charset="iso-2022-kr"
<?xml version="1.0" encoding="iso-2022-kr"?>
This example shows application/xml with Korean text (e.g., Hangul)
encoded following the specification in [RFC-1557]. Since the
charset parameter is provided, MIME and XML processors must treat the
enclosed entity as encoded per [RFC-1557], independent of whether the
XML entity has an internal encoding declaration (this example does
show such a declaration, which agrees with the charset parameter).
Since ISO-2022-KR has been defined to use only 7 bits of data, no
content-transfer-encoding is necessary with any transport.
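
As an illustration of the two claims above, the following minimal Python sketch reads the charset parameter from the Content-type value and checks that an ISO-2022-KR entity contains only 7-bit octets. The helper name charset_from_content_type and the sample entity are assumptions made for this sketch; Python's iso2022_kr codec stands in for an [RFC-1557] implementation.

```python
from email.message import Message

def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-type value, or None if omitted."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_param("charset")  # None when the parameter is absent

header = 'application/xml; charset="iso-2022-kr"'
charset = charset_from_content_type(header)

# Hypothetical sample entity, encoded with Python's iso2022_kr codec.
entity = '<?xml version="1.0" encoding="iso-2022-kr"?><doc>한글</doc>'
octets = entity.encode("iso2022_kr")

# Every octet fits in 7 bits, so no content-transfer-encoding is needed.
is_seven_bit = all(b < 0x80 for b in octets)

# The charset parameter alone is enough to decode the entity.
decoded = octets.decode(charset)
```

Here the decoder is chosen from the charset parameter only, mirroring the rule that the parameter is authoritative whether or not an encoding declaration is present.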
RFC 2376 XML Media Types July 1998
6.7 application/xml with Omitted Charset and UTF-16 XML Entity
Content-type: application/xml
{BOM}<?xml version='1.0'?>
For this example, the XML entity begins with a BOM. Since the
charset has been omitted, a conforming XML processor follows the
requirements of [REC-XML], section 4.3.3. Specifically, the XML
processor reads the BOM, and thus knows deterministically that the
charset encoding is UTF-16.
An XML-unaware MIME processor should make no assumptions about the
charset of the XML entity.
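
The BOM check described above can be sketched as follows. This is a simplified illustration, not a complete implementation of [REC-XML] section 4.3.3: the function name sniff_utf16_bom is hypothetical, and the sketch checks only the two UTF-16 byte order marks.

```python
import codecs

def sniff_utf16_bom(octets):
    """Return a UTF-16 codec name if the entity starts with a UTF-16 BOM, else None."""
    if octets.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "utf-16-be"
    if octets.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return "utf-16-le"
    return None

# Hypothetical entity: a BOM followed by UTF-16 (big-endian) text.
entity = codecs.BOM_UTF16_BE + "<?xml version='1.0'?><doc/>".encode("utf-16-be")

encoding = sniff_utf16_bom(entity)

# Python's generic "utf-16" codec reads the BOM itself and strips it.
text = entity.decode("utf-16")
```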
6.8 application/xml with Omitted Charset and UTF-8 Entity
Content-type: application/xml
<?xml version='1.0'?>
In this example, the charset parameter has been omitted, and there is
no BOM. Since there is no BOM, the XML processor follows the
requirements in section 4.3.3, and optionally applies the mechanism
described in appendix F (which is non-normative) of [REC-XML] to
determine that the charset encoding is UTF-8. The XML entity does not
contain an encoding declaration, but since the encoding is UTF-8,
this is still a conforming XML entity.
An XML-unaware MIME processor should make no assumptions about the
charset of the XML entity.
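
A rough sketch of the fallback described above, assuming the entity has no BOM and begins with ASCII-compatible octets: if an encoding declaration is present it is used, otherwise the entity is taken to be UTF-8. The function name encoding_without_bom and the regular expression are assumptions made for this sketch and cover only the common case.

```python
import re

# Matches an encoding declaration inside a leading XML declaration.
ENCODING_DECL = re.compile(
    rb"""^<\?xml[^>]*?\sencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)

def encoding_without_bom(octets):
    """Guess the encoding of a BOM-less, ASCII-compatible XML entity."""
    match = ENCODING_DECL.match(octets)
    if match:
        return match.group(1).decode("ascii")
    # No encoding declaration: the entity is treated as UTF-8.
    return "utf-8"

entity = b"<?xml version='1.0'?><doc/>"
encoding = encoding_without_bom(entity)
text = entity.decode(encoding)
```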
6.9 application/xml with Omitted Charset and Internal Encoding
Declaration
Content-type: application/xml
<?xml version='1.0' encoding="ISO-10646-UCS-4"?>
In this example, the charset parameter has been omitted, and there is
no BOM. However, the XML entity does have an encoding declaration
inside the XML entity which specifies the entity's charset. Following
the requirements in section 4.3.3, and optionally applying the
mechanism described in appendix F (non-normative) of [REC-XML], the
XML processor determines the charset encoding of the XML entity (in
this example, UCS-4).
An XML-unaware MIME processor should make no assumptions about the
charset of the XML entity.
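
The detection in this example can be sketched along the lines of appendix F of [REC-XML]: without a BOM, the first four octets of "<" in UCS-4 (00 00 00 3C, or 3C 00 00 00 for the reversed order) reveal the encoding family, after which the encoding declaration can be read. The function names are hypothetical, and Python's utf-32 codecs are used as a stand-in for UCS-4, which is an assumption of this sketch. The unusual octet orders listed in appendix F are not handled.

```python
import re

def sniff_ucs4(octets):
    """Return a Python codec name if the entity looks like BOM-less UCS-4, else None."""
    head = octets[:4]
    if head == b"\x00\x00\x00\x3c":  # "<" in big-endian UCS-4
        return "utf-32-be"
    if head == b"\x3c\x00\x00\x00":  # "<" in little-endian UCS-4
        return "utf-32-le"
    return None

def declared_encoding(octets):
    """Return the encoding named in the XML declaration of a UCS-4 entity, or None."""
    codec = sniff_ucs4(octets)
    if codec is None:
        return None
    text = octets.decode(codec)
    match = re.match(r"""<\?xml[^>]*?\sencoding\s*=\s*["']([^"']+)["']""", text)
    return match.group(1) if match else None

# Hypothetical entity matching the example: no BOM, big-endian UCS-4.
source = "<?xml version='1.0' encoding=\"ISO-10646-UCS-4\"?><doc/>"
entity = source.encode("utf-32-be")

codec = sniff_ucs4(entity)
declared = declared_encoding(entity)
```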
7 References
[ISO-10646] ISO/IEC, Information Technology - Universal Multiple-
Octet Coded Character Set (UCS) - Part 1: Architecture
and Basic Multilingual Plane, May 1993.
[ISO-8897] ISO (International Organization for Standardization) ISO
8879:1986(E) Information Processing -- Text and Office
Systems -- Standard Generalized Markup Language (SGML).
First edition -- 1986-10-15.
[REC-XML] T. Bray, J. Paoli, C. M. Sperberg-McQueen, "Extensible
Markup Language (XML)" World Wide Web Consortium
Recommendation REC-xml-19980210.
http://www.w3.org/TR/1998/REC-xml-19980210.
[RFC-1557] Choi, U., Chon, K., and H. Park. "Korean Character
Encoding for Internet Messages", RFC 1557. December,
1993.