The XML specification is a work product of the World Wide Web
Consortium's XML Working Group, and was edited by:
Tim Bray <tbray@textuality.com>
Jean Paoli <jeanpa@microsoft.com>
C. M. Sperberg-McQueen <cmsmcq@uic.edu>
The W3C and the W3C XML Working Group have change control over
the XML specification.
4 Security Considerations
XML, as a subset of SGML, has the same security considerations as
specified in [RFC-1874].
To paraphrase section 3 of [RFC-1874], XML entities contain
information to be parsed and processed by the recipient's XML system.
These entities may contain, and such systems may permit, explicit
system-level commands to be executed while processing the data. To
the extent that an XML system will execute arbitrary command strings,
recipients of XML entities may be at risk. In general, it may be
possible to specify commands that perform unauthorized file
operations or make changes to the display processor's environment
that affect subsequent operations.
Use of XML is expected to be varied and widespread. XML is under
scrutiny by a wide range of communities for use as a common syntax
for community-specific metadata. For example, the Dublin Core group
is using XML for document metadata, and a new effort has begun which
is considering use of XML for medical information. Other groups view
XML as a mechanism for marshalling parameters for remote procedure
calls. More uses of XML will undoubtedly arise.
Security considerations will vary by domain of use. For example, XML
medical records will have much more stringent privacy and security
considerations than XML library metadata. Similarly, use of XML as a
parameter marshalling syntax necessitates a case-by-case security
review.
XML may also have some of the same security concerns as plain text.
Like plain text, XML can contain escape sequences which, when
displayed, have the potential to change the display processor
environment in ways that adversely affect subsequent operations.
Possible effects include, but are not limited to, locking the
keyboard, changing display parameters so subsequent displayed text is
unreadable, or even changing display parameters to deliberately
RFC 2376 XML Media Types July 1998
obscure or distort subsequent displayed material so that its meaning
is lost or altered. Display processors should either filter such
material from displayed text or else make sure to reset all important
settings after a given display operation is complete.
Some terminal devices have keys whose output, when pressed, can be
changed by sending the display processor a character sequence. If
this is possible, the display of a text object containing such
character sequences could reprogram keys to perform some illicit or
dangerous action when the key is subsequently pressed by the user.
In some cases not only can keys be programmed, they can be triggered
remotely, making it possible for a text display operation to directly
perform some unwanted action. As such, the ability to program keys
should be blocked either by filtering or by disabling the ability to
program keys entirely.
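
The filtering approach recommended above can be sketched as follows.
This is a minimal illustration in Python, not part of this
specification: the function name strip_control_characters and the
choice of which characters to remove are assumptions. It drops the C0
control characters other than TAB, LF and CR, together with DEL and
the C1 controls, so that an escape sequence loses its introducer and
is shown as inert printable text.

```python
import re

# Assumed policy: remove C0 controls except TAB (0x09), LF (0x0A) and
# CR (0x0D), plus DEL (0x7F) and the C1 controls (0x80-0x9F).
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\x9f]")


def strip_control_characters(text: str) -> str:
    """Return text with terminal control characters removed."""
    return _CONTROL_CHARS.sub("", text)


if __name__ == "__main__":
    # ESC (0x1B) is removed, so the sequence can no longer be
    # interpreted by the display processor.
    print(repr(strip_control_characters("before\x1b[2Jafter")))
```

A display processor that cannot filter would instead need to reset
its settings after each display operation, which this sketch does not
attempt.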
Note that it is also possible to construct XML documents which make
use of what XML terms "entity references" (using the XML meaning of
the term "entity", which differs from the MIME definition of this
term), to construct repeated expansions of text. Recursive expansions
are prohibited [REC-XML] and XML processors are required to detect
them. However, even non-recursive expansions may exhaust the finite
computing resources of the recipient's system if they are performed
many times.
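
One way a recipient might guard against this is to measure how large
an entity would become before expanding it. The sketch below is an
illustration only: the function expanded_length, its limit parameter,
the simplified reference pattern, and the representation of entity
declarations as a dictionary are all assumptions, and every referenced
entity is assumed to be declared in that dictionary.

```python
import re

# Simplified pattern for general entity references such as "&name;".
_REF = re.compile(r"&([A-Za-z_:][\w.:\-]*);")


def expanded_length(name, entities, limit=1_000_000):
    """Return the fully expanded length of entity `name`.

    `entities` maps entity names to replacement text. Raises
    ValueError on a recursive reference or when the expansion would
    exceed `limit` characters.
    """
    cache = {}

    def measure(entity, active):
        if entity in active:
            raise ValueError(f"recursive entity reference: {entity}")
        if entity in cache:
            return cache[entity]
        text = entities[entity]
        # Count literal characters, then add each referenced entity.
        total = len(_REF.sub("", text))
        if total > limit:
            raise ValueError(f"expansion of {entity} exceeds limit")
        for ref in _REF.findall(text):
            total += measure(ref, active | {entity})
            if total > limit:
                raise ValueError(f"expansion of {entity} exceeds limit")
        cache[entity] = total
        return total

    return measure(name, frozenset())


if __name__ == "__main__":
    # Three levels of ten references each: 10 * 10 * 10 characters.
    declared = {"a": "x" * 10, "b": "&a;" * 10, "c": "&b;" * 10}
    print(expanded_length("c", declared))
```

Because the length is computed without building the expanded text,
the check itself stays cheap even when the expansion would be large.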
5 The Byte Order Mark (BOM) and Conversions to/from UTF-16
The XML Recommendation, in section 4.3.3, specifies that UTF-16 XML
entities must begin with a byte order mark (BOM), which is the ZERO
WIDTH NO-BREAK SPACE character, hexadecimal sequence 0xFEFF (or
0xFFFE, depending on byte order). The XML Recommendation further states
that the BOM is an encoding signature, and is not part of either the
markup or the character data of the XML document.
Because the BOM is not part of the document, applications which
convert XML from the UTF-16 encoding to another encoding SHOULD strip
the BOM before conversion.
Similarly, when converting from another encoding into UTF-16, the BOM
SHOULD be added after conversion is complete.
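
The two conversions can be sketched in Python as follows. This is an
illustration under stated assumptions, not a definitive
implementation: the function names utf16_to_utf8 and utf8_to_utf16 and
the big_endian parameter are hypothetical, UTF-8 is chosen as the
other encoding only as an example, and any encoding declaration inside
the XML entity is left untouched and would need to be updated
separately.

```python
import codecs


def utf16_to_utf8(data: bytes) -> bytes:
    """Strip the BOM from a UTF-16 entity, then convert to UTF-8."""
    if data.startswith(codecs.BOM_UTF16_BE):
        text = data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    elif data.startswith(codecs.BOM_UTF16_LE):
        text = data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    else:
        raise ValueError("UTF-16 XML entity does not begin with a BOM")
    return text.encode("utf-8")


def utf8_to_utf16(data: bytes, big_endian: bool = True) -> bytes:
    """Convert a UTF-8 entity to UTF-16, then add the BOM."""
    # "utf-8-sig" drops a leading UTF-8 signature if one is present,
    # so the result never carries two signatures.
    text = data.decode("utf-8-sig")
    if big_endian:
        return codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


if __name__ == "__main__":
    utf16 = utf8_to_utf16(b"<a/>")
    print(utf16)
    print(utf16_to_utf8(utf16))
```

The byte order is named explicitly in both directions so that the
result does not depend on the platform running the conversion.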
6 Examples
The examples below give the value of the Content-type MIME header and
the XML declaration (which includes the encoding declaration) inside
the XML entity. For UTF-16 examples, the Byte Order Mark character
is denoted as "{BOM}", and the XML declaration is assumed to come at
the beginning of the XML entity, immediately following the BOM. Note