RFC 2396 URI Generic Syntax August 1998
G. Summary of Non-editorial Changes
G.1. Additions
Section 4 (URI References) was added to stem the confusion regarding
"what is a URI" and how to describe fragment identifiers given that
they are not part of the URI, but are part of the URI syntax and
parsing concerns. In addition, it provides a reference definition
for use by other IETF specifications (HTML, HTTP, etc.) that have
previously attempted to redefine the URI syntax in order to account
for the presence of fragment identifiers in URI references.
Section 2.4 was rewritten to clarify a number of misinterpretations
and to leave room for fully internationalized URIs.
Appendix F on abbreviated URLs was added to describe the shortened
references often seen on television and in magazine advertisements
and to explain why they are not used in other contexts.
G.2. Modifications from both RFC 1738 and RFC 1808
The specification now defines URI syntax rather than just URL syntax.
Confusion regarding the terms "character encoding", the URI
"character set", and the escaping of characters with %<hex><hex>
equivalents has (hopefully) been reduced. Many of the BNF rule names
regarding the character sets have been changed to more accurately
describe their purpose and to encompass all "characters" rather than
just US-ASCII octets. Unless otherwise noted here, these
modifications do not affect the URI syntax.
Both RFC 1738 and RFC 1808 refer to the "reserved" set of characters
as if URI-interpreting software were limited to a single set of
characters with a reserved purpose (i.e., as meaning something other
than the data to which the characters correspond), and that this set
was fixed by the URI scheme. However, this has not been true in
practice; any character that is interpreted differently when it is
escaped is, in effect, reserved. Furthermore, the interpreting
engine on an HTTP server is often dependent on the resource, not just
the URI scheme. The description of reserved characters has been
changed accordingly.
The plus "+", dollar "$", and comma "," characters have been added to
those in the "reserved" set, since they are treated as reserved
within the query component.
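
As a rough illustration of what reserving these characters means for
software that builds query components, the following Python sketch
escapes a data value so that "+", "$", and "," are carried as data
rather than read as delimiters. The helper name escape_query_data and
the sample value are assumptions made for this example; they do not
come from this specification.

```python
from urllib.parse import quote

# The "reserved" set as listed in RFC 2396.
RESERVED = set(";/?:@&=+$,")
# The three characters this revision added to that set.
NEWLY_RESERVED = set("+$,")


def escape_query_data(value: str) -> str:
    """Escape a value so none of its characters act as query delimiters.

    Hypothetical helper for illustration only.
    """
    # safe="" asks quote() to escape "/" as well, which it keeps by default.
    return quote(value, safe="")


print(escape_query_data("a+b,c$d"))  # a%2Bb%2Cc%24d
```

Escaping everything outside the unreserved set is the conservative
choice; a real application may leave some reserved characters
unescaped where its scheme gives them no special meaning.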
The tilde "~" character was added to those in the "unreserved" set,
since it is extensively used on the Internet despite the difficulty
of transcribing it on some keyboards.
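
The effect of this change can be sketched in Python as follows. The
sets below follow the "unreserved" and "mark" rules of RFC 2396; the
function name escape_unless_unreserved and the choice of UTF-8 for
converting characters to octets are assumptions of this example, not
requirements of the specification.

```python
import string

# "mark" characters from RFC 2396; the tilde is the new member.
MARK = "-_.!~*'()"
UNRESERVED = set(string.ascii_letters + string.digits + MARK)


def escape_unless_unreserved(text: str) -> str:
    """Escape every octet that is not an unreserved character.

    Hypothetical helper; UTF-8 is assumed for the octet conversion.
    """
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        out.append(ch if ch in UNRESERVED else f"%{byte:02X}")
    return "".join(out)


print(escape_unless_unreserved("~user"))  # ~user
print(escape_unless_unreserved("a b"))    # a%20b
```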
The syntax for the URI scheme has been changed to require that all
schemes begin with an alpha character.
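
A minimal sketch of a check for this rule, assuming the scheme
grammar of RFC 2396 (an alpha character followed by any number of
alpha, digit, "+", "-", or "." characters). The names SCHEME_RE and
is_valid_scheme are illustrative only.

```python
import re

# scheme = alpha *( alpha | digit | "+" | "-" | "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def is_valid_scheme(scheme: str) -> bool:
    """Return True if the whole string matches the scheme grammar."""
    return SCHEME_RE.fullmatch(scheme) is not None


print(is_valid_scheme("http"))     # True
print(is_valid_scheme("svn+ssh"))  # True
print(is_valid_scheme("3com"))     # False: starts with a digit
```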
The "user:password" form in the previous BNF was changed to a
"userinfo" token, and the possibility that it might be
"user:password" made scheme specific. In particular, the use of
passwords in the clear is not even suggested by the syntax.
The question-mark "?" character was removed from the set of allowed
characters for the userinfo in the authority component, since testing
showed that many applications treat it as reserved for separating the
query component from the rest of the URI.
The semicolon ";" character was added to those stated as being
reserved within the authority component, since several new schemes
are using it as a separator within userinfo to indicate the type of
user authentication.
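
The two userinfo changes above can be observed with a short Python
sketch. It relies on the standard urllib.parse module, which follows
later revisions of the URI syntax but behaves the same way for these
inputs. The helper split_userinfo, the host name host.example, and
the "auth=digest" parameter are hypothetical and chosen only for
illustration.

```python
from urllib.parse import urlsplit


def split_userinfo(uri: str):
    """Return (userinfo, hostport); userinfo is None when absent.

    Hypothetical helper for illustration only.
    """
    netloc = urlsplit(uri).netloc
    userinfo, at, hostport = netloc.rpartition("@")
    return (userinfo if at else None), hostport


# ";" stays inside userinfo, where a scheme may use it as a separator.
print(split_userinfo("ftp://user;auth=digest@host.example/file"))
# ('user;auth=digest', 'host.example')

# "?" ends the authority, so no userinfo is recognized here.
print(split_userinfo("http://user?x@host.example/"))
# (None, 'user')
```

In the second call everything after the "?" is parsed as the query
component, which matches the behavior that motivated removing "?"
from the characters allowed in userinfo.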