It is useful to consider an example of how the base URI of a document
can be embedded within the document's content. In this appendix, we
describe how documents written in the Hypertext Markup Language
(HTML) [RFC1866] can include an embedded base URI. This appendix
does not form a part of the URI specification and should not be
considered as anything more than a descriptive example.
HTML defines a special element "BASE" which, when present in the
"HEAD" portion of a document, signals that the parser should use the
BASE element's "HREF" attribute as the base URI for resolving any
relative URI. The "HREF" attribute must be an absolute URI. Note
that, in HTML, element and attribute names are case-insensitive. For
example:
<!doctype html public "-//IETF//DTD HTML//EN">
<HTML><HEAD>
<TITLE>An example HTML document</TITLE>
<BASE href="http://www.ics.uci.edu/Test/a/b/c">
</HEAD><BODY>
... <A href="../x">a hypertext anchor</A> ...
</BODY></HTML>
A parser reading the example document should interpret the given
relative URI "../x" as representing the absolute URI
<http://www.ics.uci.edu/Test/a/x>
regardless of the context in which the example document was obtained.
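
The resolution described above can be sketched in Python. This is a minimal illustration, not part of the URI specification: the class name BaseAwareLinkParser, the function resolve_links, the retrieval URI passed in the usage note, and the HEAD/BODY wrapper in the sample markup are all assumptions made for the example. The sketch relies on the standard library's html.parser (which reports element and attribute names in lower case, matching HTML's case-insensitivity) and on urllib.parse.urljoin for the relative-to-absolute step.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BaseAwareLinkParser(HTMLParser):
    """Collect anchor targets, resolved against an embedded BASE if present."""

    def __init__(self, retrieval_uri):
        super().__init__()
        # Fall back to the retrieval URI until a BASE element is seen.
        self.base_uri = retrieval_uri
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag and attribute names for us.
        href = dict(attrs).get("href")
        if href is None:
            return
        if tag == "base":
            self.base_uri = href
        elif tag == "a":
            self.links.append(urljoin(self.base_uri, href))


def resolve_links(document, retrieval_uri):
    """Return the absolute form of every anchor href in the document."""
    parser = BaseAwareLinkParser(retrieval_uri)
    parser.feed(document)
    parser.close()
    return parser.links


# Sample markup modeled on the example above (wrapper tags are assumed).
EXAMPLE = """<!doctype html public "-//IETF//DTD HTML//EN">
<HEAD>
<BASE href="http://www.ics.uci.edu/Test/a/b/c">
</HEAD>
... <A href="../x">a hypertext anchor</A> ...
"""
```

Calling resolve_links(EXAMPLE, "http://example.org/elsewhere/doc.html") yields a single entry, "http://www.ics.uci.edu/Test/a/x", whatever retrieval URI is supplied, because the embedded base takes precedence.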
RFC 2396 URI Generic Syntax August 1998
E. Recommendations for Delimiting URI in Context
URI are often transmitted through formats that do not provide a clear
context for their interpretation. For example, there are many
occasions when URI are included in plain text; examples include text
sent in electronic mail, USENET news messages, and, most importantly,
printed on paper. In such cases, it is important to be able to
delimit the URI from the rest of the text, and in particular from
punctuation marks that might be mistaken for part of the URI.
In practice, URI are delimited in a variety of ways, but usually
within double-quotes "http://test.com/", angle brackets
<http://test.com/>, or just using whitespace
http://test.com/
These wrappers do not form part of the URI.
In the case where a fragment identifier is associated with a URI
reference, the fragment would be placed within the brackets as well
(separated from the URI with a "#" character).
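
As an illustration of the delimiting styles just described, the following Python sketch removes one surrounding pair of double quotes or angle brackets and then separates any fragment identifier. The function names strip_delimiters and split_fragment, and the path "doc#section" in the usage note, are hypothetical and chosen only for this example.

```python
# Wrapper pairs mentioned in the text: double quotes and angle brackets.
_WRAPPERS = (('"', '"'), ("<", ">"))


def strip_delimiters(token):
    """Return token without surrounding whitespace or one wrapper pair."""
    token = token.strip()
    for opener, closer in _WRAPPERS:
        if len(token) >= 2 and token.startswith(opener) and token.endswith(closer):
            return token[1:-1]
    return token


def split_fragment(uri_reference):
    """Split a URI reference at the first "#" into (uri, fragment)."""
    uri, _, fragment = uri_reference.partition("#")
    return uri, fragment
```

For instance, split_fragment(strip_delimiters('<http://test.com/doc#section>')) returns ("http://test.com/doc", "section"): the fragment travels inside the brackets and is separated only after the wrappers are removed.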
In some cases, extra whitespace (spaces, linebreaks, tabs, etc.) may
need to be added to break long URI across lines. The whitespace
should be ignored when extracting the URI.
No whitespace should be introduced after a hyphen ("-") character.
Because some typesetters and printers may (erroneously) introduce a
hyphen at the end of line when breaking a line, the interpreter of a
URI containing a line break immediately after a hyphen should ignore
all unescaped whitespace around the line break, and should be aware
that the hyphen may or may not actually be part of the URI.
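
One way to act on this advice is to treat a hyphen followed by a line break as ambiguous and produce both readings. The Python sketch below is an assumption about how an interpreter might do so, not a prescribed algorithm; the function name candidate_uris and the path in the usage note are hypothetical. It does not attempt to distinguish escaped from unescaped whitespace.

```python
import itertools
import re

# A hyphen directly followed by a line break (plus surrounding whitespace).
_HYPHEN_BREAK = re.compile(r"-[ \t]*\r?\n\s*")


def candidate_uris(broken):
    """Return the possible URIs for text that was broken across lines.

    Each hyphen at a line break may belong to the URI or may have been
    added by a typesetter, so both readings are returned for every such
    break. All other whitespace is simply dropped.
    """
    pieces = _HYPHEN_BREAK.split(broken.strip())
    candidates = []
    for joiners in itertools.product(("-", ""), repeat=len(pieces) - 1):
        joined = pieces[0]
        for joiner, piece in zip(joiners, pieces[1:]):
            joined += joiner + piece
        candidates.append(re.sub(r"\s+", "", joined))
    return candidates
```

Given "http://test.com/long-" followed by a line break and "path/name", the function returns both "http://test.com/long-path/name" and "http://test.com/longpath/name", leaving the caller to decide which one to try first.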
Using <> angle brackets around each URI is especially recommended as
a delimiting style for URI that contain whitespace.
The prefix "URL:" (with or without a trailing space) was recommended
as a way to help distinguish a URL from other bracketed
designators, although this is not common in practice.
For robustness, software that accepts user-typed URI should attempt
to recognize and strip both delimiters and embedded whitespace.
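
A minimal sketch of such clean-up, in Python, might look as follows. The function name clean_typed_uri is hypothetical, and matching the "URL:" prefix without regard to case is an assumption of this example rather than something the text requires.

```python
import re

# Optional "URL:" prefix, with or without trailing whitespace.
_URL_PREFIX = re.compile(r"^URL:\s*", re.IGNORECASE)


def clean_typed_uri(typed):
    """Strip wrappers, an optional "URL:" prefix, and embedded whitespace."""
    text = typed.strip()
    # Remove one pair of angle brackets or double quotes, if present.
    if len(text) >= 2 and (
        (text[0] == "<" and text[-1] == ">")
        or (text[0] == '"' and text[-1] == '"')
    ):
        text = text[1:-1]
    text = _URL_PREFIX.sub("", text.strip())
    # Whitespace inside the delimiters is not part of the URI.
    return re.sub(r"\s+", "", text)
```

With this sketch, the inputs '<URL:http://test.com/>', ' "http://test.com/" ', and a bracketed URI split across two lines all reduce to "http://test.com/".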
For example, the text: