query components are undefined, then it is a reference to the
current document and we are done. Otherwise, the reference URI's
query and fragment components are defined as found (or not found)
within the URI reference and not inherited from the base URI.
3) If the scheme component is defined, indicating that the reference
starts with a scheme name, then the reference is interpreted as an
absolute URI and we are done. Otherwise, the reference URI's
scheme is inherited from the base URI's scheme component.
Due to a loophole in prior specifications [RFC1630], some parsers
allow the scheme name to be present in a relative URI if it is the
same as the base URI scheme. Unfortunately, this can conflict
with the correct parsing of a non-hierarchical URI. For backwards
compatibility, an implementation may work around such references
by removing the scheme if it matches that of the base URI and the
scheme is known to always use the <hier_part> syntax. The parser
RFC 2396 URI Generic Syntax August 1998
can then continue with the steps below for the remainder of the
reference components. Validating parsers should mark such a
malformed relative reference as an error.
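
The backwards-compatibility workaround in step 3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `strip_matching_scheme` and the set `HIERARCHICAL_SCHEMES` are hypothetical, and the choice of which schemes count as always hierarchical is left to the implementation. A validating parser would report an error instead of calling it.

```python
import re

# Scheme syntax: a letter followed by letters, digits, "+", "-" or ".".
_SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")

# Assumption: schemes this implementation knows to always use <hier_part>.
HIERARCHICAL_SCHEMES = frozenset({"http", "ftp"})


def strip_matching_scheme(reference, base_scheme, hierarchical=HIERARCHICAL_SCHEMES):
    """Drop a redundant scheme prefix from a reference (lenient parsing only)."""
    match = _SCHEME_RE.match(reference)
    if match is None:
        return reference  # no scheme present, nothing to work around
    scheme = match.group(1).lower()  # compare schemes case-insensitively
    if scheme == base_scheme.lower() and scheme in hierarchical:
        return reference[match.end():]  # continue resolving the remainder
    return reference  # different or non-hierarchical scheme: leave untouched
```

With a base scheme of "http", the reference "http:g" becomes "g" and is then resolved as a relative path, while "mailto:user" is left unchanged.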
4) If the authority component is defined, then the reference is a
network-path and we skip to step 7. Otherwise, the reference
URI's authority is inherited from the base URI's authority
component, which will also be undefined if the URI scheme does not
use an authority component.
5) If the path component begins with a slash character ("/"), then
the reference is an absolute-path and we skip to step 7.
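
The inheritance decisions in steps 3 through 5 can be summarized in a short Python sketch. The `Parts` tuple and the `inherit` function are hypothetical names introduced here for illustration; an undefined component is represented by `None`, and the reference is assumed to have been parsed into components already.

```python
from typing import NamedTuple, Optional


class Parts(NamedTuple):
    scheme: Optional[str]     # None means the component is undefined
    authority: Optional[str]
    path: str


def inherit(base: Parts, ref: Parts):
    """Return (scheme, authority, kind) after applying steps 3 to 5."""
    if ref.scheme is not None:
        # Step 3: the reference is an absolute URI; nothing is inherited.
        return ref.scheme, ref.authority, "absolute-URI"
    if ref.authority is not None:
        # Step 4: network-path; only the scheme comes from the base.
        return base.scheme, ref.authority, "network-path"
    if ref.path.startswith("/"):
        # Step 5: absolute-path; scheme and authority come from the base.
        return base.scheme, base.authority, "absolute-path"
    # Otherwise the path still has to be merged with the base path (step 6).
    return base.scheme, base.authority, "relative-path"
```

Only the last case requires the path merge of step 6; the other three go directly to the recombination in step 7.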
6) If this step is reached, then we are resolving a relative-path
reference. The relative path needs to be merged with the base
URI's path. Although there are many ways to do this, we will
describe a simple method using a separate string buffer.
a) All but the last segment of the base URI's path component is
copied to the buffer. In other words, any characters after the
last (right-most) slash character, if any, are excluded.
b) The reference's path component is appended to the buffer
string.
c) All occurrences of "./", where "." is a complete path segment,
are removed from the buffer string.
d) If the buffer string ends with "." as a complete path segment,
that "." is removed.
e) All occurrences of "<segment>/../", where <segment> is a
complete path segment not equal to "..", are removed from the
buffer string. Removal of these path segments is performed
iteratively, removing the leftmost matching pattern on each
iteration, until no matching pattern remains.
f) If the buffer string ends with "<segment>/..", where <segment>
is a complete path segment not equal to "..", that
"<segment>/.." is removed.
g) If the resulting buffer string still begins with one or more
complete path segments of "..", then the reference is
considered to be in error. Implementations may handle this
error by retaining these components in the resolved path (i.e.,
treating them as part of the final URI), by removing them from
the resolved path (i.e., discarding relative levels above the
root), or by avoiding traversal of the reference.
h) The remaining buffer string is the reference URI's new path
component.
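
The path merge of step 6 can be sketched in Python as shown below. This is a minimal illustration, not a reference implementation: the name `merge_paths` is hypothetical, the buffer is handled as a list of segments rather than a raw string, and for step g it assumes the first of the permitted behaviors (leading ".." segments are retained in the result).

```python
def merge_paths(base_path: str, ref_path: str) -> str:
    """Merge a relative-path reference with a base path (steps 6a to 6h)."""
    # a) + b) Base path up to and including its last slash, then the reference.
    buffer = base_path[: base_path.rfind("/") + 1] + ref_path

    # Work on segments; the empty string before a leading slash is not a segment.
    rooted = buffer.startswith("/")
    segments = buffer.split("/")
    if rooted:
        segments = segments[1:]

    # c) Drop every "." segment that is followed by a slash (not the last one).
    last = len(segments) - 1
    segments = [s for i, s in enumerate(segments) if s != "." or i == last]

    # d) A trailing "." segment is removed, keeping the slash before it.
    if segments[-1] == ".":
        segments[-1] = ""

    # e) Remove "<segment>/../" pairs, leftmost first, until none remain.
    i = 0
    while i + 2 < len(segments):  # the ".." must not be the final segment
        if segments[i] != ".." and segments[i + 1] == "..":
            del segments[i : i + 2]
            i = 0  # rescan from the left after each removal
        else:
            i += 1

    # f) A trailing "<segment>/.." is removed, keeping the slash before it.
    if len(segments) >= 2 and segments[-1] == ".." and segments[-2] != "..":
        segments[-2:] = [""]

    # g) Assumption: any leading ".." segments are retained, not rejected.
    # h) Reassemble the buffer into the new path component.
    return ("/" if rooted else "") + "/".join(segments)
```

For the base path "/b/c/d;p", `merge_paths` maps "../g" to "/b/g" and "../../../g" to "/../g", the latter showing the retained ".." segment from step g.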
7) The resulting URI components, including any inherited from the
base URI, are recombined to give the absolute form of the URI
reference. Using pseudocode, this would be
    result = ""

    if scheme is defined then
        append scheme to result
        append ":" to result

    if authority is defined then
        append "//" to result
        append authority to result

    append path to result