unreserved = alpha | digit | safe | extra
escape = "%" hex hex
hex = digit | "A" | "B" | "C" | "D" | "E" | "F" |
      "a" | "b" | "c" | "d" | "e" | "f"
alpha = lowalpha | hialpha
lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" |
           "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" |
           "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
hialpha = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
          "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
          "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
        "8" | "9"
safe = "$" | "-" | "_" | "." | "+"
extra = "!" | "*" | "'" | "(" | ")" | ","
national = "{" | "}" | "|" | "\" | "^" | "~" | "[" | "]" | "`"
reserved = ";" | "/" | "?" | ":" | "@" | "&" | "="
punctuation = "<" | ">" | "#" | "%" | <">
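As an informal illustration only, the character classes above can be transcribed into Python sets. The names used here (UNRESERVED, is_escape, classify, and so on) are hypothetical and are not defined by this document; the sketch assumes single-character input to classify and makes no claim to be a complete URL validator.

```python
import string

# Character classes transcribed from the BNF above.
LOWALPHA = set(string.ascii_lowercase)
HIALPHA = set(string.ascii_uppercase)
ALPHA = LOWALPHA | HIALPHA
DIGIT = set(string.digits)
SAFE = set("$-_.+")
EXTRA = set("!*'(),")
NATIONAL = set("{}|\\^~[]`")
RESERVED = set(";/?:@&=")
PUNCTUATION = set('<>#%"')
HEX = DIGIT | set("ABCDEFabcdef")
UNRESERVED = ALPHA | DIGIT | SAFE | EXTRA


def is_escape(text: str) -> bool:
    """Return True if text is exactly one escape: "%" hex hex."""
    return (
        len(text) == 3
        and text[0] == "%"
        and text[1] in HEX
        and text[2] in HEX
    )


def classify(ch: str) -> str:
    """Name the class of a single character (hypothetical helper)."""
    for name, members in (
        ("unreserved", UNRESERVED),
        ("reserved", RESERVED),
        ("national", NATIONAL),
        ("punctuation", PUNCTUATION),
    ):
        if ch in members:
            return name
    return "other"
```

For example, classify("/") yields "reserved" and is_escape("%2F") is true, whereas is_escape("%G0") is false because "G" is not a hex digit.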
RFC 1808 Relative Uniform Resource Locators June 1995
2.3. Specific Schemes and their Syntactic Categories
Each URL scheme has its own rules regarding the presence or absence
of the syntactic components described in Sections 2.1 and 2.2. In
addition, some schemes are never appropriate for use with relative
URLs. However, since relative URLs will only be used within contexts
in which they are useful, these scheme-specific differences can be
ignored by the resolution process.
Within this section, we include as examples only those schemes that
have a defined URL syntax in RFC 1738 [2]. The following schemes are
never used with relative URLs:
   mailto     Electronic Mail
   news       USENET news
   telnet     TELNET Protocol for Interactive Sessions
Some URL schemes allow the use of reserved characters for purposes
outside the generic-RL syntax given above. However, such use is
rare. Relative URLs can be used with these schemes whenever the
applicable base URL follows the generic-RL syntax.
   gopher     Gopher and Gopher+ Protocols
   prospero   Prospero Directory Service
   wais       Wide Area Information Servers Protocol
Users of gopher URLs should note that gopher-type information is
almost always included at the beginning of what would be the
generic-RL path. If present, this type information prevents
relative-path references to documents with differing gopher-types.
Finally, the following schemes can always be parsed using the
generic-RL syntax. This does not necessarily imply that relative
URLs will be useful with these schemes -- that decision is left to
the system implementation and the author of the base document.
   file       Host-specific Files
   ftp        File Transfer Protocol
   http       Hypertext Transfer Protocol
   nntp       USENET news using NNTP access
NOTE: Section 5 of RFC 1738 specifies that the question-mark
character ("?") is allowed in an ftp or file path segment.
However, this is not true in practice and is believed to be an
error in the RFC. Similarly, RFC 1738 allows the reserved
character semicolon (";") within an http path segment, but does
not define its semantics; the correct semantics are as defined
by this document for <params>.
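The three groupings of this section can be summarized as a small lookup, sketched here in Python. The set names, the function relative_support, and its return labels are hypothetical and chosen for illustration; lowercasing the scheme name is an assumption based on the case-insensitive treatment of scheme names recommended in RFC 1738 [2].

```python
# Scheme groupings transcribed from Section 2.3.
NEVER_RELATIVE = {"mailto", "news", "telnet"}
CONDITIONALLY_GENERIC = {"gopher", "prospero", "wais"}
ALWAYS_GENERIC = {"file", "ftp", "http", "nntp"}


def relative_support(scheme: str) -> str:
    """Report how a scheme relates to relative URLs.

    Returns one of (hypothetical labels):
      "never"       - never used with relative URLs
      "conditional" - usable when the base URL follows the generic-RL syntax
      "parsable"    - always parsable using the generic-RL syntax
      "unknown"     - not listed in this section
    """
    name = scheme.lower()  # assumption: scheme names compare case-insensitively
    if name in NEVER_RELATIVE:
        return "never"
    if name in CONDITIONALLY_GENERIC:
        return "conditional"
    if name in ALWAYS_GENERIC:
        return "parsable"
    return "unknown"
```

Note that "parsable" only means the generic-RL syntax applies; whether relative URLs are actually useful for such a scheme is still left to the implementation and the author of the base document.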
We recommend that new schemes be designed to be parsable via the
generic-RL syntax if they are intended to be used with relative URLs.
A description of the allowed relative forms should be included when a
new scheme is registered, as per Section 4 of RFC 1738 [2].
2.4. Parsing a URL
An accepted method for parsing URLs is useful to clarify the
generic-RL syntax of Section 2.2 and to describe the algorithm for
resolving relative URLs presented in Section 4. This section
describes the parsing rules for breaking down a URL (relative or
absolute) into the component parts described in Section 2.1. The
rules assume that the URL has already been separated from any
surrounding text and copied to a "parse string". The rules are