and at most m occurrences of the rule. n and m are optional
decimal values with default values of 0 and infinity respectively.
[rule]
An element enclosed in square brackets ('[' and ']') is optional,
and is equivalent to '*1 rule'.
N rule
A rule preceded by a decimal number represents exactly N
occurrences of the rule. It is equivalent to 'N*N rule'.
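The three repetition forms above map closely onto regular-expression
quantifiers. The following is a minimal sketch of that mapping in
Python, not part of this specification; the helper names (repeat,
optional, exactly) and the use of regex fragments to stand in for
rules are assumptions made for illustration.

```python
import re

def repeat(rule_pattern, n=0, m=None):
    """Regex fragment for 'n*m rule'; m=None stands for infinity."""
    upper = "" if m is None else str(m)
    return f"(?:{rule_pattern}){{{n},{upper}}}"

def optional(rule_pattern):
    # '[rule]' is equivalent to '*1 rule'
    return repeat(rule_pattern, 0, 1)

def exactly(rule_pattern, count):
    # 'N rule' is equivalent to 'N*N rule'
    return repeat(rule_pattern, count, count)

# Hypothetical stand-in for the 'digit' rule.
DIGIT = "[0-9]"

two_digits = re.compile(exactly(DIGIT, 2))
print(bool(two_digits.fullmatch("42")))   # exactly two digits
print(bool(two_digits.fullmatch("4")))    # too few occurrences
```

With both bounds omitted, repeat(DIGIT) corresponds to '*digit' and
accepts any number of digits, including none.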
2.2. Basic Rules
This specification uses a BNF-like grammar defined in terms of
characters. Unlike many specifications which define the bytes
allowed by a protocol, here each literal in the grammar corresponds
to the character it represents. How these characters are represented
in terms of bits and bytes within a system is either system-defined
or specified in the particular context. The single exception is the
rule 'OCTET', defined below.
The following rules are used throughout this specification to
describe basic parsing constructs.
alpha = lowalpha | hialpha
lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
"i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
"q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
"y" | "z"
hialpha = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
"I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
"Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
"Y" | "Z"
RFC 3875 CGI Version 1.1 October 2004
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
"8" | "9"
alphanum = alpha | digit
OCTET = <any 8-bit byte>
CHAR = alpha | digit | separator | "!" | "#" | "$" |
"%" | "&" | "'" | "*" | "+" | "-" | "." | "`" |
"^" | "_" | "{" | "|" | "}" | "~" | CTL
CTL = <any control character>
SP = <space character>
HT = <horizontal tab character>
NL = <newline>
LWSP = SP | HT | NL
separator = "(" | ")" | "<" | ">" | "@" | "," | ";" | ":" |
"\" | <"> | "/" | "[" | "]" | "?" | "=" | "{" |
"}" | SP | HT
token = 1*<any CHAR except CTLs or separators>
quoted-string = <"> *qdtext <">
qdtext = <any CHAR except <"> and CTLs but including LWSP>
TEXT = <any printable character>
Note that newline (NL) need not be a single control character, but
can be a sequence of control characters. A system MAY define TEXT to
be a larger set of characters than <any CHAR excluding CTLs but
including LWSP>.
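As an illustration of how the 'token' and 'quoted-string' rules can
be checked, the following Python sketch tests a string against each
rule. It is not a normative parser. The function names are
hypothetical, and it assumes that NL is either "\n" or "\r\n" and that
CTLs are the US-ASCII control characters; a system may define both
differently.

```python
# Characters allowed in a token: every CHAR that is neither a CTL nor
# a separator. "{" and "}" appear in CHAR but are also separators, so
# they are excluded here.
TOKEN_CHARS = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    "!#$%&'*+-.`^_|~"
)

def is_token(s):
    """True if s matches 'token': one or more token characters."""
    return len(s) > 0 and all(ch in TOKEN_CHARS for ch in s)

def is_quoted_string(s):
    """True if s matches 'quoted-string' under the stated assumptions."""
    if len(s) < 2 or s[0] != '"' or s[-1] != '"':
        return False
    # Assumption: treat "\r\n" as a single NL.
    inner = s[1:-1].replace("\r\n", "\n")
    for ch in inner:
        if ch == '"':
            return False          # the grammar defines no escape for <">
        if ch in " \t\n":
            continue              # LWSP is permitted inside qdtext
        if not (0x21 <= ord(ch) <= 0x7E):
            return False          # CTL or a character outside CHAR
    return True

print(is_token("content-type"))
print(is_quoted_string('"hello world"'))
```

Note that the grammar above provides no way to place a double quote
inside a quoted-string, so the sketch rejects any embedded <">.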
2.3. URL Encoding
Some variables and constructs used here are described as being
'URL-encoded'. This encoding is described in section 2 of RFC 2396
[2]. In a URL-encoded string an escape sequence consists of a
percent character ("%") followed by two hexadecimal digits, where the
two hexadecimal digits form an octet. An escape sequence represents
the graphic character that has the octet as its code within the
US-ASCII [9] coded character set, if it exists. Currently there is
no provision within the URI syntax to identify which character set
non-ASCII codes represent, so CGI handles this issue on an ad-hoc
basis.
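The escape-sequence rule can be sketched in Python as follows. This
is an illustrative decoder only; the function name is hypothetical,
and the input is assumed to contain only US-ASCII characters. Because
the text above leaves the meaning of non-ASCII octets open, the sketch
returns raw octets and leaves their interpretation to the caller.

```python
import re

# A "%" followed by exactly two hexadecimal digits.
_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")

def url_decode_octets(s):
    """Replace each %XX escape with the octet it denotes; return bytes."""
    out = bytearray()
    i = 0
    while i < len(s):
        m = _ESCAPE.match(s, i)
        if m:
            # The two hexadecimal digits form one octet.
            out.append(int(m.group(1), 16))
            i = m.end()
        else:
            # Assumption: input is US-ASCII; other input raises
            # UnicodeEncodeError. A "%" not followed by two hex
            # digits is copied through unchanged.
            out.extend(s[i].encode("ascii"))
            i += 1
    return bytes(out)

print(url_decode_octets("a%20b"))
```

An octet above 0x7F, such as the one produced by "%E9", has no
US-ASCII character, so the caller must decide how to interpret it.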
Note that some unsafe (reserved) characters may have different
semantics when encoded. The definition of which characters are
unsafe depends on the context; see section 2 of RFC 2396 [2], updated
by RFC 2732 [7], for an authoritative treatment. These reserved
characters are generally used to provide syntactic structure to the
character string, for example as field separators. In all cases, the
string is first processed with regard to any reserved characters
present, and then the resulting data can be URL-decoded by replacing
"%" escape sequences by their character values.
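The ordering described above (process the reserved characters first,
then decode) can be sketched in Python as follows. The choice of "&"
and "=" as field separators is an assumption made for illustration,
as is the function name; unquote is the standard-library decoder from
urllib.parse.

```python
from urllib.parse import unquote

def parse_pairs(encoded):
    """Split on the reserved separators first, then URL-decode each part."""
    pairs = []
    for field in encoded.split("&"):
        if not field:
            continue
        name, _, value = field.partition("=")
        pairs.append((unquote(name), unquote(value)))
    return pairs

encoded = "q=a%26b&r=1%3D2"

# Correct order: two fields, with "&" and "=" recovered as data.
print(parse_pairs(encoded))

# Wrong order: decoding first turns the escaped "&" into a separator.
print(unquote(encoded).split("&"))
```

Decoding before splitting loses the distinction between a separator
and an encoded data character, which is why the string is split first.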
To encode a character string, all reserved and forbidden characters
are replaced by the corresponding "%" escape sequences. The string