3.3. LEXICAL TOKENS
The following rules are used to define an underlying lexical
analyzer, which feeds tokens to higher level parsers. See the
ANSI references, in the Bibliography.
; ( Octal, Decimal.)
CHAR = <any ASCII character> ; ( 0-177, 0.-127.)
ALPHA = <any ASCII alphabetic character>
; (101-132, 65.- 90.)
; (141-172, 97.-122.)
DIGIT = <any ASCII decimal digit> ; ( 60- 71, 48.- 57.)
CTL = <any ASCII control ; ( 0- 37, 0.- 31.)
character and DEL> ; ( 177, 127.)
CR = <ASCII CR, carriage return> ; ( 15, 13.)
LF = <ASCII LF, linefeed> ; ( 12, 10.)
SPACE = <ASCII SP, space> ; ( 40, 32.)
HTAB = <ASCII HT, horizontal-tab> ; ( 11, 9.)
<"> = <ASCII quote mark> ; ( 42, 34.)
CRLF = CR LF
LWSP-char = SPACE / HTAB ; semantics = SPACE
linear-white-space = 1*([CRLF] LWSP-char) ; semantics = SPACE
; CRLF => folding
specials = "(" / ")" / "<" / ">" / "@" ; Must be in quoted-
/ "," / ";" / ":" / "\" / <"> ; string, to use
/ "." / "[" / "]" ; within a word.
delimiters = specials / linear-white-space / comment
text = <any CHAR, including bare ; => atoms, specials,
CR & bare LF, but NOT ; comments and
including CRLF> ; quoted-strings are
; NOT recognized.
atom = 1*<any CHAR except specials, SPACE and CTLs>
quoted-string = <"> *(qtext/quoted-pair) <">; Regular qtext or
; quoted chars.
qtext = <any CHAR excepting <">, ; => may be folded
"\" & CR, and including
linear-white-space>
domain-literal = "[" *(dtext / quoted-pair) "]"
August 13, 1982 - 10 - RFC #822
Standard for ARPA Internet Text Messages
dtext = <any CHAR excluding "[", ; => may be folded
"]", "\" & CR, & including
linear-white-space>
comment = "(" *(ctext / quoted-pair / comment) ")"
ctext = <any CHAR excluding "(", ; => may be folded
")", "\" & CR, & including
linear-white-space>
quoted-pair = "\" CHAR ; may quote any char
phrase = 1*word ; Sequence of words
word = atom / quoted-string
3.4. CLARIFICATIONS
3.4.1. QUOTING
Some characters are reserved for special interpretation, such
as delimiting lexical tokens. To permit use of these charac-
ters as uninterpreted data, a quoting mechanism is provided.
To quote a character, precede it with a backslash ("\").
This mechanism is not fully general. Characters may be quoted
only within a subset of the lexical constructs. In particu-
lar, quoting is limited to use within:
- quoted-string
- domain-literal
- comment
Within these constructs, quoting is REQUIRED for CR and "\"
and for the character(s) that delimit the token (e.g., "(" and
")" for a comment). However, quoting is PERMITTED for any
character.
Note: In particular, quoting is NOT permitted within atoms.
=8= |