abc &lt; def => "abc ","&lt;"," def"
abc &#60; def => "abc ","&#60;"," def"
An ampersand is only recognized as markup when it is followed by a
letter or a `#' and a digit:
abc & lt def => "abc & lt def"
abc &# 60 def => "abc &# 60 def"
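The recognition rule above can be sketched as a regular expression. This is only an illustration in Python, not part of this specification; the helper name `has_reference_open` is hypothetical, and the pattern covers only the rule as stated here (a letter, or a `#' and a digit, after the ampersand).

```python
import re

# An ampersand is treated as markup only when it is immediately followed
# by a letter, or by '#' and a digit (hypothetical helper, illustration only).
_REFERENCE_OPEN = re.compile(r"&(?:[A-Za-z]|#[0-9])")

def has_reference_open(text):
    """Return True if text contains an ampersand recognized as markup."""
    return _REFERENCE_OPEN.search(text) is not None
```

Under this sketch, an ampersand followed by a space, or a `&#' followed by a space, is left as ordinary data.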
A useful technique for translating plain text to HTML is to replace
each '<', '&', and '>' by an entity reference or numeric character
reference as follows:
          ENTITY     NUMERIC
CHARACTER REFERENCE  CHAR REF  CHARACTER DESCRIPTION
--------- ---------- --------- ---------------------
&         &amp;      &#38;     Ampersand
<         &lt;       &#60;     Less than
>         &gt;       &#62;     Greater than
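The replacement technique described above might be sketched in Python as follows. The function name `escape_text` is an assumption made for illustration, and the sketch uses the entity references rather than the numeric character references.

```python
def escape_text(text):
    """Replace '&', '<', and '>' in plain text with entity references."""
    # '&' has to be replaced first; otherwise the ampersands introduced
    # by the '<' and '>' replacements would themselves be escaped.
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;"))
```

The order of the replacements matters: handling `&' last would turn an already produced `&lt;' into `&amp;lt;'. Python's standard library offers `html.escape(text, quote=False)`, which performs the same three replacements.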
NOTE - There are SGML mechanisms, CDATA and RCDATA
declared content, that allow most `<', `>', and `&'
characters to be entered without the use of entity
references. Because these mechanisms tend to be used and
implemented inconsistently, and because they conflict
RFC 1866 Hypertext Markup Language - 2.0 November 1995
with techniques for reducing HTML to 7 bit ASCII for
transport, they are deprecated in this version of HTML.
See 5.5.2.1, "Example and Listing: XMP, LISTING".
3.2.2. Tags
Tags delimit elements such as headings, paragraphs, lists, character
highlighting, and links. Most HTML elements are identified in a
document as a start-tag, which gives the element name and attributes,
followed by the content, followed by the end tag. Start-tags are
delimited by `<' and `>'; end tags are delimited by `</' and `>'. An
example is:
<H1>This is a Heading</H1>
Some elements only have a start-tag without an end-tag. For example,
to create a line break, use the `<BR>' tag. Additionally, the end
tags of some other elements, such as Paragraph (`</P>'), List Item
(`</LI>'), Definition Term (`</DT>'), and Definition Description
(`</DD>') elements, may be omitted.
The content of an element is a sequence of data character strings and
nested elements. Some elements, such as anchors, cannot be nested.
Anchors and character highlighting may be put inside other
constructs. See the HTML DTD, 9.1, "HTML DTD" for full details.
NOTE - The SGML declaration for HTML specifies SHORTTAG YES, which
means that there are other valid syntaxes for tags, such as NET
tags, `<EM/.../'; empty start tags, `<>'; and empty end-tags,
`</>'. Until support for these idioms is widely deployed, their
use is strongly discouraged.
3.2.3. Names
A name consists of a letter followed by letters, digits, periods, or
hyphens. The length of a name is limited to 72 characters by the
`NAMELEN' parameter in the SGML declaration for HTML, 9.5, "SGML
Declaration for HTML". Element and attribute names are not case
sensitive, but entity names are. For example, `<BLOCKQUOTE>',
`<BlockQuote>', and `<blockquote>' are equivalent, whereas `&amp;' is
different from `&AMP;'.
In a start-tag, the element name must immediately follow the tag open
delimiter `<'.
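The name rules in this section can be expressed as a short check. The following Python sketch is illustrative only; `is_valid_name` and `same_element_name` are hypothetical helpers, and the 72-character limit is taken from the `NAMELEN' value quoted above.

```python
import re

NAMELEN = 72  # name length limit quoted in the text above

# A letter followed by any number of letters, digits, periods, or hyphens.
_NAME = re.compile(r"[A-Za-z][A-Za-z0-9.-]*")

def is_valid_name(name):
    """Return True if name follows the name syntax and length limit."""
    return len(name) <= NAMELEN and _NAME.fullmatch(name) is not None

def same_element_name(a, b):
    """Element and attribute names are compared without regard to case."""
    return a.upper() == b.upper()
```

Entity names, by contrast, would be compared with ordinary case-sensitive string equality, so `amp' and `AMP' are distinct.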
3.2.4. Attributes
In a start-tag, white space and attributes are allowed between the
element name and the closing delimiter. An attribute specification
typically consists of an attribute name, an equal sign, and a value,
though some attribute specifications may be just a name token. White
space is allowed around the equal sign.
The value of the attribute may be either: