package HTML::Element;
=head1 NAME
HTML::Element - Class for objects that represent HTML elements
=head1 VERSION
Version 3.23
=cut
use vars qw( $VERSION );
$VERSION = '3.23';
=head1 SYNOPSIS
  use HTML::Element;
  $a = HTML::Element->new('a', href => 'http://www.perl.com/');
  $a->push_content("The Perl Homepage");
  $tag = $a->tag;
  print "$tag starts out as:", $a->starttag, "\n";
  print "$tag ends as:", $a->endtag, "\n";
  print "$tag\'s href attribute is: ", $a->attr('href'), "\n";
  $links_r = $a->extract_links();
  print "Hey, I found ", scalar(@$links_r), " links.\n";
  print "And that, as HTML, is: ", $a->as_HTML, "\n";
  $a = $a->delete;
=head1 DESCRIPTION
(This class is part of the L<HTML::Tree|HTML::Tree> dist.)
Objects of the HTML::Element class can be used to represent elements
of HTML document trees. These objects have attributes, notably attributes that
designate each element's parent and content. The content is an array
of text segments and other HTML::Element objects. A tree with HTML::Element
objects as nodes can represent the syntax tree for an HTML document.
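As a minimal sketch of the parent and content relationships described above, the following builds a small two-level tree by hand. The tag names, the C<class> attribute, and the variable names are illustrative assumptions; only C<new>, C<push_content>, C<content_list>, C<tag>, C<parent>, and C<delete> are used.

```perl
use strict;
use warnings;
use HTML::Element;

# Build <p class="note">I like <em>potatoes</em>!</p> by hand.
my $p  = HTML::Element->new('p', class => 'note');
my $em = HTML::Element->new('em');
$em->push_content('potatoes');
$p->push_content('I like ', $em, '!');

# The content list mixes plain strings (text segments) and
# HTML::Element objects (child elements).
my @kinds = map { ref $_ ? $_->tag : 'text' } $p->content_list;
print "content of p: @kinds\n";

# Each child element records which element it sits under.
my $parent_tag = $em->parent->tag;
print "parent of em: $parent_tag\n";

# Explicitly destroy the tree when done, as in the SYNOPSIS.
$p->delete;
```

Text segments are stored as ordinary Perl strings, so C<ref> is enough to tell them apart from child elements.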
=head1 HOW WE REPRESENT TREES
Consider this HTML document:
  <html lang='en-US'>
    <head>
      <title>Stuff</title>
      <meta name='author' content='Jojo'>
    </head>
    <body>
     <h1>I like potatoes!</h1>
    </body>
  </html>
Building a syntax tree out of it makes a tree-structure in memory
that could be diagrammed as:
                       html (lang='en-US')
                        / \
                      /     \
                    /         \
                  head        body
                 /\               \
               /    \               \
             /        \               \
           title     meta              h1
            |       (name='author',     |
         "Stuff"    content='Jojo')    "I like potatoes"
This is the traditional way to diagram a tree, with the "root" at the
top, and it's this kind of diagram that people have in mind when they
say, for example, that "the meta element is under the head element
instead of under the body element". (The same is also said with
"inside" instead of "under" -- the use of "inside" makes more sense
when you're looking at the HTML source.)
Another way to represent the above tree is with indenting:
  html (attributes: lang='en-US')
    head
      title
        "Stuff"
      meta (attributes: name='author' content='Jojo')
    body
      h1
        "I like potatoes"
Incidentally, diagramming with indenting works much better for very
large trees, and is easier for a program to generate. The C<< $tree->dump >>
method uses indentation just that way.
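To see the indented form for the example document, one option is to build the tree with C<new_from_lol> and then call C<dump>. This is a sketch under assumptions: the variable names are illustrative, and the dump is sent to an in-memory filehandle (rather than the default STDOUT) only so that the text can be inspected afterwards.

```perl
use strict;
use warnings;
use HTML::Element;

# Build the example document from a nested list-of-lists:
# each array ref is [tag, optional {attributes}, children...].
my $tree = HTML::Element->new_from_lol(
    [ 'html', { lang => 'en-US' },
        [ 'head',
            [ 'title', 'Stuff' ],
            [ 'meta', { name => 'author', content => 'Jojo' } ],
        ],
        [ 'body',
            [ 'h1', 'I like potatoes!' ],
        ],
    ]
);

# dump writes one node per line, indented by depth. With no
# argument it prints to STDOUT; here it is captured in a string.
my $out = '';
open my $fh, '>', \$out or die "cannot open in-memory handle: $!";
$tree->dump($fh);
close $fh;
print $out;

$tree->delete;
```

The exact decoration on each line can vary between versions of the module, but the nesting depth always shows as indentation.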
However you diagram the tree, it's stored the same in memory -- it's a
network of objects, each of which has attributes like so:
  element #1:  _tag: 'html'
               _parent: none
               _content: [element #2, element #5]
               lang: 'en-US'