Mojo::DOM(3) User Contributed Perl Documentation Mojo::DOM(3)
NAME
Mojo::DOM - Minimalistic HTML5/XML DOM Parser With CSS3 Selectors
SYNOPSIS
use Mojo::DOM;
# Parse
my $dom = Mojo::DOM->new('<div><p id="a">A</p><p id="b">B</p></div>');
# Find
my $b = $dom->at('#b');
print $b->text;
# Walk
print $dom->div->p->[0]->text;
print $dom->div->p->[1]->{id};
# Iterate
$dom->find('p[id]')->each(sub { print shift->{id} });
# Loop
for my $e ($dom->find('p[id]')->each) {
  print $e->text;
}
# Modify
$dom->div->p->[1]->append('<p id="c">C</p>');
# Render
print $dom;
DESCRIPTION
Mojo::DOM is a minimalistic and relaxed HTML5/XML DOM parser with CSS3
selector support. It will even try to interpret broken XML, so you
should not use it for validation.
CASE SENSITIVITY
Mojo::DOM defaults to HTML5 semantics, which means all tags and
attributes are lowercased and selectors need to be lowercase as well.
my $dom = Mojo::DOM->new('<P ID="greeting">Hi!</P>');
print $dom->at('p')->text;
print $dom->p->{id};
If XML processing instructions are found, the parser will automatically
switch into XML mode and everything becomes case sensitive.
my $dom = Mojo::DOM->new('<?xml version="1.0"?><P ID="greeting">Hi!</P>');
print $dom->at('P')->text;
print $dom->P->{ID};
Automatic XML detection can also be deactivated with the "xml" method.
# XML semantics
$dom->xml(1);
# HTML5 semantics
$dom->xml(0);
METHODS
Mojo::DOM inherits all methods from Mojo::Base and implements the
following new ones.
"new"
my $dom = Mojo::DOM->new;
my $dom = Mojo::DOM->new(xml => 1);
my $dom = Mojo::DOM->new('<foo bar="baz">test</foo>');
my $dom = Mojo::DOM->new('<foo bar="baz">test</foo>', xml => 1);
Construct a new Mojo::DOM object.
"all_text"
my $trimmed = $dom->all_text;
my $untrimmed = $dom->all_text(0);
Extract all text content from the DOM structure. Smart whitespace trimming
is activated by default. Note that the trim argument of this method is
EXPERIMENTAL and might change without warning!
"append"
$dom = $dom->append('<p>Hi!</p>');
Append to element.
# "<div><h1>A</h1><h2>B</h2></div>"
$dom->parse('<div><h1>A</h1></div>')->at('h1')->append('<h2>B</h2>');
"append_content"
$dom = $dom->append_content('<p>Hi!</p>');
Append to element content.
# "<div><h1>AB</h1></div>"
$dom->parse('<div><h1>A</h1></div>')->at('h1')->append_content('B');
"at"
my $result = $dom->at('html title');
Find a single element with CSS3 selectors. All selectors from
Mojo::DOM::CSS are supported.
"attrs"
my $attrs = $dom->attrs;
my $foo = $dom->attrs('foo');
$dom = $dom->attrs({foo => 'bar'});
$dom = $dom->attrs(foo => 'bar');
Element attributes.
# Direct hash access to attributes is also available
print $dom->{foo};
print $dom->div->{id};
"charset"
my $charset = $dom->charset;
$dom = $dom->charset('UTF-8');
Charset used for decoding and encoding HTML5/XML.
"children"
my $collection = $dom->children;
my $collection = $dom->children('div');
Return a Mojo::Collection object containing the children of this
element, similar to "find".
# Child elements are also automatically available as object methods
print $dom->div->text;
print $dom->div->[23]->text;
$dom->div->each(sub { print $_->text });
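The difference between "children" and "find" can be sketched as follows: "children" looks only at direct child elements, while "find" also searches deeper descendants. The markup and variable names here are made up for illustration.

```perl
use strict;
use warnings;
use Mojo::DOM;

# Illustrative markup: one <i> is nested inside a <span>, one is a direct child
my $dom = Mojo::DOM->new(
  '<div><span><i id="b">B</i></span><i id="c">C</i></div>');
my $div = $dom->at('div');

# Direct children of <div> named "i" only
my @direct = map { $_->{id} } $div->children('i')->each;

# Every <i> anywhere below <div>
my @all = map { $_->{id} } $div->find('i')->each;

print "children: @direct\n";
print "find:     @all\n";
```

Calling "each" in list context, as the SYNOPSIS loop does, turns each collection into a plain Perl list.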
"content_xml"
my $xml = $dom->content_xml;
Render content of this element to XML.
"find"
my $collection = $dom->find('html title');
Find elements with CSS3 selectors and return a Mojo::Collection object.
All selectors from Mojo::DOM::CSS are supported.
# Find a specific element and extract information
my $id = $dom->find('div')->[23]->{id};
# Extract information from multiple elements
my @headers = $dom->find('h1, h2, h3')->map(sub { shift->text })->each;
"namespace"
my $namespace = $dom->namespace;
Find element namespace.
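A minimal sketch of a namespace lookup, assuming that an element without its own "xmlns" attribute reports the namespace declared on an enclosing element. The SVG markup is only an illustration.

```perl
use strict;
use warnings;
use Mojo::DOM;

# Illustrative markup: the namespace is declared on the outer element only
my $dom = Mojo::DOM->new(
  '<svg xmlns="http://www.w3.org/2000/svg"><circle></circle></svg>');

# Assumption: the lookup walks up to the nearest "xmlns" declaration
my $ns = $dom->at('circle')->namespace;
print "$ns\n";
```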
"parent"
my $parent = $dom->parent;
Parent of element.
"parse"
$dom = $dom->parse('<foo bar="baz">test</foo>');
Parse HTML5/XML document with Mojo::DOM::HTML.
"prepend"
$dom = $dom->prepend('<p>Hi!</p>');
Prepend to element.
# "<div><h1>A</h1><h2>B</h2></div>"
$dom->parse('<div><h2>B</h2></div>')->at('h2')->prepend('<h1>A</h1>');
"prepend_content"
$dom = $dom->prepend_content('<p>Hi!</p>');
Prepend to element content.
# "<div><h2>AB</h2></div>"
$dom->parse('<div><h2>B</h2></div>')->at('h2')->prepend_content('A');
"replace"
$dom = $dom->replace('<div>test</div>');
Replace this element.
# "<div><h2>B</h2></div>"
$dom->parse('<div><h1>A</h1></div>')->at('h1')->replace('<h2>B</h2>');
"replace_content"
$dom = $dom->replace_content('test');
Replace element content.
# "<div><h1>B</h1></div>"
$dom->parse('<div><h1>A</h1></div>')->at('h1')->replace_content('B');
"root"
my $root = $dom->root;
Find root node.
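"parent" and "root" both move upwards from an element. The following sketch, with made-up markup, prints what each one returns by relying on the same stringification the SYNOPSIS uses for rendering.

```perl
use strict;
use warnings;
use Mojo::DOM;

my $dom = Mojo::DOM->new('<div><p id="a">A</p></div>');
my $p   = $dom->at('#a');

# One step up: the <div> that contains the <p>
my $parent = $p->parent;

# All the way up: the node representing the whole document
my $root = $p->root;

# Both objects render to markup when used as strings
print "parent: $parent\n";
print "root:   $root\n";
```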
"text"
my $trimmed = $dom->text;
my $untrimmed = $dom->text(0);
Extract text content from this element only (not including child elements).
Smart whitespace trimming is activated by default. Note that the trim
argument of this method is EXPERIMENTAL and might change without
warning!
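To illustrate how "text" differs from "all_text", the sketch below runs both on the same element. The markup is invented, and no exact output is shown because the whitespace in the result depends on the trimming behaviour described above.

```perl
use strict;
use warnings;
use Mojo::DOM;

my $dom = Mojo::DOM->new('<p>A<b>B</b>C</p>');
my $p   = $dom->at('p');

# "text" only looks at text directly inside <p>, so "B" is left out
my $own = $p->text;

# "all_text" also descends into <b>, so "B" is included
my $all = $p->all_text;

print "text:     $own\n";
print "all_text: $all\n";
```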
"to_xml"
my $xml = $dom->to_xml;
Render DOM to XML.
"tree"
my $tree = $dom->tree;
$dom = $dom->tree(['root', ['text', 'lalala']]);
Document Object Model, the parsed tree as a nested array reference.
"type"
my $type = $dom->type;
$dom = $dom->type('html');
Element type.
"xml"
my $xml = $dom->xml;
$dom = $dom->xml(1);
Disable HTML5 semantics in the parser and activate case sensitivity.
Defaults to auto detection based on processing instructions. Note that
this method is EXPERIMENTAL and might change without warning!
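As a minimal sketch of forcing XML mode by hand, the following uses markup that carries no processing instruction, so auto detection alone would fall back to HTML5 semantics. It assumes the flag set with "xml" is still in effect when "parse" runs afterwards. The markup and variable names are illustrative.

```perl
use strict;
use warnings;
use Mojo::DOM;

# No "<?xml ...?>" instruction here, so XML mode is switched on explicitly
# before parsing (assumption: "parse" honours the flag set with "xml")
my $dom = Mojo::DOM->new->xml(1)->parse('<P ID="greeting">Hi!</P>');

# Tag and attribute names keep their original case in XML mode
my $text = $dom->at('P')->text;
my $id   = $dom->at('P')->{ID};
print "$text\n$id\n";
```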
SEE ALSO
Mojolicious, Mojolicious::Guides, <http://mojolicio.us>.
perl v5.14.1 2011-09-03 Mojo::DOM(3)