NAME
    HTML::HTML5::Sanity - Perl extension to make HTML5 DOM trees less
    insane.

SYNOPSIS
      use HTML::HTML5::Parser;
      use HTML::HTML5::Sanity;

      my $parser    = HTML::HTML5::Parser->new;
      my $html5_dom = $parser->parse_file('http://example.com/');
      my $sane_dom  = fix_document($html5_dom);
      print document_to_clarkml($sane_dom);

DESCRIPTION
    The Document Object Model (DOM) generated by HTML::HTML5::Parser meets
    the requirements of the HTML5 spec, but will probably catch a lot of
    people by surprise.

    The main oddity is that elements and attributes which appear to be
    namespaced are not really. For example, the following element:

...

    This looks like it should be parsed so that it has an attribute "lang"
    in the XML namespace. Not so. It will actually be parsed as having the
    attribute "xml:lang" in the null namespace.

    "fix_document"
          $sane_dom = fix_document($html5_dom);

        Returns a modified copy of the DOM, leaving the original DOM
        unmodified.

    "document_to_clarkml", "element_to_clarkml", "attribute_to_clarkml"
          $string = document_to_clarkml($document);
          $string = element_to_clarkml($element);
          $string = attribute_to_clarkml($attribute);

        Each returns a Clark-Notation-like string useful for debugging.
        Only the first function, which takes an XML::LibXML::Document, is
        exported by default; an export list of ":all" or ":debug" exports
        the others too.

    "document_to_hashref", "element_to_hashref", "attribute_to_hashref"
          $data = document_to_hashref($document);
          $data = element_to_hashref($element);
          $data = attribute_to_hashref($attribute);

        Each returns a hashref useful for debugging. Only the first
        function, which takes an XML::LibXML::Document, is exported by
        default; an export list of ":all" or ":debug" exports the others
        too.

BUGS
    Please report any bugs to .

SEE ALSO
    HTML::HTML5::Parser, XML::LibXML.

AUTHOR
    Toby Inkster.

COPYRIGHT AND LICENSE
    Copyright (C) 2009 by Toby Inkster

    This library is free software; you can redistribute it and/or modify
    it under the same terms as Perl itself, either Perl version 5.8 or, at
    your option, any later version of Perl 5 you may have available.