Here's a small sample of some of the non-OO ways you can use this module:
use HTML::Stream qw(:funcs);
print html_tag('A', HREF=>$link);
print html_escape("<<Hello & welcome!>>");
And some of the OO ways as well:
use HTML::Stream;
$HTML = new HTML::Stream \*STDOUT;
# The vanilla interface...
$HTML->tag('A', HREF=>"$href");
$HTML->tag('IMG', SRC=>"logo.gif", ALT=>"LOGO");
$HTML->text($copyright);
$HTML->tag('_A');
# The chocolate interface...
$HTML -> A(HREF=>"$href");
$HTML -> IMG(SRC=>"logo.gif", ALT=>"LOGO");
$HTML -> t($caption);
$HTML -> _A;
# The chocolate interface, with whipped cream...
$HTML -> A(HREF=>"$href")
-> IMG(SRC=>"logo.gif", ALT=>"LOGO")
-> t($caption)
-> _A;
# The strawberry interface...
output $HTML [A, HREF=>"$href"],
[IMG, SRC=>"logo.gif", ALT=>"LOGO"],
$caption,
[_A];
There's even a small built-in subclass, HTML::Stream::Latin1, which can handle Latin-1 input right out of the box. But all in good time...
use HTML::Stream qw(:funcs); # imports functions from @EXPORT_OK
print html_tag(A, HREF=>$url);
print '&copy; 1996 by ', html_escape($myname), '!';
print html_tag('/A');
By the way: that last line could be rewritten as:
print html_tag(_A);
And if you need to get a parameter in your tag that doesn't have an associated value, supply the undefined value (not the empty string!):
print html_tag(TD, NOWRAP=>undef, ALIGN=>'LEFT');
<TD NOWRAP ALIGN=LEFT>
print html_tag(IMG, SRC=>'logo.gif', ALT=>'');
<IMG SRC="logo.gif" ALT="">
There are also some routines for reversing the process, like:
$text = "This <i>isn't</i> &quot;fun&quot;...";
print html_unmarkup($text);
This isn't &quot;fun&quot;...
print html_unescape($text);
This isn't "fun"...
Yeah, yeah, yeah, I hear you cry. We've seen this stuff before. But wait! There's more...
use HTML::Stream;
$HTML = new HTML::Stream \*STDOUT;
$HTML->tag(A, HREF=>$url);
$HTML->ent('copy');
$HTML->text(" 1996 by $myname!");
$HTML->tag(_A);
As you've probably guessed:
text()   Outputs some text, which will be HTML-escaped.
tag()    Outputs an ordinary tag, like <A>, possibly with parameters.
         The parameters will all be HTML-escaped automatically.
ent()    Outputs an HTML entity, like &copy; or &lt;.
         You mostly don't need to use it; you can often just put the
         Latin-1 representation of the character in the text().
You might prefer to use t() and e() instead of text()
and ent(): they're absolutely identical, and easier to type:
$HTML -> tag(A, HREF=>$url);
$HTML -> e('copy');
$HTML -> t(" 1996 by $myname!");
$HTML -> tag(_A);
Now, it wouldn't be nice to give you those text() and ent() shortcuts without giving you one for tag(), would it? Of course not...
$HTML -> A(HREF=>$url);
$HTML -> e('copy');
$HTML -> t(" 1996 by $myname!");
$HTML -> _A;
As you've probably guessed:
A(HREF=>$url) == tag(A, HREF=>$url) == <A HREF="/the/url">
_A == tag(_A) == </A>
All of the autoloaded ``tag-methods'' use the tagname in all-uppercase. A "_" prefix on any tag-method means that an end-tag is desired. The "_" was chosen for several reasons: (1) it's short and easy to type, (2) it doesn't produce much visual clutter to look at, and (3) _TAG looks a little like /TAG because of the straight line.
$HTML -> IMGG(SRC=>$src);
(You're not yet protected from illegal tag parameters, but it's a start, ain't it?)
If you need to make a tag known (sorry, but this is currently a global operation, and not stream-specific), do this:
accept_tag HTML::Stream 'MARQUEE'; # for you MSIE fans...
Note: there is no corresponding "reject_tag".
I thought and thought about it, and could not convince myself that such a
method would do anything more useful than cause other people's modules to
suddenly stop working because some bozo function decided to reject the FONT tag.
$HTML -> A(HREF=>$url)
-> e('copy') -> t(" 1996 by $myname!")
-> _A;
But wait! Neapolitan ice cream has one more flavor...
p(), a(), etc. (especially when markup-functions
like tr() conflict with existing Perl functions). So I came up
with this:
output $HTML [A, HREF=>$url], "Here's my $caption", [_A];
Conceptually, arrayrefs are sent to html_tag(), and strings to
html_escape().
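As a rough illustration of that dispatch, here is a minimal stand-alone sketch. The names sketch_escape, sketch_tag, and sketch_output are hypothetical stand-ins written for this example, not the module's own functions, and the real html_tag() and html_escape() may differ in detail (for instance in how 8-bit characters are handled).

```perl
use strict;
use warnings;

# Illustrative stand-in for html_escape(): escapes only < > " &.
sub sketch_escape {
    my $text = shift;
    my %named = ('<' => 'lt', '>' => 'gt', '"' => 'quot', '&' => 'amp');
    $text =~ s/([<>"&])/&$named{$1};/g;
    return $text;
}

# Illustrative stand-in for html_tag().
sub sketch_tag {
    my ($name, @params) = @_;
    $name =~ s/^_/\//;                     # _A is shorthand for /A
    my $out = "<$name";
    while (@params) {
        my ($key, $value) = splice(@params, 0, 2);
        $out .= defined $value
            ? ' ' . $key . '="' . sketch_escape($value) . '"'
            : " $key";                     # undef means a valueless parameter
    }
    return "$out>";
}

# Arrayrefs are treated as tags, plain strings as text to escape.
sub sketch_output {
    return join '', map { ref($_) eq 'ARRAY' ? sketch_tag(@$_) : sketch_escape($_) } @_;
}

print sketch_output(['A', HREF => '/the/url'], "Here's my <caption>", ['_A']), "\n";
```

The sketch returns a string rather than printing to a stream, which keeps it easy to test on its own.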
$HTML -> HTML
-> HEAD
-> TITLE -> t("Hello!") -> _TITLE
-> _HEAD
-> BODY(BGCOLOR=>'#808080');
Actually produces this:
<HTML>
<HEAD>
<TITLE>Hello!</TITLE>
</HEAD>
<BODY BGCOLOR="#808080">
To turn off autoformatting altogether on a given HTML::Stream object, use the auto_format() method:
$HTML->auto_format(0); # stop autoformatting!
To change whether a newline is automatically output before/after the begin/end form of a tag at a global level, use set_tag():
HTML::Stream->set_tag('B', Newlines=>15); # 15 means "\n<B>\n \n</B>\n"
HTML::Stream->set_tag('I', Newlines=>7); # 7 means "\n<I>\n \n</I> "
To change whether a newline is automatically output before/after the begin/end form of a tag for a single stream only, give the stream its own private ``tag info'' table, and then use set_tag():
$HTML->private_tags;
$HTML->set_tag('B', Newlines=>0); # won't affect anyone else!
To output newlines explicitly, just use the special nl method in the Chocolate Interface:
$HTML->nl; # one newline
$HTML->nl(6); # six newlines
I am sometimes asked, ``why don't you put more newlines in automatically?'' Well, mostly because...
PRE environment.
ent() (or e()) method to output an entity:
$HTML->t('Copyright ')->e('copy')->t(' 1996 by Me!');
But this can be a pain, particularly for generating output with non-ASCII characters:
$HTML -> t('Copyright ')
-> e('copy')
-> t(' 1996 by Fran') -> e('ccedil') -> t('ois, Inc.!');
Granted, Europeans can always type the 8-bit characters directly in their Perl code, and just have this:
$HTML -> t("Copyright \251 1996 by Fran\347ois, Inc.!");
But folks without 8-bit text editors can find this kind of output cumbersome to generate. Sooooooooo...
The default ``auto-escape'' behavior of an HTML stream can be a drag if you've got a lot of character entities that you want to output, or if you're using the Latin-1 character set, or some other input encoding. Fortunately, you can use the auto_escape() method to change the way a particular HTML::Stream works at any time.
First, here's a couple of special invocations:
$HTML->auto_escape('ALL'); # Default; escapes [<>"&] and 8-bit chars.
$HTML->auto_escape('LATIN_1'); # Like ALL, but uses Latin-1 entities
# instead of decimal equivalents.
$HTML->auto_escape('NON_ENT'); # Like ALL, but leaves "&" alone.
You can also install your own auto-escape function (note that you might very well want to install it for just a little bit only, and then de-install it):
sub my_auto_escape {
my $text = shift;
HTML::Entities::encode($text); # start with default
$text =~ s/\(c\)/&copy;/ig; # (C) becomes copyright
$text =~ s/\\,(c)/\&$1cedil;/ig; # \,c becomes a cedilla
$text;
}
# Start using my auto-escape:
my $old_esc = $HTML->auto_escape(\&my_auto_escape);
# Output some stuff:
$HTML-> IMG(SRC=>'logo.gif', ALT=>'Fran\,cois, Inc');
output $HTML 'Copyright (C) 1996 by Fran\,cois, Inc.!';
# Stop using my auto-escape:
$HTML->auto_escape($old_esc);
If you find yourself in a situation where you're doing this a lot, a better way is to create a subclass of HTML::Stream which installs your custom function when constructed. For an example, see the HTML::Stream::Latin1 subclass in this module.
new() with a filehandle: any object that responds to a print() method will do. Of course, this includes blessed FileHandles and IO::Handles.
If you supply a GLOB reference (like \*STDOUT) or a string (like
"Module::FH"), HTML::Stream will automatically create an invisible object for talking
to that filehandle (I don't dare bless it into a FileHandle, since the
underlying descriptor would get closed when the HTML::Stream is destroyed,
and you might not want that).
You say you want to print to a string? For kicks and giggles, try this:
package StringHandle;
sub new {
my $self = '';
bless \$self, shift;
}
sub print {
my $self = shift;
$$self .= join('', @_);
}
package main;
use HTML::Stream;
my $SH = new StringHandle;
my $HTML = new HTML::Stream $SH;
$HTML -> H1 -> t("<Hello & welcome!>") -> _H1;
print "PRINTED STRING: ", $$SH, "\n";
package MY::HTML;
@ISA = qw(HTML::Stream);
sub Aside {
$_[0] -> FONT(SIZE=>-1) -> I;
}
sub _Aside {
$_[0] -> _I -> _FONT;
}
Now, you can do this:
my $HTML = new MY::HTML \*STDOUT;
$HTML -> Aside
-> t("Don't drink the milk, it's spoiled... pass it on...")
-> _Aside;
If you're defining these markup-like, chocolate-interface-style functions, I recommend using mixed case with a leading capital. You probably shouldn't use all-uppercase, since that's what this module uses for real HTML tags.
< > = &
Note: provided for convenience and backwards-compatibility only. You may want to use the more-powerful HTML::Entities::encode function instead.
For convenience and readability, you can say _A instead of "/A"
for the first tag, if you're into barewords.
lt, gt, amp, quot, and #ddd) into ASCII characters.
Note: provided for convenience and backwards-compatibility only. You may want to use the more-powerful HTML::Entities::decode function instead: unlike this function, it can collapse entities like copy and ccedil into their Latin-1 byte values.
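The behavior described for this function (dropping angle-tag markup and converting the lt, gt, amp, quot, and #ddd escapes) can be approximated with a short stand-alone sketch. The name sketch_unescape is hypothetical, and the tag-stripping pattern is deliberately naive; this is not the module's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical approximation of the unescaping described above.
sub sketch_unescape {
    my $text = shift;
    $text =~ s/<[^>]*>//g;                 # drop angle-tag markup (naive)
    my %chars = (lt => '<', gt => '>', amp => '&', quot => '"');
    # One pass, so "&amp;lt;" becomes "&lt;" and is not decoded a second time.
    $text =~ s/&(?:#(\d+)|(lt|gt|amp|quot));/defined $1 ? chr($1) : $chars{$2}/ge;
    return $text;
}

print sketch_unescape('This <i>isn\'t</i> &quot;fun&quot;...'), "\n";
```

Named entities outside the four listed, such as copy or ccedil, pass through this sketch untouched, which matches the limitation mentioned in the note.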
The PRINTABLE may be a FileHandle, a glob reference, or any object that
responds to a print() message. If no PRINTABLE is given, does a select() and uses
that.
If the argument is a subroutine reference SUBREF, then that subroutine will be used. Declare such subroutines like this:
sub my_escape {
my $text = shift; # it's passed in the first argument
...
$text;
}
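For a concrete idea of what such a subroutine might contain, here is a minimal stand-alone sketch that escapes the markup-significant characters by name and any 8-bit character as a decimal entity. The name my_decimal_escape is an assumption made for this example; it only illustrates the calling convention (text in the first argument, escaped text returned) and is not the module's built-in escaper.

```perl
use strict;
use warnings;

# Hypothetical escape function following the convention shown above.
sub my_decimal_escape {
    my $text = shift;                       # passed in the first argument
    my %named = ('<' => 'lt', '>' => 'gt', '"' => 'quot', '&' => 'amp');
    $text =~ s/([<>"&])/&$named{$1};/g;     # markup-significant characters first
    $text =~ s/([\x80-\xFF])/'&#' . ord($1) . ';'/ge;   # then 8-bit characters
    return $text;
}

print my_decimal_escape("Fran\347ois <&> Co."), "\n";
```

The order of the two substitutions matters: escaping "&" first means the ampersands introduced by the decimal entities are not escaped again.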
If a textual NAME is given, then one of the appropriate built-in functions is used. Possible values are:
#123).
ccedil) instead of decimal entity codes to escape characters. This makes the HTML
more readable but it is currently not advised, as ``older'' browsers (like
Netscape 2.0) do not recognize many of the ISO-8859-1 entity names (like deg).
Warning: If you specify this option, you'll find that it attempts to ``require'' HTML::Entities at run time. That's because I didn't want to force you to have that module just to use the rest of HTML::Stream. To pick up problems at compile time, you are advised to say:
use HTML::Stream;
use HTML::Entities;
in your source code.
output $HTML "If A is an acute angle, then A < 90&deg;";
select(). No arguments just returns the currently-installed function.
Please use no other values; they are reserved for future use.
$html->ent('nbsp');
You may abbreviate this method name as e:
$html->e('nbsp');
Warning: this function assumes that the entity argument is legal.
print() message:
$HTML->io->print("This is not auto-escaped or nuthin!");
_A instead of "/A", if you're into barewords.
t:
$html->t('Hi there, ', $yournamehere, '!');
html_tag() and output the result. If an item is a
text string, escape the text and output the result. Like this:
output $HTML [A, HREF=>$url], "Here's my $caption!", [_A];
# Make sure methods MARQUEE and _MARQUEE are compiled on demand:
HTML::Stream->accept_tag('MARQUEE');
...gives the Chocolate Interface permission to create (via AUTOLOAD) definitions for the MARQUEE and _MARQUEE methods, so you can then say:
$HTML -> MARQUEE -> t("Hi!") -> _MARQUEE;
If you want to set the default attribute of the tag as well, you can do so
via the set_tag() method instead; it will effectively do an
accept_tag() as well.
# Make sure methods MARQUEE and _MARQUEE are compiled on demand,
# *and*, set the characteristics of that tag.
HTML::Stream->set_tag('MARQUEE', Newlines=>9);
set_tag will affect everyone.
However, if you want an HTML stream to have a private copy of that table to munge with, just send it this message after creating it. Like this:
my $HTML = new HTML::Stream \*STDOUT;
$HTML->private_tags;
Then, you can say stuff like:
$HTML->set_tag('PRE', Newlines=>0);
$HTML->set_tag('BLINK', Newlines=>9);
And it won't affect anyone else's auto-formatting
(although they will possibly be able to use the BLINK tag method without a
fatal exception :-( ).
Returns the self object.
HTML::Stream->set_tag('MARQUEE', Newlines=>9);
Once you do this, all HTML streams you open from then on will allow that tag to be output in the chocolate interface.
Warning: by default, an HTML stream just references the ``master tag table'' (this makes new() more efficient), so by default, the instance method will behave exactly like the class method.
my $HTML = new HTML::Stream \*STDOUT;
$HTML->set_tag('BLINK', Newlines=>0); # changes it for others!
If you want to diddle with one stream's auto-formatting only, you'll need to give that stream its own private tag table. Like this:
my $HTML = new HTML::Stream \*STDOUT;
$HTML->private_tags;
$HTML->set_tag('BLINK', Newlines=>0); # doesn't affect other streams
Note: this will still force a default entry for BLINK in the master tag table: otherwise, we'd never know that it was legal to AUTOLOAD a BLINK method. However, it will only alter the characteristics of the BLINK tag (like auto-formatting) in the object's tag table.
0x01  newline before <TAG>      .<TAG>.     .</TAG>.
0x02  newline after <TAG>       |     |     |      |
0x04  newline before </TAG>     1     2     4      8
0x08  newline after </TAG>
Hence, to output BLINK environments which are preceded/followed by newlines:
set_tag HTML::Stream 'BLINK', Newlines=>9;
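To see how the Newlines bits combine, consider this small stand-alone sketch. The helper wrap_with_newlines is hypothetical and written only for this illustration; it mimics the bitmask table above rather than reproducing the module's internals.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of HTML::Stream: wraps $content in
# <TAG>...</TAG>, adding a newline wherever the matching bit of $mask is set.
sub wrap_with_newlines {
    my ($tag, $mask, $content) = @_;
    my $out = '';
    $out .= "\n" if $mask & 0x01;    # newline before <TAG>
    $out .= "<$tag>";
    $out .= "\n" if $mask & 0x02;    # newline after <TAG>
    $out .= $content;
    $out .= "\n" if $mask & 0x04;    # newline before </TAG>
    $out .= "</$tag>";
    $out .= "\n" if $mask & 0x08;    # newline after </TAG>
    return $out;
}

# 9 = 0x01 + 0x08: newlines only around the whole environment.
print wrap_with_newlines('BLINK', 9, 'Hi!');
```

With a mask of 15 every position gets a newline, and with 0 none does.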
set_tag for class/instance method differences).
&ccedil;) for ISO-8859-1 characters.
So using HTML::Stream::Latin1 like this:
use HTML::Stream;
$HTML = new HTML::Stream::Latin1 \*STDOUT;
output $HTML "\253A right angle is 90\260, \277No?\273\n";
Prints this:
&laquo;A right angle is 90&deg;, &iquest;No?&raquo;
Instead of what HTML::Stream would print, which is this:
&#171;A right angle is 90&#176;, &#191;No?&#187;
Warning: a lot of Latin-1 HTML markup is not recognized by older browsers (e.g., Netscape 2.0). Consider using HTML::Stream; it will output the decimal entities which currently seem to be more ``portable''.
Note: using this class ``requires'' that you have HTML::Entities.
output() method and the various
``tag'' methods seem to run about 5 times slower than the old
just-hardcode-the-darn stuff approach. That is, in general, this:
### Approach #1...
tag $HTML 'A', HREF=>"$href";
tag $HTML 'IMG', SRC=>"logo.gif", ALT=>"LOGO";
text $HTML $caption;
tag $HTML '_A';
text $HTML $a_lot_of_text;
And this:
### Approach #2...
output $HTML [A, HREF=>"$href"],
[IMG, SRC=>"logo.gif", ALT=>"LOGO"],
$caption,
[_A];
output $HTML $a_lot_of_text;
And this:
### Approach #3...
$HTML -> A(HREF=>"$href")
-> IMG(SRC=>"logo.gif", ALT=>"LOGO")
-> t($caption)
-> _A
-> t($a_lot_of_text);
Each runs about 5x slower than this:
### Approach #4...
print '<A HREF="', html_escape($href), '>',
'<IMG SRC="logo.gif" ALT="LOGO">',
html_escape($caption),
'</A>';
print html_escape($a_lot_of_text);
Of course, I'd much rather use any of the first three (especially #3) if I had to get something done right in a hurry. Or did you not notice the typo in approach #4? ;-)
(BTW, thanks to Benchmark:: for allowing me to... er... benchmark stuff.)
Added built-in support for escaping 8-bit characters.
Added LATIN_1 auto-escape, which uses HTML::Entities to generate mnemonic entities. This is now the default method for HTML::Stream::Latin1.
Added auto_format(), so you can now turn auto-formatting off/on.
Added private_tags, so it is now possible for HTML streams to each have their own ``private''
copy of the %Tags table, for use by set_tag().
Added set_tag(). The tags tables may now be modified dynamically so as to change how
formatting is done on-the-fly. This will hopefully not compromise the
efficiency of the chocolate interface (until now, the formatting was
compiled into the method itself), and will
add greater flexibility for more-complex programs.
Added POD documentation for all subroutines in the public interface.
comment().
Thanks to John D Groenveld for the suggestion and the patch.
Fixed bug in accept_tag(), where 'my' variable was shadowing argument.
Thanks to John D Groenveld for the bug report and the patch.
John Buckman            For suggesting that I write an "html2perlstream",
                        and inspiring me to look at supporting Latin-1.
Tony Cebzanov           For suggesting that I write an "html2perlstream"
John D Groenveld        Bug reports, patches, and suggestions
B. K. Oxley (binkley)   For suggesting the support of "writing to strings"
                        which became the "printable" interface.
Enjoy.