NAME

HTML::EP - a system for embedding Perl into HTML

SYNOPSIS

CGI-Env

This is an HTML document. You see. Perhaps you wonder about the unknown HTML tags like ep-comment above? They are part of the EP system. For example, this comment section will be removed and you won't see it in your browser.

    # This is an example of embedding Perl into the page.
    # We create a variable called time, containing the current
    # time. This variable will be used below.
    my $self = $_;
    $self->{'time'} = localtime(time());
    ''; # Return an empty string; result becomes embedded into the
        # HTML page

The current time

Your HTML::EP system is up and running: The current time is $time$.

WARNING

THIS IS ALPHA SOFTWARE. It is *only* 'Alpha' because the interface (API) is not finalised. The Alpha status does not reflect code quality or stability. In particular the following things might change without further notice:

The C interface for introducing your own methods

This depends on whether I need changes for inserting the module into mod_perl. See the section on "TODO" below.

DESCRIPTION

Have you ever written a CGI binary? Easy thing, isn't it? Was just fun! Have you written two CGI binaries? Even easier, but not so much fun. How about the third, fourth or fifth tool? Sometimes you notice that you are always doing the same things:

- Reading and parsing variables
- Formatting output, in particular building tables
- Sending mail out from the page
- Building a database connection, passing CGI input to the database and vice versa
- Talking to HTML designers about realizing their wishes

You see, it soon becomes a pain. Of course there are little helpers around, for example the CGI module, the mod_perl suite and many more. Using them makes life a lot easier, but not as much as you would like. See the CGI(3) manpage and the mod_perl(3) manpage.

On the other hand, there are tools like PHP/FI or WebHTML. Incredibly easy to use, but not as powerful as Perl.

Why not get the best of both worlds? This is what EP wants to give you, similar to ePerl or HTML::EmbPerl. I personally believe that EP is simpler and more extensible than the latter two. See the ePerl(1) manpage and the HTML::EmbPerl(3) manpage.

In short, it's a single but extensible program that scans an HTML document for certain special HTML tags. These tags are replaced by appropriate output generated by EP. What remains is passed to the browser. It's just like writing HTML for an enhanced browser!

Prerequisites

As far as I know, EP depends on no system-dependent features.
However, it relies on some other Perl modules:

CGI

The CGI module should be part of your core Perl installation. If not, you should definitely upgrade to Perl 5.004. :-) My thanks to Lincoln D. Stein.

HTML::Parser

This module is used for parsing the HTML templates. My thanks to Gisle Aas.

libwww

The LWP library contains a lot of utility functions, for example HTML and URL encoding and decoding. Again, my thanks to Gisle Aas. :-)

Perl itself and the above modules are available from any CPAN mirror, for example

    ftp://ftp.funet.fi/pub/languages/perl/CPAN/modules/by-module

Installation

Installing this module (and the prerequisites from above) is quite simple. You just fetch the archive, extract it with

    gzip -cd HTML-EP-0.1000.tar.gz | tar xf -

(this is for Unix users; Windows users would prefer WinZip or something similar) and then enter the following:

    cd HTML-EP-0.1000
    perl Makefile.PL
    make
    make test

If any tests fail, let me know. Otherwise go on with

    make install

This will put the required Perl modules into a location where Perl finds them by default. Additionally it will install a single CGI binary, called `ep.cgi'.

The docs are available online with

    perldoc HTML::EP

If you prefer an HTML version of the docs, try

    pod2html lib/HTML/EP.pm

in the source directory.

Using the CGI binary

You have different options for integrating EP into your WWW server, depending on which server you are using and the permissions you have. The simplest possibility is running an external CGI binary. Another option is to use mod_perl with Apache, see the section on "Using mod_perl" below.

I suggest that you choose an extension and configure your WWW server to feed files with this extension into `ep.cgi'.
For example, with Apache, you can add the following lines to your `srm.conf':

    ScriptAlias /cgi-bin/ep.cgi /usr/bin/ep.cgi
    AddHandler x-ep-script .ep
    Action x-ep-script /cgi-bin/ep.cgi

This tells Apache that files with extension .ep are handled by the CGI binary `/usr/bin/ep.cgi'. Make sure that the ScriptAlias line is entered *before* any other ScriptAlias instruction! From now on your server will never return files with extension .ep directly!

Verify your installation by creating the following file:

    <html><head><title>Local time</title></head>
    <body>
    The current time is: <ep-perl>scalar(localtime(time))</ep-perl>
    </body></html>

(Note that this is a much shorter version of the example in the synopsis.) Store it as `/test.ep' on your web server and retrieve the file via your Web server. If you see the time displayed, you are up and running.

Using mod_perl

The EP package can be integrated into mod_perl, for example by using the following commands in `srm.conf':

    <Files *.ep>
      SetHandler perl-script
      PerlHandler Apache::EP->handler
      Options ExecCGI
    </Files>

Keep in mind that mod_perl differs in many details from programming CGI binaries. In particular you might need to restart Apache for changes in modules to be loaded.

Available methods

All EP tags start with the prefix *ep-*. Some available tags are:

ep-comment

This is a multi-line tag for embedding comments into your HTML page. But why use this tag instead of the usual HTML comment, `<!--'? The difference is that the user will never see the former. Example:

    <html>
    <!-- This is a comment. I like comments. -->
    <ep-comment>
    This is another comment, but you won't see it in your browser.
    The HTML editor will show it to you, however!
    </ep-comment>
    </html>

Do not try to embed EP instructions into the comment section! They won't produce output, but they might be executed anyway.

ep-perl

This is for embedding Perl into your script. There are two versions of it: A multiline version is for embedding the Perl code immediately into your script.
Example: <html> <head><title>The Date

The Date

Hello, today it's the

    # This little piece of Perl code will be executed
    # while scanning the page.
    #
    # Let's calculate the date!
    #
    my($sec,$min,$hour,$mday,$mon,$year) = localtime(time);

    # Leave a string with the date as result. Will be
    # inserted into the HTML stream:
    sprintf("%02d.%02d.%04d", $mday, $mon+1, $year+1900);

If you don't like to embed Perl code, you may store it into a different file. That's what the single-line version of ep-perl is for: The Date

The Date

Hello, today it's the

You have noticed that the little script's result was inserted into the HTML page, haven't you? It returned a date, in other words a string consisting of letters, digits and dots. There's no problem with inserting such a string into an HTML stream. But that's not always the case! Say you have a string like

    Use </html> for terminating the HTML page.

This cannot be inserted as a raw string, for obvious reasons. Thus the ep-perl command has an attribute *output*. Use it like this:

    <ep-perl output=html>
    'Use </html> for terminating the HTML page.';
    </ep-perl>

Possible values of the *output* attribute are `raw' (default), `html' (HTML encoded) and `url' (URL encoded).

It's a common mistake to use the Perl command `return' in embedded Perl. Never do that! If you need return (there are of course situations where returning can help), wrap the code in a subroutine and call it, like this:

    sub eval_me {
        if ($this) {
            return 'foo';
        } elsif ($that) {
            return 'bar';
        }
        '';
    }
    eval_me();

See the section on "Variables" below for interactions between Perl variables and EP variables.

For security reasons, you might set an attribute *safe*, as in ... This will create a Safe compartment for you and run the embedded script in the compartment. Using this attribute is highly recommended!

ep-mail

This command will send an e-mail. The attributes will be used for creating the e-mail header; in particular the `subject', `from' and `to' attributes should be used. Example:

    Hello,

    Bill, old chap. How are you?

    Yours sincerely,

    Jochen

You can still use EP variables in the e-mail body, for example the following works:

    Hello, Joe,

    this e-mail was sent to you by $@cgi->name$.

But note that we suppress conversion into HTML format in the mail body! See the section on "Variables" below for details.

ep-errhandler

This command advises EP what to do in case of errors. See the section on "Error handling" below. Example: Set the template being used for system errors. Likewise, set the template for user errors.
If an error occurs, the given scripts are loaded and used as templates instead of the current one. You don't need external files! Instead you can use User error

User error

Replace user and continue. :-)

To be serious, the following problem happened:

$errmsg$

Please return to the calling page, fix the problem and retry.

However, you might prefer to use a single error template, and of course it's faster to use external error templates than to parse builtin templates. (At least, if no error occurs. :-)

ep-error

This command forces an error message. See the section on "Error handling" below. You can trigger user or system errors by setting the *type* attribute to the values `system' (default) or `user'. The *msg* attribute is for setting the error message. Example: If no email address was entered, force a user error.

ep-database

This command connects to a database. Its attributes are `dsn', `user' and `password', corresponding to the same arguments of the DBI connect method. See the DBI(3) manpage for details on DBI. Example: You can use different database connections by using the *dbh* attribute: The *dbh* attribute advises EP to store the DBI handle in the given variable. (Default: `dbh') See the section on "Variables" below.

ep-query

This command executes an SQL statement. The `query' attribute will be used for passing the SQL statement. Of course a multiline version is available, thus is the same as

    INSERT INTO foo VALUES (1, 'bar')

If your query retrieves a result, use the `result' attribute to store it in a variable, for example like this: This will create a variable `employees', an array ref of hash refs. You can use the ep-list command for displaying the output. See the section on "Variables" below.

When using multiple database connections, use the *dbh* attribute for choosing the connection. (See the *ep-database* method above.)

If you have big result tables, you might prefer DBI's *fetchrow_arrayref* method over creating hash refs, because arrays are created faster than hashes. This is achieved by setting the attribute *resulttype* to `array'. The default is `hash'.

Sometimes you don't want to retrieve the complete result table. In that case you can use the attributes *startat* and *limit*. For example, to retrieve rows 0-19, use startat=0 and limit=20. Likewise you would use startat=20 and limit=20 for rows 20-39. When using the *MySQL* engine, the *startat* and *limit* attributes are directly mapped to MySQL's *LIMIT* clause.

ep-list

This command is used to display an array of refs. Let's assume that the variable `employees' contains an array ref of refs with the attributes *name* and *department*. Then you could create a table of employees as follows:
Nr.    Name         Department
$i$    $e->name$    $e->department$
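In plain Perl terms, the expansion that ep-list performs on such a row template can be pictured roughly as the loop below. This is only an illustrative sketch and not EP's actual implementation: the helper `expand_rows`, the sample data, the table markup in the template and the simple regex substitution are all assumptions made for this example, and no HTML encoding is applied.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::EP): expand a row template once
# per element of $items. The current element is addressed under the name
# $item, and a counter is available as $i$, starting at 0.
sub expand_rows {
    my ($items, $item, $template) = @_;
    # Pattern for "$<item>->attr$", built by concatenation to keep it readable
    my $re  = '\$' . quotemeta($item) . '->(\w+)\$';
    my $out = '';
    my $i   = 0;
    for my $e (@$items) {
        my $row = $template;
        $row =~ s/\$i\$/$i/g;                                 # the counter
        $row =~ s/$re/defined $e->{$1} ? $e->{$1} : ''/ge;    # the attributes
        $out .= $row;
        $i++;
    }
    return $out;
}

# Sample data standing in for the "employees" variable
my $employees = [
    { name => 'Joe',  department => 'Sales' },
    { name => 'Anna', department => 'Support' },
];

my $table = expand_rows(
    $employees, 'e',
    '<tr><td>$i$</td><td>$e->name$</td><td>$e->department$</td></tr>' . "\n"
);
print $table;
```

Each element yields one copy of the template, which is why the header row belongs outside the repeated part.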
This will be processed as follows: For each item in the array retrieved from the variable `employees', create a variable `e' and display the text between ep-list and /ep-list for it, replacing the patterns $e->name$ and $e->department$ with the corresponding values. The variable *i* is initially set to 0 and incremented by one with each element.

You need not create a variable for the *items* attribute: When using a *range* attribute instead of *items*, a list will be created for you. The *range* attribute can be of the form `start..stop', in which case a list of the numbers start..stop will be created. Otherwise the attribute must be a comma separated list of values. See the *ep-select* command below for an example of the *range* attribute.

ep-select

This is similar to ep-list, but it is specifically designed for creating SELECT boxes and similar things. We explain it by example: If you supply a *selected* attribute, then a variable *selected* will be created for each item. The value will be either the word `SELECTED' (configurable via the attribute *selected-text*) or an empty string, depending on whether the item matches the *selected* value or not.

ep-input

This is useful for reading an object's data out of CGI variables. Say you have a form with input fields describing an address, the field names being address_t_name, address_t_street, address_n_zip and address_t_city.
By using the command the EP program will create a variable "address" for you, which is a hash ref as follows:

    $cgi = $_->{cgi};
    $_->{address} = {
        name   => { col => 'name',   val => $cgi->param("address_t_name"),   type => 't' },
        street => { col => 'street', val => $cgi->param("address_t_street"), type => 't' },
        zip    => { col => 'zip',    val => $cgi->param("address_n_zip"),    type => 'n' },
        city   => { col => 'city',   val => $cgi->param("address_t_city"),   type => 't' }
    };

In general, field names beginning with *address* will be split into `prefix_type_suffix', the type being either 't' for text or 'n' for number.

The idea is to generate SQL queries automatically out of the `address' variable. This task is supported by the *sqlquery* attribute: Create a new record, if no ID is given The *sqlquery* attribute creates attributes *names*, *values* and *update* for you, which may be used in INSERT or UPDATE queries. Note that the *ep-input* must be preceded by an *ep-database* call, because it uses DBI's *quote* method. See the DBI(3) manpage.

There are situations where you want to fetch not only a single object, but a list of objects. Suppose you have an order form for articles. Then you might have input fields *art_0_t_name*, *art_0_n_count*, *art_0_n_price*, *art_1_t_name*, ... In that case you can give the *ep-input* command an attribute list, like this: The module will read an array ref of objects into the variable `dest'. Each object will have an additional scalar variable `i' referring to the item's number, beginning with 0. In other words, you can process the order form as follows:

    my $self = $_;
    my $sum = 0.0;
    # Note "${i}": a plain "$i_n_count" would be read as one variable name.
    for (my $i = 0; defined($self->{cgi}->param("art_${i}_n_count")); $i++) {
        $sum += $self->{cgi}->param("art_${i}_n_count")
              * $self->{cgi}->param("art_${i}_n_price");
    }
    $self->{sum} = $sum;   # make the total available as $sum$
    '';

The following items have been ordered:

    Nr.                  Price                Article
    $art->count->val$    $art->price->val$    $art->name->val$

    Total sum: $sum$

ep-include

Sometimes you want to source external files.
This can be done by using
If a file with the given name doesn't exist, the file name is treated as being relative to your WWW server's *DOCUMENT_ROOT* directory.

ep-exit

This directive terminates processing of the current HTML page.

Conditional HTML

It is possible to blank out parts of the HTML document. See the following example:

Conditional HTML

Conditional HTML

You have entered a negative number for i!
You have entered zero for i!
You have entered a negative number for j!
You have entered zero for j!
Ok, both numbers are positive.

The example is of course somewhat unnatural, because you'd better use a single ep-perl in that case, but it shows that we can use arbitrarily complex structures of conditional HTML.

Localization

Localization is available via the HTML::EP::Locale module. Currently it only offers methods for localizing strings. To access the module, start your HTML page with

When the package is loaded, it tries to guess your document's language. The default language is *de* (German; I ought to make this configurable). You can specify another language (for example *en* for English or *fr* for French) by either supplying a CGI variable `language=xy' or giving your document a secondary extension like `page.xy.ep'. That is, if your document is called mypage.de.ep then the language *de* will be chosen, but *en* will be chosen for mypage.en.ep

Two methods are available for localizing strings. For short strings like titles, headers or link refs you might prefer this version:

Obviously this is not appropriate for longer strings and it must not contain HTML patterns. Thus another version is available:

Dies ist ein Absatz.

Dies ist der zweite Absatz.

This is one paragraph.

This is another paragraph.
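The language selection described in this section (a `language' CGI variable, or a secondary file extension such as `page.xy.ep') can be pictured with a small standalone sketch. The function name `guess_language`, the precedence of the CGI variable over the extension and the two-letter pattern are assumptions made for illustration; HTML::EP::Locale may resolve this differently.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of HTML::EP::Locale.
#   $cgi_language: value of the CGI variable "language" (may be undef)
#   $path:         the document's file name, e.g. "mypage.en.ep"
#   $default:      fallback language ("de" according to the text)
sub guess_language {
    my ($cgi_language, $path, $default) = @_;
    # Assumption: an explicit CGI variable wins over the file name
    return $cgi_language if defined $cgi_language && length $cgi_language;
    # A secondary two-letter extension directly before ".ep"
    return $1 if $path =~ /\.([a-z]{2})\.ep$/;
    return $default;
}

print guess_language(undef, 'mypage.en.ep', 'de'), "\n";
print guess_language('fr',  'mypage.en.ep', 'de'), "\n";
print guess_language(undef, 'mypage.ep',    'de'), "\n";
```

Keeping the decision in one function makes it easy to change the default language, which the text notes is not yet configurable.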

Error handling

Error handling with EP is quite simple: All you do in case of errors is throw a Perl exception. For example, DBI handles are created with the RaiseError attribute set to 1, so that SQL errors trigger a Perl exception. You never have to check for errors yourself!

However, what happens in case of errors? In that case, EP will use the template that you have set with ep-errhandler and treat it like an ordinary EP document, after setting the variables `errmsg' and `admin'. If you don't set an error handler, the following template will be used, which is well suited as a starting point for your own error template:

Internal error

Internal error

An internal error occurred. The server has not been able to fulfill your request. The error message is:

            $errmsg$
        

Please contact the Webmaster, $admin$, tell him the URL, the time and error message.

We apologize for any inconvenience, please try again later!




Yours sincerely,

The Webmaster
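The mechanism described in this section, throwing a Perl exception and rendering an error template with `errmsg' and `admin' filled in, can be sketched in plain Perl. The names `render` and `run_page`, the one-line template and the admin address are hypothetical; the sketch does not use HTML::EP itself and skips the HTML encoding that EP applies to variables.

```perl
use strict;
use warnings;

# Minimal stand-in for EP's variable substitution: replace $name$
# by the value stored under that name (no HTML encoding here).
sub render {
    my ($template, $vars) = @_;
    $template =~ s/\$(\w+)\$/defined $vars->{$1} ? $vars->{$1} : ''/ge;
    return $template;
}

# Run the page code; if it dies, render the error template instead.
sub run_page {
    my ($code, $error_template, $admin) = @_;
    my $html;
    my $ok = eval { $html = $code->(); 1 };
    if (!$ok) {
        my $errmsg = $@;
        chomp $errmsg;
        return render($error_template,
                      { errmsg => $errmsg, admin => $admin });
    }
    return $html;
}

my $template = 'Error: $errmsg$ (contact $admin$)';
my $admin    = 'webmaster@example.com';

my $good = run_page(sub { 'fine' }, $template, $admin);
my $bad  = run_page(sub { die "No email address given\n" }, $template, $admin);
print "$good\n$bad\n";
```

The page code never inspects return values; it simply dies, and the caller decides how the message is presented.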

Variables

It is important to understand how EP variables work, in particular when working with ep-perl. You always have an object $_, which is an instance of the HTML::EP class (a subclass of HTML::Parser). This object has certain attributes, in particular `$_->{cgi}', a CGI object, and `$_->{dbh}', the DBI handle. (The latter is of course valid only after `ep-database'.) If you want to set or modify a variable, you have to set `$_->{varname}'. If you want to retrieve the value, use the same. Note that you cannot rely on `$_' for long, as it will be changed by Perl loops and the like; thus your Perl code typically starts with

    my $self = $_;

But how do you access the variable from within EP documents? You just write

    $varname$

This will be replaced automatically by the parser with the value of `$_->{varname}'. Even more, the value will be converted into HTML source!

If `varname' is a structured variable, for example a hash or array ref, you may as well use

    $varname->attrname$

or

    $varname->0$

to access `$_->{varname}->{attrname}' or `$_->{varname}->[0]', respectively. A special value of *varname* is `cgi': This will access the CGI variable of the same name, thus the following are equivalent:

    $cgi->email$

and

    $_->{cgi}->param('email');

But what if you don't want your variable to be HTML encoded? You may as well use

    $@varname$    (Raw)
    $#varname$    (URL encoded)
    $~varname$    (SQL encoded)

The latter uses the $_->{dbh}->quote() method. In particular this implies that you have to be connected to a database before using this tag!

You can even use these symbols in attributes of EP commands. For example, the following will be useful when sending a mail:

Attributes may include EP variables, just like ordinary HTML code. Even more, they may contain Perl code, which is evaluated just like code between `<ep-perl>' and `</ep-perl>'. However, you need to use the variable `$_' in the code, because the package otherwise doesn't detect what you want it to do.
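The substitution rules of this section can be summarised in a small standalone sketch: `$name$` inserts an HTML-encoded value, `$@name$` the raw value and `$#name$` a URL-encoded one, with `->` stepping into hash and array refs. The function `expand` and its helpers are assumptions written for illustration; they are not HTML::EP's code. The SQL-quoting form `$~name$` and the special `cgi' variable are left out, because they need a database handle and a CGI object.

```perl
use strict;
use warnings;

# Encode the characters that are special in HTML
sub html_encode {
    my ($s) = @_;
    my %map = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $s =~ s/([&<>"])/$map{$1}/g;
    return $s;
}

# Percent-encode everything except unreserved URL characters
sub url_encode {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord($1))/ge;
    return $s;
}

# Follow a path like "e->name" or "list->0" through hash and array refs
sub lookup {
    my ($vars, $path) = @_;
    my $val = $vars;
    for my $key (split /->/, $path) {
        return '' unless ref $val;
        $val = ref $val eq 'ARRAY' ? $val->[$key] : $val->{$key};
    }
    return defined $val ? $val : '';
}

# Replace $name$, $@name$ and $#name$ in $text
sub expand {
    my ($text, $vars) = @_;
    $text =~ s{\$([\@#]?)(\w+(?:->\w+)*)\$}{
        my ($mode, $path) = ($1, $2);
        my $val = lookup($vars, $path);
        $mode eq '@' ? $val
      : $mode eq '#' ? url_encode($val)
      :                html_encode($val);
    }ge;
    return $text;
}

my $vars = { name => 'A & B', e => { name => 'Joe' }, list => ['x y'] };
print expand('$name$ | $@name$ | $#name$ | $e->name$ | $list->0$', $vars), "\n";
```

HTML encoding is the default because that is the safe choice for text that ends up in a page; the raw form has to be requested explicitly.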
See the section on "Custom variable formatting" for setting up your own formats.

Custom variable formatting

Sometimes the builtin formatting methods of HTML::EP are not sufficient. A good example is currencies. Such format methods are stored in the variable *_ep_custom_formats*. As an example we create a method for formatting German currency values:

    my $self = $_;
    my $formats = ($self->{'_ep_custom_formats'} ||= {});
    $formats->{'DM'} = sub {
        my($self, $val) = @_;
        $val = sprintf("%.2f DM", $val);   # keep the formatted string
        $val =~ s/\./,/;                   # decimal comma instead of point
        $val;
    };
    ''

This can be used as follows: Suppose we have the following variables:

    a = 1
    b = 2.4
    c = 34.47

then we can use

    a = $&DM->a$    => 1,00 DM
    b = $&DM->b$    => 2,40 DM
    c = $&DM->c$    => 34,47 DM

In other words: Use the special marker &, followed by the custom format method's name, the dereferencing operator and finally the variable name.

The above method is already predefined by the HTML::EP::Locale module.

Producing Non-HTML

Say you want a CGI binary that creates a GIF and not an HTML document. (See the `ifgif.ep' file from the SNMP::Monitor distribution for an example.) This can be done in the following way:

1.) Create your own MIME header, for example like this:

    my $self = $_;
    $self->print("HTTP/1.0 200 OK\n",
                 "content-type: image/gif\n",
                 "\n");

2.) Do whatever you want to create your image, for example restore it from a database:

    my $self = $_;
    $self->print($self->{im}->{image});

3.) Finally tell the EP module that it must not produce any further output by doing a

    $_->Stop()

DEBUGGING

Debugging CGI applications is always a problem. The EP module does its best to support you.

Whenever you supply a CGI variable *debug*, the module will enter debugging mode. For example, if your document is `/mypage.ep', then tell your browser to fetch `/mypage.ep?debug=1'. You won't see the usual HTML page, but a plain text page with lots of debugging messages and the created HTML source.
You may extend the debugging code with sequences like

    my $self = $_;
    if ($self->{debug}) {
        $self->print("I'm here!\n");
    }

Note that you should not call the *print* function directly, but the *print* method! The former works well in CGI environments, but EP should work in other environments as well.

But sometimes this is not sufficient: What's inserting debugging messages compared to using the Perl debugger? In that case you can emulate a CGI environment as follows:

    export DOCUMENT_ROOT=/usr/local/www/htdocs
    export PATH_TRANSLATED=$DOCUMENT_ROOT/mypage.ep
    export REQUEST_METHOD=GET
    export QUERY_STRING="var1=val1&var2=val2"
    perl -d /usr/bin/ep.cgi

This allows you to single-step through your program, display variable values and the like.

EXTENSIONS

It is quite easy to write your own methods.

Single-line extensions

For example, suppose you want a method for accessing environment variables: The idea is to create a variable `e', which is a hash ref of the current environment variables, so that you can use

    $e->REMOTE_AGENT$

for accessing the name of the user's browser. This can be done like this:

    my $self = $_;

    # Write a handler for ep-env:
    sub env ($$) {
        my($self, $attr) = @_;
        my $var = $attr->{var};
        $self->{$var} = {%ENV};
        '';
    }

    # Register the handler in the list of handlers:
    $self->{_ep_funcs}->{'ep-env'} = { method => 'env' };

    # Return an empty string:
    '';

The *method* attribute of the handler tells the EP module to call

    $_->env($attr);

if the `ep-env' tag is used. The argument `$attr' is a hash ref of the tag's attributes.

Note the use of the *package* attribute: By default the `ep-perl' code is executed in a *Safe* compartment. See the Safe(3) manpage.

Multi-line extensions

But how do you write methods that use a `<ep-...>' .. `</ep-...>' syntax? As an example, we write a method for creating external files. The method receives two attributes, a *file* attribute for the file's name and a *contents* attribute for the file's contents.
The method can be used in two ways: or like this, in multiline mode:

    Hi!

Here it is:

    # Write a handler for ep-file:
    sub file ($$) {
        my($self, $attr) = @_;
        my $contents = $attr->{contents};
        if (!defined($contents)) {   # Multiline method, no "contents"
            return undef;            # attribute given; return undef
        }                            # until we are called again.
        my $file = $attr->{file};
        require Symbol;
        my $fh = Symbol::gensym();
        if (!open($fh, ">$file") || !(print $fh ($contents)) ||
            !close($fh)) {
            die "Error while creating $file: $!";
        }
        '';
    }

    # Register the handler in the list of handlers
    # Note the use of the "default" attribute:
    $self->{_ep_funcs}->{'ep-file'} = { method => 'file',
                                        default => 'contents' };

    # Return an empty string:
    '';

In other words: The method gets called twice, once for `<ep-file>' and once for `</ep-file>'. If it thinks it should enter multi-line mode (that is, if the *contents* attribute is not set), it returns `undef'. In that case EP looks at the *default* attribute of the handler, which tells it that the lines between `<ep-file>' and `</ep-file>' ought to be written into the *contents* attribute. Thus this attribute exists when the method is called a second time.

Note the use of the *Symbol* package when accessing files: *Never* use global handles like

    open(FILE, ...)

as this might break future multithreading code!

Selfloaded methods

In the above examples the extension methods have been compiled immediately. This is not always a good idea: For example the `ep-mail' method loads big external packages like *Mail::Internet* for sending the mail. In such cases you might wish to use HTML::EP's builtin self loader, which is quite similar to that of CGI. We choose `ep-mail' as an example:

    my $self = $_;

    # Create a string that can be compiled for loading the method:
    $AUTOLOADED_SUBS{_ep_mail} = <<'end_of__ep_mail';
    require Mail::Internet;
    sub _ep_mail ($$) {
        my($self, $attr) = @_;
        ...
    }
    end_of__ep_mail
    $self->{_ep_funcs}->{'ep-mail'} = { method => '_ep_mail',
                                        default => 'body' };

The advantage is that you have the method available, but the performance penalty of loading it is almost entirely avoided if the method is not used.

Namespace pollution

So far extensions are inserted at run time only, usually by loading them from external files. For example you might create extensions for building a WWW shop and put them in, say, `shop.lib'.

As soon as it is possible to make extensions permanent (with mod_perl or an HTML::EP server) and extensions are loaded at startup, there will be more and more of such extension files. Experience shows that namespace pollution will finally become a problem. For example, each virtual Web server might have a completely different method `ep-buttons' for inserting an automatically generated button frame.

Thus I propose to use a package model inherited from Perl's package model, and to start using it now: If you create a shop extension, put it into a separate package, `HTML::EP::Shop', say. When working with the extension, use as the very first line, so that your Parser object becomes blessed into the Shop package. And let your shop extension file look like this:

    use vars qw(@ISA);
    @ISA = qw(HTML::EP);

    # Define first extension method
    ...
    # Define second extension method
    ...

Note that this allows inheritance!

Currently the extension file has to be loaded with but this will change in the future. The *isa* attribute can be omitted and defaults to `HTML::EP'. Otherwise it is a blank separated list of package names, just like @ISA is.

PERFORMANCE CONSIDERATIONS

The following are merely hints for general Perl programming, but apply in particular to EP programming:

Local variables

Use local variables instead of hash attributes, as in

    my $debug;
    if ($debug = $self->{'debug'}) {
        ...
    }
    if (!$debug) {
        ...
    }

This is approximately 10-15% faster than the equivalent

    if ($self->{'debug'}) {
        ...
    }
    if (!$self->{'debug'}) {
        ...
    }

Of course you win even more with any further use of `$debug'.

CHANGES

This section describes user visible changes against previous versions. For details and other modifications see the `ChangeLog' file that is part of the distribution.

epparse, epperl

In previous versions it was not possible to include EP variables or Perl code in attributes of EP commands, unless using a prefix `epparse-' or `epperl-', as in This is no longer the case, because the package now autodetects whether you are using such constructs. (At least it should. :-) The obvious disadvantage is an incompatibility, but the new version is much more readable and surprisingly even (much!) faster, because only hash values are modified and not hash structures.

TODO

- mod_perl support
- Create an EP server that is accessible via a small C wrapper

AUTHOR AND COPYRIGHT

This module is

    Copyright (C) 1998    Jochen Wiedmann
                          Am Eisteich 9
                          72555 Metzingen
                          Germany

                          Phone: +49 7123 14887
                          Email: joe@ispsoft.de

All rights reserved.

You may distribute this module under the terms of either the GNU General Public License or the Artistic License, as specified in the Perl README file.

SEE ALSO

the DBI(3) manpage, the CGI(3) manpage, the HTML::Parser(3) manpage