Copyright Notice
This text is copyright by InfoStrada Communications, Inc., and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in Linux Magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
Linux Magazine Column 38 (Jul 2002)
[suggested title: Template-driven file management]
I've decided recently to put the stonehenge.com website under CVS
management. With the CVS tools, I can ``check out'' a current version
of the website sources, play with it a bit, test it on a development
server, and then ``check in'' the changes for deployment on my live
server, the same way the big boys do it. Also, I can let the other
Stonehenge druids edit portions of the site as well, a task that had
formerly been only my job (along with the dozens of other
self-appointed roles I fill at Stonehenge).
Some of the Apache configuration files contain hard-coded pathnames,
which gets in the way of running a checked-out copy from some other
directory. I pondered a lot of solutions, starting by spending a good
day or two on rewriting all the config files so that they used only
names relative to the config directory. I got stuck on one directive
(for mod_proxy's cache configuration) that does not permit a relative
name.
After punting on that idea, it occurred to me to run some or all of the source files through a substitution process that could plug in the pathnames and perhaps a few changeable configuration values. Of course, I could be like the 45 other CPAN authors who wrote their own templating system, since that seems to be a rite of passage for a budding Perl hacker. However, I knew that much of my site's new design would also be processed dynamically using Andy Wardley's most excellent Template Toolkit. So, I decided on using Template at ``build time'' as well.
The Template distribution includes a ttree utility that at first
glance seemed to do what I want: take a tree of files and process them
into a target tree, updating only the files which had changed. But I
needed similar structures to process files that weren't templated, and
also files that were derived from many source files, and that was
outside ttree's design. So, I stole the important pieces of source
code of ttree to make my template processing engine, shown in
[listing one, below].
I decided to drive my templating engine from a control file, typically
read from STDIN, consisting of an output and input filename per
line, separated by whitespace. (Yeah, the first time I have a
filename that has embedded whitespace, I'll be in trouble, and I'll
have to rewrite this bit.) To create the control file, I use a
GNU-style Makefile, presented in [listing two, below]. But first,
let's focus on the templating engine.
Lines 1 through 3 start nearly every Perl program I write, enabling
compile and runtime warnings, restricting the use of barewords, soft
references, and undeclared variables, and ensuring that STDOUT is
unbuffered.
Lines 5 and 6 pull in the Template and Getopt::Long modules, found in
the CPAN.
Lines 8 through 11 process the two command-line options: a flag that provides a pre-processing hook file for Template, and an option to force processing regardless of timestamps. Because my makefile will want to have authority about processing a particular file, I'll use the force option from my makefile. As yet, I haven't used the preprocessing flag.
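To make the option handling concrete, here is a minimal sketch that parses the same two options from an in-memory argument list instead of the real command line. It assumes the preprocess option is meant to carry a filename (hence the `=s`), and the name `header.tt` is made up for illustration.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical command line: run-template --preprocess header.tt --force
my @args = ('--preprocess', 'header.tt', '--force');

GetOptionsFromArray(
    \@args,
    'preprocess=s' => \ (my $preprocess),   # "=s": the option takes a string value
    'force!'       => \ (my $force = 0),    # "!": also accepts the negated --noforce
) or die "bad options\n";

print "preprocess: $preprocess\n";
print "force: $force\n";
```

Leaving off `--force` keeps `$force` at its default of 0, and leaving off `--preprocess` keeps `$preprocess` undefined.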
Lines 13 through 19 set up the Template object, including the
particular configuration options needed for my operation. Relative
pathnames are needed to permit the Makefile to specify filenames below
the current directory (relative to the include path). The preprocess
template is given by the value associated with the option, or undef,
meaning no preprocess template. And finally, I decided to use the star
tag style in the ``build phase'' to distinguish it from
the normal Template style to be executed at page delivery time. This
permits template instructions like:
    [* IF env.ENABLE_JOKES -*]
    [% PROCESS stonehenge/sidebar/jokes %]
    [*- END *]
If the environment variable ENABLE_JOKES is set (while we're building
the site), then the directive is included to process the sidebar at
page delivery time. (The env hash is set as a Template variable:
we'll see this in a moment.)
Lines 22 to 43 form the main processing loop. To prevent duplicate
consideration of a particular templated file, line 21 defines a
%seen hash, containing the lines we've processed so far as keys.
Sometimes during my testing, I'd update a templated file, but the
template processing would fail. The next make run would again add
the template to the list of things out of date, and this template
engine would end up seeing the item twice.
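The duplicate check is the usual Perl ``seen hash'' idiom. A small standalone sketch, with made-up control lines standing in for real input:

```perl
use strict;
use warnings;

# Hypothetical control-file lines; the third repeats the first.
my @lines = (
    "out/a.conf a.conf.tmpl\n",
    "out/b.conf b.conf.tmpl\n",
    "out/a.conf a.conf.tmpl\n",
);

my %seen;
my @unique;
for my $line (@lines) {
    next if $seen{$line}++;   # 0 (false) on first sight, true on every repeat
    push @unique, $line;
}

print scalar(@unique), " unique lines\n";
```

The post-increment returns the old count, so the first occurrence passes through and any later copy of the same line is skipped.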
Line 25 extracts the output filename and the input filename. Lines 26 and 27 ensure that the input exists, and grab the stat information to use later (for the modification time and permissions and ownership).
Lines 29 to 33 allow the template engine to be a ``mini-make''. Unless
the --force option is given on the command line, the output file
has to be newer than the input file or else we'll process the file.
You could use the template engine with a static list of
source/destination pairs this way, and the engine would perform
minimal work to update the files. However, since we're letting
make determine out-of-date files, we'll be skipping this code in my
use.
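The timestamp test can be sketched on its own. The helper name `needs_update` and the file names are hypothetical; the comparison mirrors the listing, which skips a file only when the output is strictly newer than the input.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Returns 1 when $out must be (re)built from $in, 0 otherwise.
sub needs_update {
    my ($out, $in, $force) = @_;
    return 1 if $force;
    my @instat  = stat($in)  or die "can't stat $in: $!\n";
    my @outstat = stat($out) or return 1;      # no output file yet
    return $outstat[9] > $instat[9] ? 0 : 1;   # element 9 of stat() is the mtime
}

# Scratch files with hand-set timestamps.
my $dir = tempdir(CLEANUP => 1);
my ($in, $out) = ("$dir/page.tmpl", "$dir/page");
for my $file ($in, $out) {
    open my $fh, '>', $file or die "can't create $file: $!\n";
    close $fh;
}

utime 1000, 1000, $in;
utime 2000, 2000, $out;
my $when_output_newer = needs_update($out, $in, 0);

utime 3000, 3000, $in;
my $when_input_newer = needs_update($out, $in, 0);

my $when_forced = needs_update($out, "$dir/page.tmpl", 1);

print "output newer: $when_output_newer\n";
print "input newer:  $when_input_newer\n";
print "forced:       $when_forced\n";
```

Note that equal timestamps count as out of date, which errs on the side of doing the work again.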
If we make it to line 35, it's time to run the template. The call to
the process method of the Template object does the job. The middle
parameter defines the predefined variables available to the individual
templates. In this case, we're passing the environment variables
under the name env. Individual environment variable names are available
as env.PATH or env.SHELL, and so on. This is the primary means
by which the Makefile can parameterize the templates, including
overriding the values for a particular build.
If the processing fails, line 36 displays that, along with the Template error message. On success, the processing is noted in line 37.
Lines 39 to 42 copy the ownership and permissions from the source file to the destination file. Failures are noted as an advisory, although execution continues.
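As a rough sketch of that last step on a Unix-like system, the following copies mode and ownership between two scratch files. The file names are invented, and masking the mode with 07777 is an addition here (the listing passes the raw stat value).

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my ($src, $dst) = ("$dir/apachectl.tmpl", "$dir/apachectl");   # made-up names
for my $file ($src, $dst) {
    open my $fh, '>', $file or die "can't create $file: $!\n";
    close $fh;
}
chmod 0750, $src or die "can't chmod $src: $!\n";
chmod 0600, $dst or die "can't chmod $dst: $!\n";

my @instat = stat($src) or die "can't stat $src: $!\n";

# Elements 4 and 5 of stat() are the uid and gid; changing ownership to
# someone else normally needs root, so a failure is only a warning.
chown $instat[4], $instat[5], $dst
    or warn "Cannot chown @instat[4,5] $dst: $!";

# Element 2 is the mode; the mask keeps just the permission bits.
chmod $instat[2] & 07777, $dst
    or warn "Cannot chmod $dst: $!";

my $newmode = (stat $dst)[2] & 07777;
printf "%s now has mode %04o\n", $dst, $newmode;
```

Doing the chown before the chmod matters on systems that clear setuid or setgid bits when ownership changes.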
So that's the template processor. When executed, it looks for lines on standard input like:
/web/stonehenge/etc/httpd.conf etc/httpd.conf.tmpl
to process the conf file from a relative-path-named local source file.
And the httpd.conf.tmpl file contains mostly constant text, except
for things that need to vary based on the installation directory
or other local parameters, like:
    ServerName [* env.SERVERNAME *]
    Listen [* env.LISTEN_AT *]

    DocumentRoot [* env.PREFIX *]/htdocs
    PIDFile [* env.PREFIX *]/var/run/httpd.pid
    ScoreBoardFile [* env.PREFIX *]/var/run/httpd.scoreboard
    LockFile [* env.PREFIX *]/var/run/httpd.lock

    <Directory [* env.PREFIX *]/htdocs>
    ....
    </Directory>
And these are replaced en route to the httpd.conf file. Like magic.
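Here is a minimal sketch of that substitution, run on an in-memory template rather than a file. It assumes the Template Toolkit is installed from CPAN, and it passes a hand-built hash with made-up values instead of the real %ENV so the result is predictable.

```perl
use strict;
use warnings;
use Template;   # Template Toolkit, from CPAN (not a core module)

my $t = Template->new({ TAG_STYLE => 'star' })
    or die Template->error(), "\n";

# Build-time directives use [* ... *]; the [% ... %] line is left for later.
my $source = <<'END_TEMPLATE';
ServerName [* env.SERVERNAME *]
DocumentRoot [* env.PREFIX *]/htdocs
[% PROCESS stonehenge/sidebar/jokes %]
END_TEMPLATE

# Stand-in for \%ENV, with invented values.
my %fake_env = (SERVERNAME => 'dev.example.com', PREFIX => '/tmp/web');

my $output = '';
$t->process(\$source, { env => \%fake_env }, \$output)
    or die $t->error(), "\n";

print $output;
```

With the star tag style, only the `[* ... *]` directives are evaluated at build time. The `[% ... %]` directive is copied through as plain text, ready for the page-delivery pass.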
Additionally, repetitive items or conditional items can be captured as
Template blocks or macros. (I'm just starting to scratch the surface
of this now. Perhaps I'll cover that in greater depth in a future
column.)
Of course, this doesn't make sense without the Makefile depositing the right items into that control file, or making the other directories and copying the other files over. So, let's take a look at how that's done.
The trickiest part of the Makefile design was ensuring that the
template engine would get run once at the end of the pass. In BSD
Make, this can be achieved with a .END target, but GNU Make didn't
have such a feature. With the help of fellow Perl hacker Uri Guttman,
we came up with a weird hack that's rather cool once you get your head
around it.
Nearly the entire Makefile is split into two pieces. On a normal
invocation, only the first few lines (from line 8 to line 12) are
executed, recursively invoking make on the same Makefile looking to
build the same targets, but adding the FINAL target as well. And
to note that we're the recursed version, an additional variable is
added (RECURSED). And that then skips over lines 8 through 12
(thanks to the conditional on line 7), and we run the rest of the
file.
There's almost certainly a more clever way of doing this, but I couldn't find it after half a day of searching the net and asking my friends, until Uri stumbled through something close to this solution.
Lines 17 to 21 define the configuration parameters used by the templates, and by the Makefile itself.
PREFIX is the execution top-level directory. INSTALLPREFIX is
usually the same as PREFIX, except when you want to tar up the
files for an RPM or other distribution bundler, or want to ``stage in''
the live data. For example, if your live site is running off
/web/stonehenge, you can build and install a new website from
scratch with a minimum of downtime with:
    $ make INSTALLPREFIX=/web/NEW
    $ /web/stonehenge/sbin/apachectl stop
    $ mv /web/stonehenge /web/stonehenge.OLD
    $ mv /web/NEW /web/stonehenge
    $ /web/stonehenge/sbin/apachectl start
By making INSTALLPREFIX separate from PREFIX, we can stage the
files into that temporary directory for the fast switch.
APACHE_PREFIX defines the prefix that Apache was built with.
The etc and sbin directories would be immediately below this, for
example.
SERVERNAME and LISTEN_AT define the server information. Again,
the point of this configuration setup is to be able to run a
development version of the server at a different location, perhaps
even on a different box, so these must be configurable.
Lines 25 and 26 define variables that should not be overridden from the
command line. In particular, note the use of I, which permits
$I to be written in rules easily (make needs no parentheses around a
single-character variable name).
Line 30 defines a macro that crawls through a given subdirectory, looking
for any files (that aren't Emacs editor backups), and returns their
equivalent names in the INSTALLPREFIX hierarchy. An optional
.tmpl suffix is also automatically removed. This macro is used in the
various rules to avoid explicitly naming all the files in the
directories.
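For readers more at home in Perl than in make functions, here is a sketch of what the macro computes, written with File::Find. The function name `installs_from_subdir` and the scratch files are hypothetical, and the result is sorted only to make the output stable (the make version keeps whatever order find reports).

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Find;
use File::Temp qw(tempdir);

# List the plain files under $subdir, drop editor backups, strip a
# trailing .tmpl, and prefix each name with the install directory.
sub installs_from_subdir {
    my ($install_prefix, $subdir) = @_;
    my @targets;
    find({
        no_chdir => 1,                 # keep $_ as the path relative to the cwd
        wanted   => sub {
            return unless -f $_;       # like find's "-type f"
            return if /~\z/;           # like find's "! -name '*~'"
            (my $name = $_) =~ s/\.tmpl\z//;
            push @targets, "$install_prefix/$name";
        },
    }, $subdir);
    my @sorted = sort @targets;
    return @sorted;
}

# Scratch source tree with made-up file names.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir $dir or die "can't chdir to $dir: $!\n";
mkdir 'etc' or die "can't mkdir etc: $!\n";
for my $file ('etc/httpd.conf.tmpl', 'etc/mime.types', 'etc/httpd.conf.tmpl~') {
    open my $fh, '>', $file or die "can't create $file: $!\n";
    close $fh;
}

my @targets = installs_from_subdir('/web/NEW', 'etc');
print "$_\n" for @targets;

chdir $start or die "can't chdir back to $start: $!\n";
```

The backup file is ignored, and both the templated and the plain file end up as targets under the install prefix.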
Lines 32 to 52 define the rules for building each of the
subdirectories, including a group install target to build just a
portion of the data. There's a lot of repetition and repetition here,
but I couldn't find a way to reduce that. Note that the pattern is
similar: the top-level install target depends on a particular
install-foo target. A similarly-named variable is loaded up by
calling the macro defined earlier, and then the install-foo target
is made to depend on those filenames.
But where do the rules get selected to either copy those files or run
them through the templating engine? Ahh, that's the magic down in
lines 54 to 62. If a file wanted under the INSTALLPREFIX directory
has a corresponding file relative to the local current directory, then
we simply copy it over (if it's out of date), after first making its
parent directory if needed.
However, if the file wanted in the INSTALLPREFIX directory has a
corresponding .tmpl file, then we note that for the template engine
to process, by writing the destination and source into the
run-template.in file. Note that all template files are also
dependent on the templater itself, and the GNUmakefile. That way,
edits to either of these files cause the templates to be re-run.
So the typical install step copies a bunch of text files directly
from the source directories to the destination, and notes the template
files that also have to be processed. But where do the templates get
processed? Recall that the recursive invocation also wants FINAL
to be built, after building the designated targets. Ahh, lines 64 to
72 define the rules for that. First, FINAL depends on
run-template.out, so we need to bring that up to date. It's up to
date only when newer than run-template.in. But if it doesn't
exist, or it's not newer, we'll run the commands in lines 69 to 71.
The templater processes the control file (scribbled into by line 62),
then the control file is emptied out (line 70), and the output file is
then touched (in line 71) to make it newer than the input. If for
some reason, the input file never got created, an empty one is created
in line 72. I'm not sure if I still need this step: but it certainly
didn't hurt to leave it in.
And that's it. In these two core structures, you've got the means to build a hierarchy of files, some of which are run through a templating engine, with a minimal amount of copying around as you edit stuff. And that's the guts of my new web-site building engine. Until next time, enjoy!
Listings
=0=     #################### LISTING ONE ####################
=1=     #!/usr/bin/perl -w
=2=     use strict;
=3=     $|++;
=4=
=5=     use Template;
=6=     use Getopt::Long;
=7=
=8=     GetOptions(
=9=       'preprocess=s' => \ (my $preprocess),
=10=      'force!' => \ (my $force = 0),
=11=    ) or die "see code for usage\n";
=12=
=13=    my $t = Template->new
=14=      ({
=15=        RELATIVE => 1,
=16=        PRE_PROCESS => $preprocess,
=17=        INCLUDE_PATH => ['.'],
=18=        TAG_STYLE => 'star',
=19=       });
=20=
=21=    my %seen;
=22=    while (<>) {
=23=      next if $seen{$_}++;
=24=
=25=      my($outname, $inname) = split;
=26=      my @instat = stat($inname) or
=27=        print(" - $inname (can't stat)\n"), next;
=28=
=29=      unless ($force) {
=30=        my @outstat = stat($outname);
=31=        @outstat and $outstat[9] > $instat[9] and
=32=          print(" - $inname (not newer)\n"), next;
=33=      }
=34=
=35=      $t->process($inname, {env => \%ENV}, $outname) or
=36=        print(" ! ", $t->error(), "\n"), next;
=37=      print(" + $inname => $outname\n");
=38=
=39=      chown $instat[4], $instat[5], $outname
=40=        or warn "Cannot chown @instat[4,5] $outname: $!";
=41=      chmod $instat[2], $outname
=42=        or warn "Cannot chmod $instat[2] $outname: $!";
=43=    }

=0=     #################### LISTING TWO ####################
=1=     ### mandatory
=2=     SHELL = /bin/sh
=3=     .SUFFIXES:
=4=
=5=     ### ensure FINAL
=6=
=7=     ifndef RECURSED
=8=     MAKECMDGOALS ?= install
=9=
=10=    $(MAKECMDGOALS):
=11=            @$(MAKE) --no-print-directory RECURSED=1 $(MAKECMDGOALS) FINAL
=12=
=13=    else # endif is at end of file
=14=
=15=    ### external configuration variables (from env or make-line)
=16=
=17=    export PREFIX ?= /web/stonehenge
=18=    export INSTALLPREFIX ?= $(PREFIX)
=19=    export APACHE_PREFIX ?= /opt/apache/1.3.23
=20=    export SERVERNAME ?= www.stonehenge.com
=21=    export LISTEN_AT ?= www.stonehenge.com:80
=22=
=23=    ### internal variables (should require no change)
=24=
=25=    I = $(INSTALLPREFIX)
=26=    TEMPLATER = ./run-template
=27=
=28=    ### macros
=29=
=30=    get_installs_from_subdir = $(patsubst %,$I/%,$(patsubst %.tmpl,%,$(shell find $1 -type f ! -name '*~' -print)))
=31=
=32=    ### subdirectories
=33=
=34=    ## etc
=35=    install: install-etc
=36=    install_etc_files := $(call get_installs_from_subdir, etc)
=37=    install-etc: $(install_etc_files)
=38=
=39=    ## htdocs
=40=    install: install-htdocs
=41=    install_htdocs_files := $(call get_installs_from_subdir, htdocs)
=42=    install-htdocs: $(install_htdocs_files)
=43=
=44=    ## sbin
=45=    install: install-sbin
=46=    install_sbin_files := $(call get_installs_from_subdir, sbin)
=47=    install-sbin: $(install_sbin_files)
=48=
=49=    ## var
=50=    install: install-var
=51=    install_var_files := $(call get_installs_from_subdir, var)
=52=    install-var: $(install_var_files)
=53=
=54=    ### pattern rules
=55=
=56=    $I/%: %
=57=            mkdir -p $(dir $@)
=58=            cp $< $@
=59=
=60=    $I/%: %.tmpl $(TEMPLATER) GNUmakefile
=61=            @echo want: $< '=>' $@
=62=            @echo $@ $< >>$(TEMPLATER).in
=63=
=64=    ### handle FINAL step
=65=
=66=    FINAL: $(TEMPLATER).out
=67=
=68=    $(TEMPLATER).out: $(TEMPLATER).in
=69=            $(TEMPLATER) --force $<
=70=            -@cp /dev/null $<
=71=            -@touch $@
=72=    $(TEMPLATER).in:; touch $@
=73=
=74=    endif # matches ifndef/else at top of file