Copyright Notice

This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in SysAdmin/PerformanceComputing/UnixReview magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.

Please read all the information in the table of contents before using this article.

Unix Review Column 59 (Jul 2005)

[suggested title: ``Nicer configuration files'']

I see a lot of configuration file modules in the CPAN. Perhaps too many. Perhaps this is one of those tasks that (like so many others) attracts every new junior Perl programmer into the ``I can do better than those'' mindset.

I can't claim to have used every one, and perhaps I haven't even looked at every one. However, let me spend a bit of time talking about one that I've used rather heavily on a recent project, and am definitely liking for future projects.

One of my favorite clients is http://geekcruises.com, mostly because the typical work location for Geekcruises is aboard a cruise ship travelling between the Caribbean islands. I recently completed a major overhaul of their administrative interface to the ``booking engine'': the portion of the website that makes money for them by allowing random geeks to sign up for future cruises.

The core of my overhaul used the emerging standard CGI::Prototype framework, along with Template Toolkit and Class::DBI for the data model. One of the problems I face when programming is how to write each important fact in only one location, because cut-and-paste is a maintenance nightmare. I found myself wanting to record meta-information about each of the database fields (there are some 500 columns in 50 tables for this application) in a common location.

One interesting problem is that I wanted to access the information from both Perl code and Template Toolkit code, because a lot of the meta-information bridges the ``model'' (the precise column of that table) to the ``view'' (how the user sees the data, and how the inputs are interpreted). I decided rather early on that I wanted to call a method named Config against a given Class::DBI table class, or a specific row instance, and that this should return a hashref of all the meta-information I would know about that table. In particular, the hashref would contain a key for each column of interest, and each value would be another hashref of attributes about that column.

Thus, for a given row in $row, I could obtain (in Perl) the meta-information about one of its columns with:

  $info = $row->Config->{$column};

Or, in Template Toolkit code:

  info = row.Config.$column;

After deciding how I wanted to see the data, I then considered how to store the information. I already had a class definition for nearly all of the tables of interest. I thought about just adding:

  my $config = {
    column1 => { width => 10, field => 'textfield' },
    column2 => { width => 50, height => 5, field => 'textarea' },
  };
  sub Config {
    my $self = shift;
    return $config;  # whole table; callers index by column name
  }

But then I started to realize that there'd be a lot of common values among these column definitions. Common values, with some overrides. And then I stumbled across a reason for the inheritance feature of Config::Scoped, which I had seen in the CPAN a few months before.

The Config::Scoped module initially looks like almost every other configuration file parser. You give basic stuff, and get basic values back:

  param1 = foo;
  param2 = [ 1, 2, 3];
  param3 = { a => hash };

which when parsed, returns:

  '_GLOBAL' => {
    'param1' => 'foo',
    'param2' => [ '1', '2', '3' ],
    'param3' => { 'a' => 'hash' },
  }

Note that Perl's arrays and hashes are supported directly, and that barewords don't need to be quoted. I was already liking this module at this point, mostly because I hate quoting obvious things. And more complex data structures are trivial, with:

  scalar = bar;
  list = [ bar, baz ];
  hash = { bar = baz, goofed = spoofed };
  lol = [ [ foo, bar, baz ], [ 1, 2 ], [ red, green, blue ] ];
  hol = { color = [ red, green, blue ], goof = [ foo, bar, baz ] };
  loh = [ { bar = baz }, { goof = spoof } ];

But the module goes beyond this, in permitting declarations:

  column1 {
    width = 10;
    field = textfield;
  }
  column2 {
    width = 50;
    height = 5;
    field = textarea;
  }

And there are my column hashes. Each declaration creates a nested hash, so column1 has a hashref with width as a key and 10 as a value. Nice.

But then come the blocks, and here's where it gets fun. Suppose I had five textfields, and wanted them all to have a width of 10:

  {
    field = textfield;
    width = 10;
    column1 {}
    column2 {}
    column3 {}
    column4 {}
    column5 { readonly = 1 }
  }

Here, columns 1 through 4 all end up inheriting everything visible at this scope level, namely the two keys field and width. But column5 gets all that plus a readonly key. Blocks can also be nested.

With the pragma %warnings parameter off in effect, so that the parser no longer objects when a parameter is redefined, I can also define defaults with overrides:

  {
    field = textfield;
    width = 10;
    readonly = 0;
    column1 { }
    column2 { }
    column3 { width = 20 }
    column4 { readonly = 1 }
    column5 { width = 15; readonly = 1 }
  }
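
To picture the hashes that come out of a block like this, here is a plain-Perl sketch that builds an equivalent structure by merging the block-level defaults with each column's overrides. It only imitates the result; it makes no claim about how Config::Scoped works internally, and the variable names %defaults and %overrides are made up for the illustration:

```perl
use strict;
use warnings;

# Block-level defaults, as in the configuration block above.
my %defaults = (field => 'textfield', width => 10, readonly => 0);

# Per-column overrides (hypothetical stand-in for the declarations).
my %overrides = (
    column1 => {},
    column2 => {},
    column3 => { width => 20 },
    column4 => { readonly => 1 },
    column5 => { width => 15, readonly => 1 },
);

# In a list, later key/value pairs win, so overrides replace defaults.
my %config;
for my $col (keys %overrides) {
    $config{$col} = { %defaults, %{ $overrides{$col} } };
}

for my $col (sort keys %config) {
    printf "%s: field=%s width=%s readonly=%s\n",
        $col, @{ $config{$col} }{qw(field width readonly)};
}
```

Every column ends up with all three keys, and only the values named in a declaration differ from the defaults.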

Now I was seeing how this was going to save me some time. In addition to barewords, any Perl-style quoted string is also permitted, allowing me to record things like column headers and footnotes:

  {
    field = textfield;
    width = 10;
    f_name { head = "First name" }
    m_name {
      head = "Middle initial";
      foot = "Single letter please";
      width = 1;
    }
    l_name { head = "Last name" }
  }

So in my Template Toolkit code, I can generate the appropriate headers and footnotes:

  [% columns = ["f_name" "m_name" "l_name"] %]
  [% FOR row IN rows %]
  [%# if this is the first row, label the columns %]
  [% IF loop.first %]
  <table>
  <tr>
  [% FOR col IN columns %]
  <th>[% row.Config.$col.head %]</th>
  [% END %]
  </tr>
  [% END %]
  <tr>
  [%# dump the data values %]
  [% FOR col IN columns %]
  <td>[% row.$col %]</td>
  [% END %]
  </tr>
  [%# if this is the last row, close the table %]
  [% IF loop.last %]
  </table>
  [% END %]
  [% END # FOR row %]

Now this was really starting to make sense. I could parameterize a lot of the generic templates, driving them from the data in the nice config tables.

But wait... there's more. With the %include pragma, I can also include common values:

  %include common.cfg;
  f_name { head = "First name" }
  l_name { head = "Last name" }

where common.cfg contains:

  field = textfield; # by default
  width = 10; # by default

And then I can add additional things there, like special getters and setters for model-to-view-to-model mapping:

  get = get; # default
  set = set; # default

And meanwhile, in a nearby configuration file for credit card numbers:

  %include common.cfg;
  expiration_date {
    head = "Expiration date";
    get = get_mmyy;
    set = set_mmyy;
  }

Then I merely had to ask in the Perl code ``how do I get this'', and call the right method:

  my $getter = $row->Config->{$column}->{get};
  my $value = $row->$getter($column);

For most columns, this defaults to Class::DBI's get method, but I could just override the ones that required special care. This even works in Template Toolkit code:

  getter = row.Config.$column.get;
  value = row.$getter(column);

Wow. Thank you, Template Toolkit, for unifying hashes and method calls.
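
The Perl side of that lookup can be tried without a database. In this minimal sketch, My::Row is a hypothetical stand-in for a Class::DBI row class, the config hash is written by hand rather than parsed from a file, and the stored ``YYYY-MM'' date format is purely an assumption for the demonstration:

```perl
use strict;
use warnings;

package My::Row;   # hypothetical stand-in for a Class::DBI row class

sub new { my ($class, %data) = @_; return bless {%data}, $class }

# Generic getter: return the stored value unchanged.
sub get { my ($self, $column) = @_; return $self->{$column} }

# Special getter: render a stored "YYYY-MM" value as "MM/YY".
sub get_mmyy {
    my ($self, $column) = @_;
    my ($year, $month) = $self->{$column} =~ /^(\d{4})-(\d{2})$/
        or return;
    return sprintf '%s/%s', $month, substr($year, 2);
}

# Hand-written here; in the article this comes from the parsed .cfg file.
my $config = {
    f_name          => { get => 'get' },
    expiration_date => { get => 'get_mmyy' },
};
sub Config { return $config }

package main;

my $row = My::Row->new(f_name => 'Fred', expiration_date => '2007-03');
for my $column (qw(f_name expiration_date)) {
    # The method name is just a string taken from the configuration.
    my $getter = $row->Config->{$column}{get};
    print "$column: ", $row->$getter($column), "\n";
}
```

The calling loop never mentions get_mmyy; only the configuration knows which columns need special treatment.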

I haven't even gotten to the part where you can define macros and embedded Perl code in your configuration files, because I haven't used those features enough to say something sensible about them. However, let me leave you with this piece of code that I created to automatically parse the correct configuration file for a given database table class:

  sub Config {
    my $self = shift;
    my $class = ref $self || $self;
    ## compute filename relative to me, based on my packagename
    my $p = __PACKAGE__;
    (my $s = $class) =~ s/^\Q$p\E:://
      or die "$p is not prefix of $class!";
    $p = __FILE__;
    $p =~ s/\.pm$// or die "$p doesn't end with .pm!";
    require File::Spec;
    my $file = File::Spec->catfile($p, split '::', $s) . ".cfg";
    my $config = do {
      if (-e $file) {
        require Config::Scoped;
        Config::Scoped->new
            (file => $file,
             warnings => {qw(permissions off parameter off)},
            )->parse;
      } else {
        +{}  # no config file: empty hashref (the + rules out a bare block)
      }
    };
    {
      no strict 'refs';
      *{$class . '::Config'} = sub { $config };
    }
    return $config;
  }

I place this in my base class for all of my Class::DBI table classes. If I then call the Config method of any derived class, it initially winds its way back up into this class, which figures out a cfg filename for the derived class, located alongside the derived class's .pm file (in the same directory). The file is then parsed with Config::Scoped to create the correct hash. However, to avoid repeating this work within a single program invocation, a new method is installed in the derived class to return the constant hash.
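
The install-a-method trick can be watched in isolation. This sketch drops the file handling and keeps only the caching; the class names My::Base and My::Base::Customer and the $loads counter are invented for the demonstration:

```perl
use strict;
use warnings;

package My::Base;   # hypothetical base class

my $loads = 0;      # counts how often the expensive path runs

sub Config {
    my $self  = shift;
    my $class = ref $self || $self;
    $loads++;
    my $config = { class => $class };   # stand-in for parsing a .cfg file
    {
        no strict 'refs';
        # Install a cheap accessor directly in the calling class, so that
        # later calls never reach this method again.
        *{ $class . '::Config' } = sub { $config };
    }
    return $config;
}
sub loads { return $loads }

package My::Base::Customer;   # hypothetical derived class
our @ISA = ('My::Base');

package main;

my $first  = My::Base::Customer->Config;   # runs My::Base::Config
my $second = My::Base::Customer->Config;   # runs the installed closure
print "same hashref\n" if $first == $second;
print "loads: ", My::Base->loads, "\n";
```

The second call finds Config in My::Base::Customer itself, so the base-class method runs only once per derived class.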

I'm probably releasing this simple piece of code to the CPAN soon, perhaps under the name Class::DBI::Plugin::ConfigScoped, or something like that. It'll probably be there by the time you read this.

I hope I've demonstrated Config::Scoped enough so that you'll finish reading the documentation for it. Until next time, enjoy!


Randal L. Schwartz is a renowned expert on the Perl programming language (the lifeblood of the Internet), having contributed to a dozen top-selling books on the subject, and over 200 magazine articles. Schwartz runs a Perl training and consulting company (Stonehenge Consulting Services, Inc of Portland, Oregon), and is a highly sought-after speaker for his masterful stage combination of technical skill, comedic timing, and crowd rapport. And he's a pretty good Karaoke singer, winning contests regularly.

Schwartz can be reached for comment at merlyn@stonehenge.com or +1 503 777-0095, and welcomes questions on Perl and other related topics.