Copyright Notice

This text is copyright by InfoStrada Communications, Inc., and is used with their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in Linux Magazine magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.

Please read all the information in the table of contents before using this article.
Download this listing!

Linux Magazine Column 67 (Feb 2005)

[suggested title: ``Creating Class::DBI relationships automatically'']

I've recently been contracted by Geekcruises (http://www.geekcruises.com) to update their online booking and administration forms. They're using a simple CGI architecture, with a lot of hand-rolled SQL to talk to their PostgreSQL database.

My first design decision was to adopt a more generic flexible web abstraction, creating CGI::Prototype as a result. They've allowed me to upload that to the CPAN, and I'll be writing more about that in the future. For the ``model'' part of the abstraction, I've settled in on the wildly popular Class::DBI framework. And that brings us to today's problem and solution.

The existing database consists of 50 tables, with typical foreign key mappings. For example, a person_cruise record links a person with a cruise. In Class::DBI, setting up these foreign_key relationships so that they could be easily accessed looked simple enough: I'd just have to set up a bunch of statements like:

  GC::DB::PersonCruise->has_a(person_id => 'GC::DB::Person');
  GC::DB::Person->has_many(personcruises => 'GC::DB::PersonCruises');
  GC::DB::PersonCruise->has_a(cruise_id => 'GC::DB::Cruise');
  GC::DB::Cruise->has_many(personcruises => 'GC::DB::PersonCruises');

And that was fine, for the first 10 or 15 records. And then it started to get boring. And then tedious. And then error-prone. I started to think ``hey, wait a minute! PostgreSQL knows these relationships! Why am I reentering that information?'' And then ``Oh, someone else must've done this already!''

When I started looking in the CPAN for some reusable solutions, I quickly found Class::DBI::Loader. Joy! This module solves the problem of querying the database for all the records and their columns, including determining the auto increment sequences. That's good, because that was already starting to be frustrating as well.

But Class::DBI::Loader did nothing about the foreign keys. So, as I kept looking, I stumbled upon Class::DBI::Loader::Relationship, which started to look promising (I could use English instead of Perl), but alas, even the example has a bug in it, and I just couldn't get it to work like I wanted. Also, again, why am I specifying all these foreign keys when the database itself already knows?

After staring at the DBD::Pg manpage for about a half hour, and trying out little snippets in the debugger, I was finally able to coax the database into telling me about its foreign keys. Then I took some of the ideas from Class::DBI::Loader::Relationship to turn those into has_a and has_many calls (including the idea of automatically making plurals). And after a bit of hacking, I had my automatic Class::DBI generator.

However, I ran into one other snag. The database had used PostgreSQL's table inheritance for the payment types. A specific credit card, money order, or check payment inherited from a generic payment record, so that an SQL query could ask for ``any payment'' versus ``any credit card payment'' easily. Since these child record types didn't have an explicit foreign-key relationship with the linking records, I had to add a means of faking it in my code.

The result of all of this is in [listing one below]. Lines 1 and 94 define this as a package GC::DB, which I will require or use in my application.

Lines 4 through 90 create a lexical scope so that I can define temporary variables during the setup that won't apply to any of the later code.

Lines 6 and 7 set up the database connection, user, and password for this database. They can be overridden for testing purposes by setting @GC::DB::CONNECTION before bringing in this file. More on that in a future column as well.

Lines 9 and 10 bring in two CPAN modules. I'm using require rather than use because I don't want any imports to pollute GC::DB class.

Line 12 turns on debug tracing mode. This was helpful when I was developing, to see what relationships were actually being generated.

Lines 14 through 18 use Class::DBI::Loader to create the 50 Class::DBI classes reflecting the 50 records of the database. The namespace parameter gives GC::DB:: prefixes to these new classes. This call also sets up all the sequence calls so that key generation works properly.

Lines 19 and 20 extract out the classes and tables just generated. I'm presuming the two lists correlate, but have no assurance of that from the docs. In fact, the docs don't even mention these two methods.

Line 22 gets me a database handle by calling db_Main on the first class. Any class would have done, as they all share the same database.

Lines 26 to 29 add our current class to the @ISA list for each of the created classes. I want to be able to define methods that all of my created classes understand, and this was the fastest way to do that.

Lines 33 and 34 set up the %class_of hash, so that I can go from a table name to a class name. Again, I'm presuming that the two lists are correlated properly.

Line 36 patches up a plural that didn't come out quite right for me. The class named GC::DB::PaidTo was getting a linking method called paidtoes, which was humorous but expectedly hard to remember. So, I just whacked the pluralizer on the head with an additional fact, and it's all good.

Lines 38 to 60 define a ``lexical subroutine'' as a coderef stored in a lexical variable. Again, I'm trying hard not to pollute the GC::DB namespace. This subroutine takes a table name, a linking column in that table, and the other table to which the column links, and turns those into the right has_a and has_many calls.

Lines 41 and 42 use the class name hash to get the corresponding class names for the two table names.

Lines 44 through 48 provide a debugging trace of the has_a call that we're about to execute in line 49. This creates our forward link (many to one).

Lines 51 and 52 determine the plural reverse link by first extracting the ``basename'' of the class, then lowercasing that, then pluralizing that.

Lines 54 through 58 provide a debugging trace of the has_many call that we're about to execute in line 59. This creates our reverse link (one to many).

And now that we have that subroutine to do the dirty work, we just need to figure out how to call it!

Lines 64 through 68 define a hash to map a table name into all of its child tables, which apparently we cannot determine automatically from the DBI layer. Oh well.

Lines 70 through 89 loop through all the tables we've seen. Line 71 traces the particular table being examined.

Line 72 uses the DBI abstraction to get the foreign keys for a given table (indicated as the sixth positional parameter). The return value from foreign_key_info is a statement handle that can be used with the normal DBI methods to extract the data. If there are no foreign keys in this table, $sth is undefined, so we need to skip the rest of the loop.

Lines 73 through 87 loop through the results from getting the details of the foreign keys as an array of hashrefs. Each $res is a hashref for a particular foreign key mapping.

Lines 75 and 76 extract the linking column and the referenced table from the result.

Lines 78 and 79 deal with the PostgreSQL inheritance by possibly replacing a single table name with all of its related tables. Most of the time, @tables will consist of only $table, but when $table was payment, we'll instead have the entire list.

Lines 81 through 85 execute the $has_a_many subroutine for each combination of source table, foreign table, and column that we've now noted.

Line 92 is where I will be adding other things for each class. For example, if I need to add a method to the GC::DB::PersonCruise record, I'll merely add:

  {
    package GC::DB::PersonCruise;
    sub my_extra_method {
      ...
    }
  }

to the mix, keeping all of my database model code in this one file.

The lesson I'm reminded here is that when things seem tedious, it's time to get the computer to do more of the work. And I'm happy that I have a language like Perl that can generate and execute code on the fly, including late-binding the method calls to the objects and classes created at run-time. Try that in Java. Not gonna happen. Until next time, enjoy!

Listings

        =1=     package GC::DB;
        =2=     use strict;
        =3=     
        =4=     BEGIN {
        =5=     
        =6=       our @CONNECTION;
        =7=       @CONNECTION = qw(dbi:Pg:dbname=gcdb Luser Guessnot) unless @CONNECTION;
        =8=     
        =9=       require Class::DBI::Loader;
        =10=      require Lingua::EN::Inflect;
        =11=    
        =12=      my $DEBUG = 0;
        =13=    
        =14=      my $l = Class::DBI::Loader->new(dsn => $CONNECTION[0],
        =15=                                      user => $CONNECTION[1],
        =16=                                      password => $CONNECTION[2],
        =17=                                      namespace => __PACKAGE__,
        =18=                                     );
        =19=      my @classes = $l->classes;
        =20=      my @tables = $l->tables;
        =21=    
        =22=      my $dbh = $classes[0]->db_Main;
        =23=    
        =24=      ## add mixin of us
        =25=    
        =26=      for my $class (@classes) {
        =27=        no strict 'refs';
        =28=        push @{$class . "::ISA"}, __PACKAGE__;
        =29=      }
        =30=    
        =31=      ## set up the has_a/has_many from the foreign keys
        =32=    
        =33=      my %class_of;
        =34=      @class_of{@tables} = @classes;
        =35=    
        =36=      Lingua::EN::Inflect::def_noun('paidto','paidtos'); # PL() gets this wrong
        =37=    
        =38=      my $has_a_many = sub {        # no lexical subs yet!
        =39=        my ($table, $column, $other) = @_;
        =40=    
        =41=        my $table_class = $class_of{$table};
        =42=        my $other_class = $class_of{$other};
        =43=    
        =44=        warn sprintf("%s->has_a(%s => %s)\n",
        =45=                     $table_class,
        =46=                     $column,
        =47=                     $other_class,
        =48=                    ) if $DEBUG;
        =49=        $table_class->has_a($column => $other_class);
        =50=    
        =51=        my ($table_class_base) = $table_class =~ /.*::(.*)/ or die;
        =52=        my $plural = Lingua::EN::Inflect::PL(lc $table_class_base);
        =53=    
        =54=        warn sprintf("%s->has_many(%s => %s)\n",
        =55=                     $other_class,
        =56=                     $plural,
        =57=                     $table_class,
        =58=                    ) if $DEBUG;
        =59=        $other_class->has_many($plural => $table_class);
        =60=      };
        =61=    
        =62=      ## deal with Pg inheritance
        =63=    
        =64=      my %inherits =
        =65=        (
        =66=         payment => [qw(payment payment_cc payment_check
        =67=                        payment_money_order payment_wire)],
        =68=        );
        =69=    
        =70=      for my $table (@tables) {
        =71=        warn "$table:\n" if $DEBUG;
        =72=        if (my $sth = $dbh->foreign_key_info('','','','','',$table)) {
        =73=          for my $res (@{$sth->fetchall_arrayref({})}) {
        =74=    
        =75=            my $column = $res->{FK_COLUMN_NAME};
        =76=            my $other = $res->{UK_TABLE_NAME};
        =77=    
        =78=            my @tables = @{$inherits{$table} || [$table]};
        =79=            my @others = @{$inherits{$other} || [$other]};
        =80=    
        =81=            for my $table (@tables) {
        =82=              for my $other (@others) {
        =83=                $has_a_many->($table, $column, $other);
        =84=              }
        =85=            }
        =86=    
        =87=          }
        =88=        }
        =89=      }
        =90=    }
        =91=    
        =92=    ## other GC::DB::* things go here
        =93=    
        =94=    1;

Randal L. Schwartz is a renowned expert on the Perl programming language (the lifeblood of the Internet), having contributed to a dozen top-selling books on the subject, and over 200 magazine articles. Schwartz runs a Perl training and consulting company (Stonehenge Consulting Services, Inc of Portland, Oregon), and is a highly sought-after speaker for his masterful stage combination of technical skill, comedic timing, and crowd rapport. And he's a pretty good Karaoke singer, winning contests regularly.

Schwartz can be reached for comment at merlyn@stonehenge.com or +1 503 777-0095, and welcomes questions on Perl and other related topics.