Copyright Notice
This text is copyright by InfoStrada Communications, Inc., and is used with their permission. Further distribution or use is not permitted.This text has appeared in an edited form in Linux Magazine magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
![]() |
Download this listing! | ![]() |
![]() |
![]() |
Linux Magazine Column 01 (Jun 1999)
[suggested title: Using things from the CPAN]
The Perl community is one of the most well-established demonstrations of the Open Software movement. As a result, many people that have benefitted from Perl's opennness have in turn contributed libraries and scripts back to the public for others to use. The collective contribution for the Perl community has been organized into the Comprehensive Perl Archive Network, known more commonly as the CPAN.
The CPAN is not a single machine, but actually over 100 machines (at
last count) holding the same 750 megabytes (at last count) of cool
stuff, available for your access via anonymous FTP or in a few cases
HTTP. They all mirror off the master site in Finland at various
intervals, usually not exceeding 4 to 8 hours, so it's usally safe to
use the nearest CPAN archive for you. If you're in the US, visit
http://www.cpan.org
. If that site is down (which it occasionally
is), or you're somewhere else in the world, use the URL of
http://www.perl.com/CPAN/
, which will hopefully send you to an
up-and-running site that is close to you.
There, you'll find all sorts of useful things, including the latest release of Perl for Unix and other operating systems. But in particular, you'll find hundreds of pre-written and tested modules that handle many common tasks.
Let's look at two of those modules, and how we might use them in an
interesting way together. The Net::IRC
module provides support for
Perl to handle the Internet Relay Chat protocol. With this module,
you can write a bot -- a Perl program that connects to an IRC
server as if it was an IRC client, and then interacts with that
server in programmatic ways.
And, then there's one of my favorite interesting modules, Chatbot::Eliza
,
which describes itself as:
This module implements the classic Eliza algorithm. The original Eliza program was written by Joseph Weizenbaum and described in the Communications of the ACM in 1966. Eliza is a mock Rogerian psychotherapist. It prompts for user input, and uses a simple transformation algorithm to change user input into a follow-up question. The program is designed to give the appearance of understanding.
This program is a faithful implementation of the program described by Weizenbaum. It uses a simplified script language (devised by Charles Hayden). The content of the script is the same as Weizenbaum's.
So, a few days ago, I wondered, ``what would happen if I took the
Net::IRC
module and hooked it into Chatbot::Eliza
, to make a bot
that acts like a doctor?''. Well, it was easier than I thought, and a
good example of how to reuse existing code from the CPAN.
Installing these modules are trivial, because there's a standard module
that installs other modules directly, called CPAN.pm
. To install
the two modules needed in this program, just do the following:
$ perl -MCPAN -eshell cpan> install Net::IRC Chatbot::Eliza [cpan installation stuff omitted] cpan> quit $
And that's all there is to it! This invocation will fetch the module
source from the nearest CPAN archive, and unpack, configure, make,
test, and install them. You have to be the same user as the user that
initially installed Perl; if you're not, invoke perldoc CPAN
to get
further instructions.
So, let's take a look at this ``eliza-bot'' in [listing one, below].
Line 1 starts the program, giving the path to your installed Perl
binary (here, /usr/bin/perl
for me), and turning on warnings (a
good idea during development, but should always be turned of in
production code).
Line 2 enables the three common compiler restrictions. These
restrictions disable the use of soft references (always a good idea),
disable barewords being treated like strings (keeps random typos from
being ignored), and requires all variables to be either explict
package variables or predeclared with my
to make them lexical. As
all three of these restrictions make sense for programs longer than
say, 10 lines, I usually enable them all like I've done here.
Line 3 increments the value of $|
. If you do this at the beginning
of the program, the STDOUT
filehandle becomes unbuffered instead of
buffered. Here, that's a good thing, because we want to see the
result of each print
operation as it's being generated.
Lines 5 and 6 pull in the Net::IRC
and <Chatbot::Eliza> modules, as
described earlier. These modules must be locatable in the directories
specified in the compile-time @INC
value. If you need to add
directories to @INC
, see the documentation for the lib
module
(by invoking perldoc lib
).
Lines 8 and 9 define two configuration constants for this program,
each with a true/false value. The $IRC_debug
value is passed to
the Net::IRC
module as a debug selection. If false, Net::IRC
stays relatively quiet. But if it's a true value, we get a fairly
detailed tracing of all the steps and callbacks. I set this to 1
while I was figuring out which callbacks would be triggered at what
times with what parameters. But it's completely too noisy during the
real operation, so I've set it to 0 here. And $TRACE
just selects
whether summary messages of everything the bot is doing gets sent to
standard output. These are pretty small, and can probably remain
enabled during actual use, as I've done here.
Line 11 defines a new Net::IRC
object, assigns it to $irc
, and
sets the debugging value for this object according to the value of
$IRC_debug
, as defined above. This object represents the
supervisor of all connections to various IRC servers for this
particular program.
Lines 12 through 15 create an individual IRC connection. In this case,
it's the sole connection for this bot, connecting as nickname
eliza000
to the IRC server at the fictional host name
random.place.not
. Of course, to be useful, you'll need to find an
IRC server on the appropriate IRC network that supports the use of
bots. (Most don't.) When we've come back from the newconn
method,
$conn
holds a Net::IRC::Connection
object -- our main interface
for sending and receiving information.
The primary means of scripting a conversation with Net::IRC
is to
establish a series of event handlers. Each handler is associated
with a particular event (transformed into a Net::IRC::Event
object). When the event is seen (because of something the IRC server
has sent), the corresponding handler is called, along with some
parameters (the connection object, and the event object). So, if you
want to know every time someone sends a public message to a channel
the bot has joined, you'll set up an event for public
.
Lines 16 through 44 set up the particular event handlers that we need
for this bot. I'm using a foreach
loop here (spelled f-o-r
but
pronounced foreach) to walk $_
through a list of arrayrefs.
Each arrayref in turn contains two items: either a single word or
another arrayref, followed by a coderef. The body of the foreach loop
(in line 43) passes these arrayrefs as parameters to the
add_global_handler
method on the connection, resulting in
installing a particular coderef as an event handler for one or more
named events.
So, let's look at each of the event handlers. Each one of these subroutines will be called with two parameters: the connection object itself, and the event object.
The first one, in lines 16 through 18, handle the motd
event,
triggered when we get a single line of text as part of the IRC signon
message. To get that line of text, we start with the arguments to the
subroutine, and take the second element of that $_[1]
, which is the
event object. Then, we call the args
method on that object,
yielding a list, of which the second element is the line of text.
Lots of indirection, but at least it's in there somewhere. Finally,
we dump that line out if we're tracing.
After the motd is complete, we'll get an endofmotd
event, which
is about the first time that we can start issuing commands and
messages. So, I register a handler for that in lines 19 through 23.
Line 20 grabs the connection object into a local $conn
. This is
likely to be the same value as the global $conn
, but I'm following
safe protocol here by taking it from the argument list. Line 21
prints out a trace to let me know I'm all the way in. Line 22 invokes
the equivalent of the IRC command /join #doctors_office
, letting us
start receiving public messages for that channel. If the join failed,
a later event will tell me (which I'm ignoring), so there's no error
checking to be done right here.
Lines 24 through 31 handle a nickname collision. The heart of this
handler is line 28, which uses Perl's magical autoincrement to change
eliza000
into eliza001
and then eliza002
and so on, until we
get one that isn't taken. This works only if the attempted nick is of
a form that Perl can autoincrement, verified in line 27.
Lines 32 through 36 handle all private messages, such as when someone
says /msg eliza000 hi
. In this handler, we'll extract the nickname
from which the message came, and the message itself, and pass those
along with the original connection and event objects to a subroutine
heard
, defined below.
Similarly, lines 37 through 40 handle all private messages, when
someone says a message in a channel that this bot has joined (in this
case, #doctors_office
). Here, I'll call the same heard
subroutine, but pass along the channel name as the source, rather than
the nick name.
Line 45 is the top level loop. This method invocation never
returns, until there's no more IRC protcol happening (usually after a
user gets booted off the server, or issues a quit
command). So,
once we've started, everything else has to be handled within the
context of the various event handlers. Since we established those up
above, they'll now be called as things come in.
Lines 47 through 74 define the heard
subroutine. This subroutine
has two local variables that persist throughout the life of the
program, defined in lines 48 and 49. %docs
is the hash mapping a
particular username to a Chatbot::Eliza
object (a ``doc''). And
%talking_to
keeps track of the nickname to which messages are being
sent, so that a fairly natural-looking nickname prefix can be added
when the doctor changes the way she is ``facing'' in the channel.
Line 52 extracts the four parameters being handed to the heard
subroutine: the connection object, the event object, a nick or channel
name from which the message came, and the message itself. And line 53
dumps that out if we're tracing.
Lines 54 through 57 establish a magic phrase to make the doctor go
away nicely. If any private or public message contains go away
,
then we'll request a quit (with a cute line from Elton John's
Rocketman song), and return before any further processing.
Otherwise, it's time to hand the line being spoken to a doctor. Now,
the doctor memory of prior spoken lines needs to be maintained on a
per-user basis, so the easiest way to do this is to take the user
information, in line 58, and look it up in a hash of doctors
(%docs
), in line 59 to get a specific doctor into $doc
. Now, if
this is the first time the bot has seen a particular user, we'll need
to create a doctor object, in lines 60 through 62. Here, we'll
invoke the new
method in Chatbot::Eliza
to establish a new doctor.
Now comes the slightly embarassing part. While I was playing with this program, I found that the version of Eliza on the CPAN for the past six months (version 0.40) has had a bug in it that causes the memories for various bots to all be shared. And that caused some weird responses -- someone would ask about dogs, and another person would be told ``Earlier, you said something about dogs'', ruining some of the illusion. I've emailed the author with suggestions about a bugfix, but in the meanwhile, I've got to workaround it with line 61. Pay no attention to the man behind the curtain here... if you're using a version greater than 0.40, most likely you won't need to do this.
Once we have a doctor (either new, or returning), we'll tell her what
this user said, in line 64, and get her reply. Line 65 grabs the
nickname of the user that originally spoke the line, and lines 66
through 69 figure out if we're already talking to that person or not.
If not, line 68 prefixes the first line of the response with the
nickname followed by a comma, a common convention. Subsequent
messages will no longer be prefixed, because the nickname in $nick
will match the value of $talking_to{$from}
, the most recently
spoken-to nickname in a particular channel.
Finally, lines 70 through 72 speak all the response lines in sequence,
using the subroutine say
defined later.
Lines 76 through 84 handle the response. Initially, I just used
privmsg
responses for everything, but this violates the agreement
that bots are never to use privmsgs
to people, because it can
trigger a return response that could get into an auto-response
meltdown. (The exact rule is that bots are never to listen to
notice
messages, and are never to generate anything except
notice
messages, but the rules are somewhat relaxed for public
channel messages.) So, I look at who or what I'm talking to, and if
it's a channel, it's a privmsg
(which is the normal way to send
stuff to a channel), and if it's not, it must be a user, so I send a
notice
.
And there you have it. A simple demonstration of two neat things off the CPAN, put together to provide hours of enjoyment. I had one beta tester just spend an hour and a half talking to the doctor while I was writing this. I guess some people don't have anything better to do. Until next time, enjoy!
Listing One
=1= #!/usr/bin/perl -w =2= use strict; =3= $|++; =4= =5= use Net::IRC; =6= use Chatbot::Eliza; =7= =8= my $IRC_debug = 0; =9= my $TRACE = 1; =10= =11= (my $irc = Net::IRC->new)->debug($IRC_debug); =12= my $conn = $irc->newconn( =13= Nick => 'eliza000', =14= Server => 'random.place.not', =15= ); =16= for ([motd => sub { =17= print "motd: ".($_[1]->args)[1], "\n" if $TRACE; =18= }], =19= [endofmotd => sub { =20= my $conn = shift; =21= print "we are IN!\n" if $TRACE; =22= $conn->join("#doctors_office"); =23= }], =24= [nicknameinuse => sub { =25= my $conn = shift; =26= my $nick = $conn->nick; =27= $nick =~ /^[a-zA-Z]+[0-9]*$/ or die "can't fix collided nick"; =28= $nick++; =29= print "nick collision, fixing to $nick\n" if $TRACE; =30= $conn->nick($nick); =31= }], =32= [msg => sub { =33= my ($conn, $event) = @_; =34= my ($msg) = $event->args; =35= heard($conn, $event, $event->nick, $msg); =36= }], =37= [public => sub { =38= my ($conn, $event) = @_; =39= my ($msg) = $event->args; =40= heard($conn, $event, $event->to, $msg); =41= }], =42= ){ =43= $conn->add_global_handler(@$_); =44= } =45= $irc->start; =46= =47= BEGIN { =48= my %docs; =49= my %talking_to; =50= =51= sub heard { =52= my ($conn,$event,$from,$said) = @_; =53= print "heard $from say $said\n" if $TRACE; =54= if ($said =~ /go away/) { =55= $conn->quit("o/~ and all the science, I don't understand... it's just my job five days a week o/~"); =56= return; =57= } =58= my $userhost = $event->userhost; =59= my $doc = $docs{$userhost} ||= do { =60= my $bot = Chatbot::Eliza->new(); =61= $bot->{memory} = []; # bug workaround =62= $bot; =63= }; =64= my @response = $doc->transform($said); =65= my $nick = $event->nick; =66= if (($talking_to{$from} || "") ne $nick) { =67= $talking_to{$from} = $nick; =68= $response[0] = "$nick, $response[0]"; =69= } =70= for (@response) { =71= say($conn, $from, $_); =72= } =73= } =74= } =75= =76= sub say { =77= my ($conn, $to, $what) = @_; =78= print "telling $to $what\n" if $TRACE; =79= if ($to =~ /^\#/) { # a channel =80= $conn->privmsg($to, $what); =81= } else { # a person =82= $conn->notice($to, $what); =83= } =84= }