Paging Through Web Pages
[From Web Techniques Aug '99 issue]

Was it the geek in me, or my role as the director of operations at Bright Light that prompted me to solve that nagging remote-monitoring problem? Sometimes it's hard to know where ideas like these come from, but this one is kind of fun. Pay attention as I show you how you can surf the Web remotely. It's great for system monitoring, getting stock quotes, and checking weather reports, to name a few. Best of all, you can do it all through your pager!

First you need a two-way alpha pager that can send and receive Internet email. I use SkyTel's service with a Glenayre pager (see "Online"). With two-way alpha pagers you can "type" in a recipient's address and a message by moving a cursor over a field of characters and picking the letter you want. You won't get up to 40 words per minute with this method, but it gets the job done.

This particular hack relies on a Perl script that accepts mail on UNIX standard input. The script parses the mail header for the sender's email address, and looks through the message body for the URL that the sender wants to retrieve. It then runs Lynx to get the page at that URL, reformats the output, and mails it back to your pager.

Walking through the code in Listing One, you'll see that the script opens a couple of logs for debugging, and then starts to parse the mail. It scans the header for a "From:" line to retrieve the sender's address, and then walks through the body of the message to find the URL.
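As a rough illustration of that parsing step, here is a minimal sketch in Python rather than the Perl of Listing One. The function names are hypothetical, and it assumes the header ends at the first blank line and that the URL is the first non-empty line of the body.

```python
def split_message(raw):
    """Split a raw mail message into (header_lines, body_lines).

    Assumes the header ends at the first blank line.
    """
    lines = raw.splitlines()
    try:
        blank = lines.index("")
    except ValueError:
        # No blank line at all: treat everything as header, empty body.
        return lines, []
    return lines[:blank], lines[blank + 1:]


def find_from_value(header_lines):
    """Return the text after 'From:' in the header, or None if absent."""
    for line in header_lines:
        if line.lower().startswith("from:"):
            return line[len("from:"):].strip()
    return None


def find_url_line(body_lines):
    """Return the first non-empty body line (assumed to hold the URL)."""
    for line in body_lines:
        if line.strip():
            return line.strip()
    return None
```

A real mail filter would feed `sys.stdin.read()` to `split_message`, since the mail arrives on standard input.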

Getting a valid return address is a little tricky, as addresses can be found in a number of places in a "From:" line, depending on the pager that sends the message. This is why my script does a bit of parsing on this line. I can't say it will get all the different cases but it works for my pager. If the script can't get a good return address, then it will exit with an error and, in most cases, sendmail will bounce the mail back to the sender with an appropriate error message.
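To make the "number of places" problem concrete, here is one possible way to pull an address out of a "From:" value, sketched in Python. It is not the parsing used in Listing One. The pattern and the preference for angle brackets are assumptions, and like the original it will not cover every case.

```python
import re

# A deliberately simple address pattern (an assumption, not a full
# implementation of the mail address grammar).
ADDRESS = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def return_address(from_value):
    """Return the sender's address from a From: value, or None.

    Handles forms such as 'Tim <tim@example.com>' and
    'tim@example.com (Tim)'.
    """
    # Prefer whatever sits inside angle brackets, if present.
    bracketed = re.search(r"<([^<>]+)>", from_value)
    candidate = bracketed.group(1) if bracketed else from_value
    match = ADDRESS.search(candidate)
    return match.group(0) if match else None
```

When this returns `None`, a caller could exit with a non-zero status, which matches the behavior described above.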

Once the script has successfully found who to send the mail back to and the URL to retrieve, the script confirms that the URL is an HTTP request.
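A check like this can be sketched in a few lines of Python using the standard `urllib.parse` module. The function name is hypothetical, and accepting only the `http` scheme is an assumption based on the description above.

```python
from urllib.parse import urlsplit


def is_http_url(url):
    """Return True only for URLs with an http scheme and a host part."""
    parts = urlsplit(url.strip())
    # urlsplit lowercases the scheme, so 'HTTP://...' is accepted too.
    return parts.scheme == "http" and bool(parts.netloc)
```

One benefit of checking the scheme is that requests such as `file:///etc/passwd` are turned away before Lynx ever sees them.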

Next the script runs Lynx with a couple of critical command-line arguments: -dump grabs the Web page and formats it for text display; and -width=1000 tells Lynx that our "terminal" is 1000 columns, or characters, wide. We do this to prevent Lynx from wrapping lines, as the pager will do this for us.
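For illustration, the same Lynx invocation could be driven from Python as follows. This is a sketch, not the code in Listing One: it assumes `lynx` is on the PATH, and the function names and the 60-second timeout are made up for the example.

```python
import subprocess


def lynx_command(url, width=1000):
    """Build the argument list for a text dump of the given URL."""
    return ["lynx", "-dump", f"-width={width}", url]


def fetch_text(url, timeout=60):
    """Run Lynx and return its output as text.

    The arguments are passed as a list, so no shell is involved and
    odd characters in the URL are not interpreted by one.
    """
    result = subprocess.run(
        lynx_command(url), capture_output=True, timeout=timeout
    )
    # Latin-1 maps every byte to a character, so decoding cannot fail.
    return result.stdout.decode("latin-1")
```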

Once Lynx has retrieved a line of the Web page we want, we need to do a bit of formatting on the output. I first strip off any "high" bits to avoid passing strange characters back to the pager. I get rid of any redundant strings such as repeated spaces, hyphens, underbars, or newlines. I also dump the strings Lynx uses to indicate graphics or imagemaps, since we don't care about them.
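The cleanup described above might look something like this in Python. It is a sketch under assumptions, not a port of Listing One: the bracketed marker names are a guess at the placeholders Lynx prints for images and imagemaps, and the function names are hypothetical.

```python
import re

# Placeholders assumed to stand in for graphics and imagemaps.
LYNX_MARKERS = re.compile(r"\[(?:INLINE|IMAGE|LINK|ISMAP|USEMAP)\]")


def clean_line(line):
    """Reduce one line of Lynx output to compact 7-bit text."""
    # Clear the high bit of every character so only 7-bit codes remain.
    line = "".join(chr(ord(ch) & 0x7F) for ch in line)
    # Drop the graphic and imagemap placeholders.
    line = LYNX_MARKERS.sub("", line)
    # Collapse runs of spaces, hyphens, or underscores to one character.
    line = re.sub(r"([ \-_])\1+", r"\1", line)
    return line.strip()


def clean_text(text):
    """Clean every line and drop the ones that end up empty."""
    cleaned = (clean_line(line) for line in text.splitlines())
    return "\n".join(line for line in cleaned if line)
```

Dropping empty lines takes care of the repeated newlines, and every character saved is one fewer to push over the paging network.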

Because some text-intensive Web pages will exceed the 10,000-character-per-page limit, the script sends off the section it has accumulated using /bin/mail once it passes 9000 bytes. It limits the response to three pages, as it can be cumbersome and expensive to get all this data back to a pager.
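The paging logic can be sketched as below, in Python rather than the Perl of Listing One. The function name is hypothetical. Unlike the script described above, this version closes a page just before it would pass the limit, and it only collects the pages rather than mailing them.

```python
def paginate(lines, page_limit=9000, max_pages=3):
    """Group lines into at most max_pages pages of page_limit characters.

    A single line longer than page_limit still becomes its own
    oversized page; with 1000-column Lynx output that should not occur.
    """
    pages = []
    current = ""
    for line in lines:
        piece = line + "\n"
        # Close the current page before it would grow past the limit.
        if current and len(current) + len(piece) > page_limit:
            pages.append(current)
            if len(pages) == max_pages:
                return pages  # Anything beyond the last page is dropped.
            current = ""
        current += piece
    if current and len(pages) < max_pages:
        pages.append(current)
    return pages
```

Each returned page could then be handed to the mailer as a separate message.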

Put this script in a sensible place, like /usr/local/sbin, and then add an entry to your sendmail alias file that pipes mail to a program for some given mail address, say mailbrowse@yourdomain.com.

mailbrowse: "|/usr/local/sbin/mailbrowse.pl"

An entry like this will cause the incoming mail message to be passed as the input stream of the given script, in this case, mailbrowse.pl.

Now you're ready to browse the Net with your pager. Send mail from your pager to your mailbrowse address with the URL you want to retrieve on the first line of the message. With SkyTel it may take 5 or 10 minutes to get the response back.

I've really become dependent on this script. I can check the operations Web pages that show my network bandwidth using Multi Router Traffic Grapher (MRTG), a program used by many ISPs, or check the status of a specific machine using Network Operations Center On-Line (NOCOL), another well-known program. I can even check what is new on our trouble ticket system (see "A Web Interface for Req," Web Techniques, April 1998; also see "Online") through my pager. Also while I'm on the road I can get my daily news at slashdot.org, and even read Web Techniques!

OK, now I can hear you all asking what else you can do with it, like pager-term, pager-cal, and pager-chat. What about pager-console? But before you go off and start designing, remember these pagers don't have pointing devices, and there's a bit of latency. On the other hand, if you really need to browse the Web while you're riding the train to work, your two-way pager and mailbrowse can help you do it.

Gotta go -- gotta mailbrowse and see how my stock is doing at E*TRADE, what tomorrow's weather will be like, and whether my servers are up.


Tim is director of operations at Bright Light Technologies. He specializes in building platforms for folks to communicate with each other. He has built community radio stations, written software to link PCs to the Internet, and started an ISP (TLGnet).