Law Practice Magazine — July/August 2006
PERSONAL TECHNOLOGY
Features
Web-Based Weapons in the Battle Against Spam
Having been online since 1989, I have seen a ton of junk e-mail. But my PC's e-mail application, Microsoft Outlook, had developed a pretty good client-side spam filter that ensured I didn't see much of the junk headed my way. It wasn't perfect, but it was simple, and it stopped the vast majority of spam from hitting my inbox.
When my Treo smartphone and my Weblog entered the picture, however, I had to grapple with some serious spam issues for the first time. For everyone who has a smartphone or a blog (or is thinking of getting one), here's a review of what happened and how I addressed the resulting problems.
Using Web-Based E-mail to Unclog Your Inbox
When I bought a Treo to gain access to my e-mail while on the road and away from my PC, it quickly became evident that a client-side approach to battling spam wasn't for me. I was seeing e-mail on my phone before my PC's filters could clean it up—meaning that all that junk was now clogging my phone, creating exceedingly long download times, and making it much harder to get any value out of the phone. I needed a better answer.
One came from my Web host, which offered an installation of SpamAssassin, an open source tool for blocking spam on the server. Like Outlook's junk-mail filter, SpamAssassin isn't perfect—but it provided a significant improvement over seeing a junk-filled inbox on the Treo. For about six months, that provided all the protection I needed.
This is when I ran into an interesting problem: While the open source community was updating SpamAssassin on a regular basis, my Web host was not. This meant I was using an increasingly dated version of the service, which resulted in newer and savvier junk mail finding its way through SpamAssassin's filters. My "clean" mailbox was getting dirty again. Fast.
Around the same time, I joined my current employer, FeedBurner, and gained a new e-mail box. There isn't a spam filter on our server, so it is up to us to filter our own e-mail. As I juggle a few responsibilities here, I have a number of e-mail aliases (support, business development and client services, to name a few) at which I receive messages. So instead of getting one copy of a piece of junk mail, I'd frequently get five or ten copies of it. My spam problem had just gotten a lot worse.
In response, rather than use my own mail server, I've transitioned to Gmail, which is a free Web-based e-mail service owned by Google. While there are commercial providers out there, I've found Gmail to be the most feature-rich and user-friendly service for my needs. It includes a pretty robust spam filtering service and it gives me nearly 3 gigabytes of storage. (The actual number's always climbing.) I can still use Outlook and my Treo to access messages in my Gmail account, and Gmail also acts as a permanent archive of e-mail I want to keep. I can log into Gmail from any computer and see all e-mail I've received, as well as respond to any e-mails that need attention.
As a result of this approach, the number of junk e-mails I see each day has fallen dramatically. At the peak, I was seeing up to 700 spam messages each day! I now see about a dozen. In addition to saving considerable amounts of time by not having to delete the deluge of spam, I have an always-accessible online backup of my e-mail.
A cautionary note: While Gmail is more than adequate for my personal purposes, remember that you get what you pay for. In what can only be termed a cruel irony, Gmail blocked an editor's e-mails identifying the deadline for this article's submission. If you simply cannot abide the chance of false positives in your business or personal e-mail, you may want to investigate other solutions.
Blocking Comment Spam on Your Blog
It took more than a few years for our e-mail boxes to become cluttered with sundry spam solicitations of all types—but it took just a year or two for the spammers to overrun Weblogs. Blogs are a highly desirable target owing to the way they promote the most valuable currency in a search-driven Internet world: links.
Spam in blogs, called "comment spam," is the result of junk-mail programs automatically posting random comments. But comment spammers don't just leave text messages in blog comments; they leave code that produces links to their Web sites. Unsuspecting bloggers thus unwittingly participate in a massive search engine fraud: thousands of links to any number of poker, prescription-drug and other sites will appear on blogs in a matter of days. The comment spammers hope to game Google into thinking their sites are, in fact, authoritative sources for the terms used, so that their search engine rankings will increase.
My blog fell victim to this scheme until recently, when I converted my blog application to WordPress (from Movable Type). In the month since I converted, more than 1,100 spam comments have been sent to my blog. Lucky for me, every single one of them got blocked. The reason: The same team behind WordPress has created a tool called Akismet (http://akismet.com).
Akismet is a centralized Web service that evaluates every submitted comment for telltale signs of spam. It then returns a thumbs-up or thumbs-down response for the comment. WordPress, in turn, acts on that response in one of three ways: by publishing the comment, queuing it for moderation, or junking it altogether. So far, Akismet has produced not a single false positive and hasn't let any spam through. It can't get much better than that!
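For readers who like to see the moving parts, here is a minimal sketch in Python of how a blog application might act on such a thumbs-up or thumbs-down answer. It assumes the service replies with the plain text "true" for spam and "false" for a legitimate comment. The names (Action, parse_verdict, decide, handle_comment) are hypothetical, and the mapping onto three outcomes is an illustrative policy, not WordPress's actual code.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    """The three outcomes described above (names are hypothetical)."""
    PUBLISH = "publish"
    MODERATE = "moderate"
    DISCARD = "discard"


def parse_verdict(body: str) -> Optional[bool]:
    """Turn the service's text reply into True (spam), False (not spam),
    or None when the reply is anything else (assumed reply format)."""
    text = body.strip().lower()
    if text == "true":
        return True
    if text == "false":
        return False
    return None


def decide(verdict: Optional[bool]) -> Action:
    """Illustrative policy: junk spam, publish clean comments, and hold
    anything the service could not classify for a human to review."""
    if verdict is True:
        return Action.DISCARD
    if verdict is False:
        return Action.PUBLISH
    return Action.MODERATE


def handle_comment(response_body: str) -> Action:
    """Combine parsing and policy for one spam-check reply."""
    return decide(parse_verdict(response_body))
```

In this sketch an unexpected reply (an error message, say) sends the comment to the moderation queue rather than publishing or deleting it, so a service outage neither lets spam through nor loses a real comment.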
Akismet has gained popularity and increased in effectiveness because of its community contributions. In other words, as it sees hundreds of thousands of comments a day from an ever-growing community of sites, it gets smarter about what is and what is not spam. As a result, it keeps "learning" the latest techniques employed by spammers and gets better at blocking unwanted comments. A number of other blog platforms (including Movable Type) now have their own implementations of Akismet as well. See http://akismet.com/development for the latest ones. (Also see the sidebar for additional defenses against comment spam.)
Oh, and did I mention that Akismet is free for personal use? Let's all hope that with the proliferation of free and effective tools like Akismet and Gmail, we can finally gain some ground against those canny spammers.
About the Author
Rick Klau is Vice President of Business Development at FeedBurner, an RSS feed management service for bloggers, podcasters and commercial publishers. His blog is at www.rklau.com/tins. He coauthors Law Practice's nothing.but.net column and is a coauthor of The Lawyer's Guide to Marketing on the Internet, 2nd Edition (ABA, 2002).
Some Other Approaches to Stopping Comment Spam
CAPTCHA. It's a mouthful, but CAPTCHA is shorthand for "Completely Automated Public Turing test to tell Computers and Humans Apart." Named after a test proposed by British mathematician Alan Turing, it's a computer-generated test to see whether the blog application is interacting with a human or a computer. CAPTCHAs are most often performed by displaying a word or series of numbers in a form that is not machine-readable (wavy letters, grainy presentation or other means are used to obscure the text), then asking the individual to type the phrase displayed. If users get it right, then they're probably legit. Most comment spammers employ automated techniques for posting lots of comments at a time and wouldn't take the trouble to post them one at a time. CAPTCHAs are effective, but they can be frustrating for users, and for the visually impaired they're often impossible to use.
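The matching step behind a CAPTCHA can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the six-character challenge length is arbitrary, and the hard part (drawing the text as a distorted image) is left out entirely.

```python
import hmac
import secrets
import string

# Characters the challenge is drawn from (an arbitrary choice for this sketch).
ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(length: int = 6) -> str:
    """Pick a random string; a real CAPTCHA would render it as a distorted
    image and keep the plain text on the server only."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def is_correct(expected: str, typed: str) -> bool:
    """Compare what the visitor typed with the stored challenge, ignoring
    letter case and surrounding spaces."""
    return hmac.compare_digest(
        expected.upper().encode(),
        typed.strip().upper().encode(),
    )
```

A blog would call new_challenge when it shows the comment form, remember the result for that visitor, and accept the comment only if is_correct returns True for what the visitor typed.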
Registration. Many blog applications can be configured to require registration before allowing comments to be posted. However, this can impose a burden on the end-user if the registration process is cumbersome and it can limit legitimate conversation. Will users go through the trouble just to leave a quick comment? Perhaps not. But it can certainly block illegitimate comments quite easily.
Moderation. The most cumbersome of approaches, it requires the blog owner to approve comments before they're published. While it will ensure that only good comments get published, it imposes a real burden on the blog owner, who must manually skim each comment that's left. And if you get dozens or sometimes hundreds of comments a day on your blog, as I do on mine, this can consume a significant amount of time.