Obfuscating e-mail addresses
What the hell am I talking about?
It’s like this. One of the meanest tricks that spammers have pulled is developing robot programs (often called “spambots”) that crawl all over the Web, sucking up e-mail addresses they find on web pages and excreting them into databases, so that forevermore our inboxes can be filled with messages enticing us to Find a Cheaper Mortgage, Buy Viagra Without a Prescription, or Get a Bigger Penis Now!!!
The spambots work simply by looking for strings of text on web pages that follow the typical pattern of e-mail addresses, for example: yourname@example.com
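To make the pattern-matching idea concrete, here is a minimal sketch of how a harvester might scan a page. The function name harvestAddresses and the regular expression are made up for illustration and are not taken from any real spambot; the pattern is deliberately loose rather than a full e-mail address grammar.

```javascript
// Hypothetical harvester: a loose pattern for "something@something.tld".
// Illustration only; this is not a complete e-mail address grammar.
const ADDRESS_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Return the unique address-like strings found in a page's source.
function harvestAddresses(pageSource) {
  const matches = pageSource.match(ADDRESS_PATTERN) || [];
  return [...new Set(matches)];
}

const page = 'Send e-mail to <a href="mailto:me@example.com">me@example.com</a>';
console.log(harvestAddresses(page)); // one entry: me@example.com
```

Anything that appears in the page source in the usual name@domain form is picked up, whether it sits in the visible text or inside a tag.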
If we could prevent the spambots from seeing strings of text in that form, then we could prevent the rape of our inboxes. I’ve seen a few techniques intended to achieve that aim, and if you’d like to find out about them and why I don’t think they’re as good as the method I’m about to describe, then look further down the page. Otherwise, read on, MacDuff!
Confuse them using JavaScript
The typical way to embed a “mailto” link in HTML code is as follows:
Send e-mail to <a href="mailto:me@example.com">me@example.com</a>
which would display as:
Send e-mail to me@example.com
and in most common browsers
- passing the mouse over the link causes mailto:me@example.com to appear in the browser’s status bar, and
- clicking the link will launch the user’s e-mail program with me@example.com already filled in as the destination address.
We only need to make a little modification to this scheme incorporating some JavaScript to fool the spambots. Code the “mailto” link like this instead:
Send me an <a href="mail.html" onmouseover="this.href='mai' + 'lto:' + 'me' + '@' + 'example.com'">e-mail</a>
This will display as:
Send me an e-mail
Note that I’ve altered the text a little so as not to display my e-mail address on the page. (You might think that this alone would be enough, but it isn’t. The spambots would still read your address from the <a href="mailto:me@example.com"> tag if you used the conventional method.)
But here’s the real trick:
- If the user’s browser is JavaScript-enabled, then passing the mouse over the link causes mailto:me@example.com to appear in the browser’s status bar, just as before. And clicking the link launches the user’s e-mail program.
This happens because the browser reads and interprets the onmouseover code within the link tag, adding together the fragments mai, lto:, me, @, and example.com into a single string mailto:me@example.com and turning the string into a link. Breaking up the link text in this way is all that we need to do to stop the spambots recognising a string of text as an e-mail address.
- If the user’s browser is not JavaScript-enabled, then the onmouseover code is simply ignored. Clicking on the link will take the user to the page specified in the href attribute of the link tag, which in our example is mail.html.
This page is one you create to explain to your JavaScript-challenged site visitors what you’re up to, and it’s easier to show you an example than to describe it. Take a look at my version for this site.
And that’s all there is to it. Just copy and paste the code snippet below, and replace me, example.com, and Link text with whatever’s appropriate for you.
<a href="mail.html" onmouseover="this.href='mai' + 'lto:' + 'me' + '@' + 'example.com'">Link text</a>
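If you have a lot of addresses to protect, you could generate the link rather than typing it out. The sketch below is only an illustration: the function name buildObfuscatedLink and its parameters are hypothetical, it assumes the name, domain and link text contain no quotes or other characters that need HTML escaping, and the pattern check at the end is a loose guess at what a harvester might look for, not a real spambot's rule.

```javascript
// Hypothetical helper: builds the link markup shown above from its parts.
// Assumes the inputs contain no quotes or characters that need HTML escaping.
function buildObfuscatedLink(user, domain, linkText, fallbackPage = "mail.html") {
  const fragments = ["mai", "lto:", user, "@", domain];
  // Produces: 'mai' + 'lto:' + 'me' + '@' + 'example.com'
  const expression = fragments.map((part) => `'${part}'`).join(" + ");
  return `<a href="${fallbackPage}" onmouseover="this.href=${expression}">${linkText}</a>`;
}

const markup = buildObfuscatedLink("me", "example.com", "Link text");
console.log(markup);

// A loose address pattern (an assumption) finds nothing in the generated
// markup, because the address is never written out whole.
const looseAddressPattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/;
console.log(looseAddressPattern.test(markup)); // false
```

The only “@” in the output sits between quote marks, so there is no name immediately before it and no domain immediately after it for a simple pattern to latch on to.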
If you’d prefer your link text to display something that looks like your e-mail address, then you could do something like this:
<a href="mail.html" onmouseover="this.href='mai' + 'lto:' + 'me' + '@' + 'example.com'">me[at]example.com</a>
which will display as:
me[at]example.com
and which will function in exactly the same way. If you have several e-mail addresses, perhaps for a number of people in an organisation, you can list their names against their addresses in the style shown in my example, all on a single page.
Efficacy
So how effective is this method? Well, I have several e-mail addresses that appear on web sites. I receive hundreds of spam messages every month to these addresses, largely I believe due to the work of spambots. But there is one exception. The e-mail address I use on this site has been protected using the method I describe here almost since the site went live. I receive no more than one or two junk e-mails to this address in any month, and to date I have always had evidence to indicate that the address was obtained by means other than harvesting from the pages of this site. That seems a pretty good return from such a simple method.
That’s all you really need to know, so you could just hop straight to the acknowledgement. But I think it might put things in context to understand the other options and their weaknesses.
Other options
Other ways have been proposed to deal with this problem, but they all have a downside…
1. Mangle your address
Probably the most basic way to confuse the spambots is to mangle your address in some fashion. Here are a couple of code examples:
Send e-mail to <a href="mailto:me[at]example.com">me[at]example.com</a>
Send e-mail to <a href="mailto:me@example.com.NOSPAM">me@example.com.NOSPAM</a>
These would display, respectively, as:
- Send e-mail to me[at]example.com
- Send e-mail to me@example.com.NOSPAM
You may well have seen people using this technique on web pages, and it has the benefit of being easy to implement.
So what’s the downside? The trouble is, you’re relying on visitors to your site to realise they have to “fix” your mangled address in order for it to work properly, and if they’re not particularly web-savvy, they might not realise this. I know my mother wouldn’t have a clue.
2. Encode your address with numeric character references
One simple way you can try to hide your e-mail address from the spambots is to replace the letters in your address with their numeric character references, so instead of typing me@example.com in your code, you would type:
&#109;&#101;&#64;&#101;&#120;&#97;&#109;
&#112;&#108;&#101;&#46;&#99;&#111;&#109;
This will still display in your browser as me@example.com, because your browser converts the numeric references into the corresponding letters. The hardest part is doing the conversion to the numerical equivalents, but an online tool can do it for you.
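If you'd rather not rely on an online tool, the conversion is short enough to sketch yourself. The function name toNumericReferences is hypothetical; it simply replaces each character with its decimal numeric character reference.

```javascript
// Hypothetical helper: replace every character with its decimal numeric
// character reference, e.g. "m" becomes "&#109;".
function toNumericReferences(text) {
  return Array.from(text, (ch) => `&#${ch.codePointAt(0)};`).join("");
}

console.log(toNumericReferences("me@example.com"));
// &#109;&#101;&#64;&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#46;&#99;&#111;&#109;
```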
So what’s the downside? The main disadvantage of this method is that the spambots aren’t that dumb. Based on my own experience, I’d think most of them are already wise to this little trick, and aren’t fooled by it for a moment. (Drew McLellan has written a bit of code that demonstrates how easy it is for a spambot to harvest addresses encoded in this way.) And I suspect a lot of them look for instances of mailto: in HTML code, and suck up the string that follows it, figuring it to be an e-mail address. There’s also a compatibility or accessibility problem in that less typical browsing devices like screen readers might be confused by the numeric references.
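To get a feel for how little effort the decoding takes, here is a rough sketch. It is not Drew McLellan's code; the function name decodeNumericReferences is made up for illustration, and it handles only numeric references (decimal and hexadecimal), not named entities.

```javascript
// Hypothetical decoder: turn decimal (&#109;) and hexadecimal (&#x6D;)
// numeric character references back into ordinary characters.
function decodeNumericReferences(source) {
  return source.replace(/&#(x[0-9a-f]+|\d+);/gi, (whole, code) => {
    const isHex = code[0] === "x" || code[0] === "X";
    const codePoint = parseInt(isHex ? code.slice(1) : code, isHex ? 16 : 10);
    return String.fromCodePoint(codePoint);
  });
}

const encoded = "&#109;&#101;&#64;&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#46;&#99;&#111;&#109;";
console.log(decodeNumericReferences(encoded)); // me@example.com
```

Once the page source has been run through a step like this, the address is back in plain name@domain form and an ordinary pattern match will find it.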
3. Do the redirect trick
Another method, as demonstrated by Steve Williams, is to replace “mailto” links with links that don’t look like e-mail addresses, but which call a program on the server to redirect them to “mailto” links.
So what’s the downside? Well, it’s too complicated for many web page builders. Even if they don’t have to write the code (Steve Williams has provided a PHP version; there are probably Perl and ASP versions to be found in code libraries) they may not know how to install it on their server, or they may not be allowed to do so by their ISP or web host. And as Steve says himself, “it’s trivial to write a spambot that … follows the redirect.” Which means it’s likely that many spambots do, or they soon will.
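For a rough idea of what the server-side part involves, here is a minimal sketch in JavaScript rather than PHP. It is not Steve Williams's code: the recipients table, the identifier webmaster and the function name redirectFor are all hypothetical, and the sketch assumes the redirect is an ordinary HTTP 302 response whose Location header is the “mailto” link.

```javascript
// Hypothetical lookup table: link identifiers mapped to real addresses.
// The addresses live only on the server, never in the page source.
const recipients = new Map([["webmaster", "me@example.com"]]);

// Decide how to answer a request for a given identifier.
// Returns the status code and headers a server would send back.
function redirectFor(recipientId) {
  const address = recipients.get(recipientId);
  if (address === undefined) {
    return { status: 404, headers: {} };
  }
  return { status: 302, headers: { Location: `mailto:${address}` } };
}

console.log(redirectFor("webmaster"));
```

A page would then link to something like a contact script with the identifier as a parameter, and with Node's built-in http module the status and headers could be passed to response.writeHead. Anything that follows the redirect, spambot or browser, ends up with the address.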
4. Encode your address with numeric character references, then wrap it in JavaScript
In terms of efficacy, this is a much better method than either option 2 or 3 above. You start by converting your address to its numerical equivalent, but then use the document.write method in JavaScript to write it to your page. It works because spambots generally read the page source as plain text and don’t run the JavaScript in it.
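As a rough illustration of this option, the sketch below generates the script block you would paste into a page. The function names toNumericReferences and buildScriptedLink are hypothetical, and the exact markup is an assumption about how such a snippet might be laid out rather than a copy of any published tool.

```javascript
// Hypothetical helper: encode text as decimal numeric character references.
function toNumericReferences(text) {
  return Array.from(text, (ch) => `&#${ch.codePointAt(0)};`).join("");
}

// Build a <script> block that writes the encoded link into the page.
// Every character of the link target and link text is encoded, so the
// generated string contains no literal "@" or "mailto:".
function buildScriptedLink(address, linkText) {
  const href = toNumericReferences(`mailto:${address}`);
  const text = toNumericReferences(linkText);
  return `<script>document.write('<a href="${href}">${text}</a>');</script>`;
}

const snippet = buildScriptedLink("me@example.com", "e-mail");
console.log(snippet);
```

In a JavaScript-enabled browser the written-out link behaves like a normal “mailto” link; where scripts don't run, nothing is written at all.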
So what’s the downside? The problem is that this method only works in JavaScript-enabled browsers. Your e-mail address will disappear completely in a browser that doesn’t support JavaScript, or in one where the user has disabled it. Of course, you could put some alternative text between <noscript> and </noscript> tags, but then you’re back to square one: anything there can be read by spambots. Even if you could come up with something suitable to enclose with these tags, NOSCRIPT is a block-level element and so may disturb the flow of your page. My method is a “plug-in replacement” for a conventional <a href="mailto:me@example.com"> type of tag.
Dan Benjamin has taken encoding methods a step further with his Enkoder, which encrypts your e-mail address and converts the result to a self-evaluating JavaScript. I’m sure this is very effective, but it is quite complex, and again, it fails in browsers that are not JavaScript-enabled.
Summary
Really, I don’t think any of the methods on this page present an ideal solution, not even the one I favour: perhaps someone somewhere will find a browser in which the whole thing falls apart, and your personal preferences may take you down a different road. But maybe you’ll agree that I’ve found the best compromise.
Acknowledgement
Protecting e-mail addresses against spammers is a problem I’ve tinkered with for some time. On several sites in the past, I used the numeric character reference encoding method, though I soon realised it wasn’t clever enough to ward off all of the spambots. But what really made me sit down and think about another approach was Jeffrey Zeldman’s writings on the subject, and it was he who first collected together most of the useful links I’ve posted in this piece.
This page was last modified on Thursday, February 26, 2009