The Death of CAPTCHA
Tuesday, July 1st, 2008For a while now we knew that CAPTCHA’s were becoming irrelevant. There were a great solution when they were first introduced, but I think that everyone knew that they are not going to be around for a long time. The tend in technology is always constant improvement - so OCR engines will continuously improve each passing year. CAPTCHA strength on the other hand has an upper bound because it needs to be human readable. You can continue making the pictures more complex and tricky to solve but at some point they become as incomprehensible to a human being, as they are to some random bot. For example, how do you guys like the rapidshare dog/cat CAPTCHA?

I personally hate that one. Yes, you can sort of figure it out but you actually have to put some effort into it, and sometimes it’s just pure guesswork. Does it help against the automated scripts? I don’t know - I guess this is a question we should direct at Rapidshare. But it sure is annoying to regular users.
The OCR technology is not there yet - it’s getting better, but I presume that we could still get few years out of our CAPTCHA’s if their effectiveness boiled down to complexity of design vs. character recognition arms race. But we all know there is a growing cottage industry out there which uses real people to solve CAPTCHA’s by either tricking them into doing it or paying them per solved puzzle. I always imagined this to be rather shady business conducted in private spammer forums and via private channels. But it is not. They are actually doing this out in the open, as a legitimate paid service:

Here is a screenshot of imagetotext.com - a company which specializes in solving CAPTCHAS. They of course don’t say it like that, but I think the blurbs on their site make it pretty clear that they are not really interested in doing any sort of data entry tasks or into transcribing free hand text into digital format. They are interested in receiving a small image, and shooting back the text at $.02 a pop bought in “packages” of 500 images or more. With a narrow focus like that, what else could they be doing?
Note that I’m not linking to them, because sure as hell they don’t need any Google juice from me.
The ubiquity of CAPTCHA basically created a new niche industry. All you need now is some clever script that will harvest CAPTCHAS, send them to Image to Text, receive responses and create accounts on popular online services. Thank god these sort of scripts are shady, and probably hard to get, right? You either have to make them yourself, or know where to find them, or who to ask for them. It’s not like anyone can just go to a website and buy, for example, an automated Myspace account creator? Right?

This one is from allbots.info - a website that seems to be selling precisely that: account generation scripts that create random profiles, and simply need a human being solving CAPTCHA’s really fast for them. So you buy one of these apps, then purchase a big ass package with ImageToText you can start building your brand new spam empire. All it takes is some cash - you can even be borderline retarded. It won’t slow you down.
Combine the two services, and you have yourself a deadly combo with no programing, and no thinking required. A bit scary if you think about it. I’m not sure how profitable are these two companies, but the fact that they exist indicates that there is demand for these type of services out there.
CAPTCHA’s may be effective in stopping your average home grown spammer, but they are actually creating a whole micro-industry revolving around circumventing them. In other words, they are actually performing natural selection - weeding out the week players with few resources, and leaving only the biggest, baddest and most determined in the game. They are the catalyst, helping to evolve bigger and better bad guys.
Public Turing tests may be doomed and I suspect they might get completely phased out from use on the web in next 5-10 years. And it’s not just CAPTCHA’s - all public Turing tests. After all, it doesn’t matter if you are interpreting an image, solving an equation, or answering a question - it doesn’t really matter if there is a low wage human worker solving it on the other end, and then handing control over to a script.
Google has an interesting idea going on with their text message based application. If you haven’t seen it, try signing up for one of their services such as Gmail or Google App Engine. Instead of using a CAPTCHA they send a text message with an activation code to your cell phone. At least for the time being this system remains much harder to game - which means we might see it being used more and more often by popular online services. Of course it does have serious downsides as not everyone with an internet connection may have a cell phone (think less developed countries) and not all cell carriers may be supported. We will need something else - but what?
It will be interesting to observe where will the anti-bot technology will go in the next few years.

