The Cutest Human-Test: KittenAuth
Introduction to the problem
Spam is a problem, and with the recent upsurge in blogs and other places where you can leave your 2 cents, it’s not getting any better. "Clever" online entrepreneurs have always been up for making money from nothing and have relentlessly plagued our inboxes. Now they’re trying to invade blog sites and spread their party-poker-based advertising throughout the world of comments. There have been several efforts over the years to stop automated posting, each claiming success, but all are based on the same methodology.
Current systems look like this: random text is dynamically generated as an image with no meaningful filename (that’s random too). The text is laid on a background that makes it hard to pick out, often sharing the same hues. There are sometimes extra measures taken to make people’s lives that teeny bit more hellish, including smudging or cutting some of the text and even overlaying letters to try and confuse a would-be spam-bot. In my opinion this is ludicrous. Why is it we live in a world where spam technology limits the way we work? The solution just requires a bit of "thinking outside the box" to see what actually makes sense.
What makes sense?
As with most "genius moments", I was lying on my bed trying to think of a more practical method for allowing anonymous users to post comments to things on the site.
I primarily wanted a method that didn’t make them decipher horrible random text from a box. With OCR constantly improving, this is inherently a failed system… It’s more a case of "when", less "if".
I started thinking about tasks that we do every day with little effort, but that a computer would have to be extensively reprogrammed for and then taught all the possible combinations of entries. I also wanted something that people could customise to their website, eg: instead of everyone using letters and numbers.
Kittens are the answer to all this. Sure, you can teach a computer what a cat looks like, and it will probably have a fair shot at picking kittens from alligators, but what about when you put it up against other similarly cute animals? Well… I’ve yet to have something try and bodge through it.
The Iterative Design Process
My first spec for this was very simple: have the client load up 3 images and click the one that was the kitten. I was going to implement this after running it through with my friend Seopher, who pointed out that a bot would have a 1-in-3 chance of fluking it. So that was clearly off the cards.
The next idea pushed back from Seo entailed dual authentication, eg making the user type a word as well. While this would be incredibly secure, it’s also incredibly against the spec, so I wrote that one off.
Then I thought about expanding the number of pictures that weren’t kittens to fill a 3x3 grid. But 1/9 is only slightly better than 1/3 for brute forcing the form.
9C3 = 84
But by refactoring this grid idea, I came up with what I’ve implemented over the last couple of days: making the user pick 3 kittens from a 3x3 grid. Using the combinations statistics work that I used to hate so much (nCr), I worked out that there are 9C3 = 84 combinations. That’s a long way off the massive number of combinations an alphanumeric system gives, but still, the idea is that we don’t force the user to break a sweat. I could have increased the number of picks to 4 and there would have been 126 combinations, but 3 is a nice number.
When you’re doing this sort of thing and you want the most possible combinations, ask for a number of picks as close as possible to half the number of grid cells. Eg: a 4x4 grid has 16 cells. If you ask for 3 choices, that gives you 560 possible combinations. If you ask for 8 (half the total number) you get an impressive 12870 combinations. Not entirely sure on my maths but hopefully someone will clear that up in the comments section if there are any mistakes.
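The figures above can be checked with a few lines of Python. This is only a sketch for verifying the arithmetic; the function names are made up for illustration, and `math.comb` needs Python 3.8 or later.

```python
from math import comb

def pick_combinations(cells: int, picks: int) -> int:
    """Number of ways to choose `picks` cells out of `cells` (nCr)."""
    return comb(cells, picks)

def best_pick_count(cells: int) -> int:
    """Pick count that gives the most combinations.

    When two counts tie (odd number of cells), the smaller one is returned.
    """
    return max(range(cells + 1), key=lambda k: comb(cells, k))

for cells, picks in [(9, 3), (9, 4), (16, 3), (16, 8)]:
    print(f"{cells}C{picks} = {pick_combinations(cells, picks)}")
# prints:
# 9C3 = 84
# 9C4 = 126
# 16C3 = 560
# 16C8 = 12870
```

For a 16-cell grid `best_pick_count(16)` returns 8, which matches the "pick half the cells" rule of thumb.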
Understand that increasing the rows and columns makes it harder for a user to spot the right cells and that’s why I stuck with 3x3.
Firstly you need a server that’ll let you do the following things:
- Store session information
- Stream out images dynamically (to make sure images can’t be traced back to filenames)
- Allow posting of form information
So you’re ok if your server runs ASP.NET, JSP, PHP or any of the other server-side languages that can do everything on the above list. Classic ASP won’t let you do this unless you have access to an imaging component.
You then need themed images. I went with kittens and other cute animals for the ultimate "awwww" factor. But you could use motorcycles and bicycles, family members and friends, or Powerpuff Girls characters and Dexter’s Lab characters… Anything that your user base will understand and be able to decipher. I have mine sorted into 2 directories: one called "1" for images that are kittens and the other called "0" for images that aren’t kittens.
When the form loads I have it build a session value from random numbers. It picks a random cell number in the 3x3 grid, checks to see if that cell has already been placed in the session (if it has, it tries again), and appends it to the session value, repeating until three cells have been chosen. So there is a value like "173" sitting in a session-accessible value (hidden from the client).
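A minimal Python sketch of that pick-and-retry loop follows. It is an illustration, not the code behind this site: the cell numbering of 1 to 9, the `new_kitten_positions` name, the `"kittens"` key and the plain dict standing in for the session store are all assumptions.

```python
import random

GRID_CELLS = 9      # 3x3 grid, cells numbered 1-9 (assumed numbering)
KITTEN_COUNT = 3    # how many cells hold a kitten

def new_kitten_positions(rng=random):
    """Return a string such as "173": three distinct cell numbers."""
    chosen = ""
    while len(chosen) < KITTEN_COUNT:
        cell = str(rng.randint(1, GRID_CELLS))
        if cell in chosen:
            continue  # already placed, try again
        chosen += cell
    return chosen

# Stand-in for the server-side session; the client never sees this value.
session = {}
session["kittens"] = new_kitten_positions()
```

`random.sample(range(1, 10), 3)` would give the same result without the retry loop, and passing `secrets.SystemRandom()` as `rng` swaps in a less predictable random source.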
You then need an image outputter that can read from this session. My image outputter takes one argument: the number in the grid that the image represents. When requested from the client browser, it looks in the session to see if its number is there. If it is, it streams out a kitten picture from the "1" directory. Otherwise it streams out a "0" file. Simple enough.
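The directory choice at the heart of the outputter might look like the Python sketch below. The `pick_image` name and its parameters are hypothetical, and it assumes single-digit cell numbers stored in a string like "173". A real handler would then stream the chosen file’s bytes with an image content type, never exposing the path to the client.

```python
import random
from pathlib import Path

def pick_image(session_value: str, cell: int, image_root: Path, rng=random) -> Path:
    """Choose a random file from "1" (kitten) or "0" (not a kitten).

    session_value is the server-side string of kitten cells, e.g. "173".
    """
    directory = "1" if str(cell) in session_value else "0"
    candidates = sorted(p for p in (image_root / directory).iterdir() if p.is_file())
    return rng.choice(candidates)
```

Because the browser only ever requests "cell 4" and gets raw image bytes back, nothing in the page source reveals which directory a picture came from.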
The page that the form submits to needs to process the post and check that the value of the hidden field (which holds the cells the user picked) has the same contents as the session. If they match, the user has successfully authenticated… If they don’t, the user was being lame or it’s a bot. After either a successful or failed attempt, the session is regenerated to randomise the location of the kittens again.
And that’s it. Nothing very stressful… Or is it? Well, most of the processing power in this is going into the random number generation. This happens in 2 parts:
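Here is a hedged Python sketch of that check. The `check_submission` name, the `"kittens"` session key and the `regenerate` callback are placeholders, and comparing the picks regardless of click order is an assumption about what "same contents" means.

```python
def check_submission(session: dict, submitted: str, regenerate) -> bool:
    """Compare the submitted cells with the session, then reshuffle.

    `submitted` is the hidden field's value, e.g. "317".
    `regenerate` is any callable returning a fresh session value.
    """
    expected = session.get("kittens", "")
    # Order-independent comparison: "317" matches "173".
    ok = bool(expected) and sorted(submitted) == sorted(expected)
    # Pass or fail, the kittens move before the next attempt.
    session["kittens"] = regenerate()
    return ok
```

Regenerating on failure as well as success matters: it means a bot cannot keep the same grid and simply walk through all 84 combinations.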
- When the session is created.
- Picking a random "0" or "1" image from their respective folders.
But as far as I can see, this is far less stressful than making a random string and then rendering it as an image on the fly. Repeat: this will cause less server load than string-images.
As an added note, to stop browsers caching the images against a particular filename (causing people to think non-kitten spots were kittens and vice versa), the system generates a random double and attaches it to the end of the image name when the form loads, but it’s completely ignored by the image outputting system.
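The cache-busting trick is small enough to sketch in Python. The `/kitten-image` path and the `cell` and `r` parameter names are invented for the example; only `cell` would be read by the image outputter.

```python
import random

def image_url(cell: int, base: str = "/kitten-image") -> str:
    """Build an image URL with a throwaway random value on the end.

    The random `r` value makes each page load request a URL the browser
    has not cached; the server ignores it.
    """
    return f"{base}?cell={cell}&r={random.random()}"

# One URL per cell in the 3x3 grid (cells numbered 1-9 in this sketch).
urls = [image_url(cell) for cell in range(1, 10)]
```

Sending a `Cache-Control: no-store` response header with each image is another way to get the same effect, and the two can be combined.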
Taking This Further
Typically, the people most likely to want this are bloggers who want to secure their comment system without making people sign up, or sending out emails to their given address and making them come back and auth each message. So the chances are that this system is going to be implemented for blogging systems.
Personally, I have the joy of having my own system and not having to work around someone else’s code, so all of this is as easy to implement as writing the code and dropping it in place of the submit button. Other people will need help getting a system working for their blogging software.
I’m looking forward to seeing some larger sites pick this up, and to seeing what people can make of it for their own blogs and/or other systems.