A captcha is a test used to determine that you are human and to screen out the bots, scrapers, and automated spam that find their way to all things good and wholesome on the internet. You see them everywhere: e-commerce sites, blogs, Hotmail, MySpace, and so on. The trouble is that more and more administrators are finding that optical character recognition (OCR) software has become better at decoding distorted text than humans with poor vision. The end result is that the most significant impact of adding a poor captcha to your site is to limit its use by your intended audience: the real humans.
Captchas came at a time when Hotmail and Yahoo were giving away free email addresses and bots were signing up for as many as 1,000 email accounts a day. The first captchas were very simple and effective at reducing or eliminating the flow of spam. However, as text-recognition software has evolved, in part to help the blind and vision impaired, it has also had the negative consequence of defeating the traditional captcha.
This splintered the world of captchas into different camps. Some used equations (“What is 1 + 1”), others used a checkbox (“check this box if you are a human”), and others used non-text solutions altogether (“select the picture of a cat”) to verify your status as “human”. Although effective, most, if not all, of these solutions can be programmed around and broken if a spammer takes the time to do so.
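To see how little effort "programmed around" can mean for the equation style, here is a minimal Python sketch of a parser for questions like “What is 1 + 1”. The function name and the set of supported operators are my own assumptions for illustration, not taken from any real bot or captcha library.

```python
import operator
import re
from typing import Optional

# Assumption: the question contains exactly one "<int> <op> <int>" expression.
_OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
_EXPRESSION = re.compile(r"(\d+)\s*([+\-*])\s*(\d+)")


def solve_arithmetic_question(question: str) -> Optional[int]:
    """Return the answer to a simple arithmetic question, or None if none is found."""
    match = _EXPRESSION.search(question)
    if match is None:
        return None  # not an arithmetic question this sketch understands
    left, symbol, right = match.groups()
    return _OPERATORS[symbol](int(left), int(right))


if __name__ == "__main__":
    print(solve_arithmetic_question("What is 1 + 1"))
```

A dozen lines are enough for the common case, which is why this kind of test only holds up while few sites use it.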
A third generation captcha that I have used with great success is a backend PHP solution that dumps form entries if hidden form fields are filled out. The strength of this solution is not that it cannot be broken, but that the form responds normally, as if the entry had been sent, so the spammer has no indication that it wasn’t. Its other main strength is that it is totally non-intrusive: the average user experiences none of the inconvenience of attempting to guess the jumbled letters in a graphical captcha.
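The hidden-field idea can be sketched in a few lines. My own version is PHP; the Python below is only an illustration of the same logic, and the field names (`website`, `fax_number`), the function names, and the in-memory list standing in for a database are all hypothetical.

```python
from typing import List, Mapping

# Hypothetical field names. These inputs are hidden from human visitors with CSS,
# so only a bot that blindly fills in every field will give them a value.
HONEYPOT_FIELDS = ("website", "fax_number")

THANK_YOU = "Thank you, your message has been sent."


def is_probable_bot(form: Mapping[str, str]) -> bool:
    """True if any hidden honeypot field came back non-empty."""
    return any(form.get(name, "").strip() for name in HONEYPOT_FIELDS)


def handle_submission(form: Mapping[str, str], saved_entries: List[dict]) -> str:
    """Store genuine entries, silently drop suspected spam, reply identically."""
    if not is_probable_bot(form):
        saved_entries.append(dict(form))
    # The response is the same in both cases, so a spammer gets no signal
    # that the entry was discarded.
    return THANK_YOU
```

For example, a submission with `website` left empty is stored, while one with `website` set to a URL is dropped, and both callers receive the same thank-you text.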
In general, there are several criteria for rating the effectiveness of a captcha:
· Accessibility. CAPTCHAs must be accessible. CAPTCHAs based solely on reading text, or on other visual-perception tasks, prevent visually impaired users from accessing the protected resource. Such CAPTCHAs may make a site incompatible with Section 508 in the United States. Any implementation of a CAPTCHA should allow blind users to get around the barrier, for example, by permitting users to opt for an audio CAPTCHA.
· Image Security. Images of text should be distorted randomly before being presented to the user. Many implementations of CAPTCHAs use undistorted text, or text with only minor distortions. These implementations are vulnerable to simple automated attacks. For example, the CAPTCHAs shown below can all be broken using image processing techniques, mainly because they use a consistent font.
· Script Security. Building a secure CAPTCHA is not easy. In addition to making the images unreadable by computers, the system should ensure that there are no easy ways around it at the script level. Common examples of insecurities in this respect include: (1) Systems that pass the answer to the CAPTCHA in plain text as part of the web form. (2) Systems where a solution to the same CAPTCHA can be used multiple times (this makes the CAPTCHA vulnerable to so-called “replay attacks”).
· Security Even After Wide-Spread Adoption. There are various “CAPTCHAs” that would be insecure if a significant number of sites started using them. An example of such a puzzle is asking text-based questions, such as a mathematical question (“what is 1+1”). Since a parser could easily be written that would allow bots to bypass this test, such “CAPTCHAs” rely on the fact that few sites use them, and thus that a bot author has no incentive to program their bot to solve that challenge. True CAPTCHAs should be secure even after a significant number of websites adopt them.
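The two weaknesses named under Script Security (sending the answer along with the form, and accepting the same solution more than once) can both be avoided by keeping the answer on the server and handing the browser only a random, single-use token. The following is a minimal Python sketch under stated assumptions: the class name `CaptchaStore`, the five-minute lifetime, the case-insensitive comparison, and the in-memory dictionary are all illustrative choices of mine, not part of any particular captcha package.

```python
import hmac
import secrets
import time
from typing import Dict, Tuple


class CaptchaStore:
    """In-memory sketch: answers stay server-side, tokens work only once."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self._ttl = ttl_seconds
        # token -> (normalized answer, expiry time on the monotonic clock)
        self._pending: Dict[str, Tuple[bytes, float]] = {}

    @staticmethod
    def _normalize(text: str) -> bytes:
        # Assumption: answers are compared ignoring case and outer whitespace.
        return text.strip().lower().encode("utf-8")

    def issue(self, answer: str) -> str:
        """Remember the answer and return a random token to embed in the form."""
        token = secrets.token_urlsafe(16)
        self._pending[token] = (self._normalize(answer), time.monotonic() + self._ttl)
        return token

    def verify(self, token: str, response: str) -> bool:
        """Check a response; the token is consumed whether or not it was right."""
        entry = self._pending.pop(token, None)  # pop() is what blocks replays
        if entry is None:
            return False
        answer, expires_at = entry
        if time.monotonic() > expires_at:
            return False
        # Constant-time comparison avoids leaking how much of the answer matched.
        return hmac.compare_digest(answer, self._normalize(response))
```

Because the form carries only the token, reading the page source reveals nothing about the answer, and because `verify` removes the token on first use, a solved captcha cannot be submitted a second time. A real site would keep the pending entries in its session store or database rather than in a dictionary.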
All the above captchas fail one test or another, including my own third generation captcha. This has driven web developers to create a more secure fourth generation captcha that combines non-textual content with simple reasoning to defeat the bots and spammers. One example of this captcha asks you to circle all the horses in a picture before pressing submit. This captcha requires that you a) know what horses are and b) can identify a whole from its various parts (in the event that a horse is partially concealed). The captcha is also programmed in Flash to prevent the script from being seen or decoded, and the Flash file could hold hundreds of randomly rotating images, making it impossible to crack, for now.
However, this assumption is based solely on the limitations of AI and computer-based cryptography. In this case the limitations in programming AI allow humans to grasp things machines cannot. In the future this will be a win-win: either the CAPTCHA is not broken and there is a way to differentiate humans from computers, or the CAPTCHA is broken and a useful AI problem is solved.



