A CAPTCHA deters automated spam programs by displaying a distorted image that humans can read easily but that computer "bots" find difficult to decipher. Most CAPTCHA programs use randomly generated text, but the Carnegie Mellon technique draws its text from old books that are being digitized, using words that scanning software could not automatically read and interpret. The idea is that, in answering the CAPTCHA challenge, a person is also helping to digitize a book that computers could not digitize on their own.
Because the distorted text is selected from words that state-of-the-art scanning software has already failed to recognize, the system is intrinsically difficult for spammers to circumvent. "Firstly, we are starting with words that we know our computers can't read," says Luis von Ahn. "These words have also been distorted naturally over time, and the number of ways they have been distorted is very large."