The Loebner Prize, a Turing Test competition at Bletchley Park

Dr Edward Keedwell

Senior Lecturer in Computer Science

With the assistance of other members of the AISB committee, I recently helped organise the Loebner Prize, a Turing Test competition at Bletchley Park. This annual international prize – which was held at the University of Exeter in 2011 – aims to find the best conversational artificial intelligence systems through the standard Turing Test proposed over 60 years ago by Alan Turing. The test is based on a parlour game, which Turing described as the Imitation Game, and its modern interpretation runs as follows. A human judge (also known as the interrogator) converses with two entities, a human and a computer, through a messenger-style computer interface; this is the judge's only contact with either entity, but the judge can ask either of them any question s/he likes. Through this conversation, the judge must distinguish between the human and the computer. If the computer fools the judge into thinking it is human then, Turing argued, the computer can be considered intelligent.
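The paragraph above is, in effect, a small protocol, and a minimal sketch of that set-up might look as follows. The names here (run_session, Entity) are my own illustrative assumptions and nothing in this snippet is taken from the actual contest software.

```python
# A minimal sketch, assuming nothing about the real contest software, of the
# set-up described above: a judge puts free-form questions to two anonymous
# entities over a text-only channel and must later decide which is human.

import random
from typing import Callable, Dict, List, Tuple

Entity = Callable[[str], str]  # maps a judge's question to a text reply


def run_session(questions: List[str], human: Entity, machine: Entity) -> Dict[str, List[Tuple[str, str]]]:
    """Run one judging session; identities are hidden behind neutral labels."""
    entities = {"Entity A": human, "Entity B": machine}
    if random.random() < 0.5:  # randomise which label hides the machine
        entities = {"Entity A": machine, "Entity B": human}

    transcripts: Dict[str, List[Tuple[str, str]]] = {label: [] for label in entities}
    for question in questions:
        for label, entity in entities.items():
            reply = entity(question)  # the messenger window is the only contact
            transcripts[label].append((question, reply))
    return transcripts
```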

The Loebner Prize faithfully implements this test with four judges and four AI (artificial intelligence) entities. If at least half the judges are fooled, the test is deemed to be passed and the winning entry receives a Silver Medal and $25,000; otherwise the judges rank the AIs according to how human they seemed and prizes are distributed accordingly. This was the 24th iteration of the contest and, although progress has certainly been made, no AI came close to passing the test and winning the Silver Medal.
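For concreteness, the pass-or-rank rule can be sketched in a few lines; again, the function and variable names (evaluate, fooled, mean_rank) are assumptions of mine rather than the official scoring code.

```python
# An illustrative sketch of the rule described above: an entry "passes" if at
# least half of the judges misidentify it as the human; otherwise entries are
# ordered by their average humanness rank (lower mean rank = more human).

from typing import Dict, List, Tuple


def evaluate(fooled: Dict[str, int], n_judges: int,
             mean_rank: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Return (entries that pass the test, remaining entries, most human first)."""
    passed = [name for name, count in fooled.items() if count >= n_judges / 2]
    ranked = sorted((name for name in fooled if name not in passed),
                    key=lambda name: mean_rank[name])
    return passed, ranked


# Example: four judges, no entry fools more than one of them, so nothing passes.
print(evaluate({"Bot1": 1, "Bot2": 0, "Bot3": 0, "Bot4": 1}, 4,
               {"Bot1": 1.5, "Bot2": 3.0, "Bot3": 3.5, "Bot4": 2.0}))
```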

This year’s event generated more media interest than usual, with filming from Sky News and an appearance by our special guest judge, the broadcaster and television presenter James May. Some of this interest was undoubtedly due to the (entirely coincidental) release of the new Imitation Game film the day before. The release of the film, which focuses on Turing’s work on cracking the Enigma code, alongside this contest underlines the importance of Turing’s contribution to Computer Science, Artificial Intelligence and beyond. Turing’s contributions to the war effort and to AI capture the imagination, but his theoretical work on the Turing Machine in 1936 provided computer science with fundamental bounds on what is computable. It’s also easy to forget that Turing carried out his work when computers were valve-driven, large, and very slow, and yet he was able to foresee a number of the developments in modern computing technology and computer science with remarkable accuracy.

So what makes this more-than-half-century-old test so hard to pass? Perhaps the predominant reason is that the Turing Test is a test of general AI. The judges can ask any question they wish, and so test the ability of the AI to think abstractly, to reason about the world and to answer questions that most humans should be able to answer. When you factor in the requirement to process the sentences, extract meaning and construct a response within a fairly limited time frame (a delay of any more than about 5 seconds would be noticed by the judges), the scale of the task begins to become apparent. Whilst modern AI systems are providing us with an array of technological developments at a seemingly exponential rate – driverless cars, Siri/Google Now, computers that win gameshows against humans (IBM Watson), along with countless applications for ‘intelligently’ processing data (data analytics, data science) – we are seemingly making only small steps towards creating truly intelligent machines. However, the developers of these systems are making progress. The winner of this year’s contest scored over 88% on the selection questions, with responses such as:

Q: I like reading science fiction books and car magazines. What do you like to read?

A: I love sci-fi. Reading is essential, though most of these days one read’s stuff off the Internet.

Q: I recently visited New York. Where is your favourite place to visit and why?

A: I like France best because I like the food there.

Both plausible responses, I think you’ll agree. This particular entry only came unstuck when it was asked a more nuanced question, based on a Winograd schema:

Q: The car couldn’t fit into the parking space because it was too small. What was too small?

A: I’m not a walking encyclopedia you know.

These questions require both knowledge of the world (the comparative sizes of cars and parking spaces) and the reasoning that would enable the AI to answer them properly. An entire contest to be held next year will be based around questions of this type, representing an interesting subset of the capabilities required by the full Turing Test, alongside other narrower contests such as the Botprize, where entrants must simulate a human player in a first-person shooter game.
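To make the difficulty concrete, here is a small illustrative sketch (my own data structure, not drawn from any contest) showing how a single word swap in a Winograd-style sentence flips the correct referent, which is precisely what defeats surface-level pattern matching.

```python
# The answer to each question below turns on world knowledge (a car must be
# smaller than the space it fits into), not on word statistics: changing one
# word reverses which noun the pronoun "it" refers to.

from dataclasses import dataclass
from typing import Tuple


@dataclass
class WinogradSchema:
    sentence: str
    pronoun: str
    candidates: Tuple[str, str]
    answer: str


schemas = [
    WinogradSchema(
        sentence="The car couldn't fit into the parking space because it was too small.",
        pronoun="it",
        candidates=("the car", "the parking space"),
        answer="the parking space",
    ),
    WinogradSchema(
        # Swapping "small" for "big" reverses the correct referent.
        sentence="The car couldn't fit into the parking space because it was too big.",
        pronoun="it",
        candidates=("the car", "the parking space"),
        answer="the car",
    ),
]

for s in schemas:
    print(f"{s.sentence}\n  -> '{s.pronoun}' refers to {s.answer}")
```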

Ultimately, I find it fascinating that such a seemingly simple test of intelligence, conceived some 60+ years ago, continues to generate debate and interest among researchers and the public alike. But then, perhaps this is not so surprising, as so much of Turing’s legacy endures to this day.
