How close are we to creating an ‘intelligent’ computer? Could a computer ever really be described as ‘thinking’? As a computer scientist, I find these kinds of questions fascinating, and it is for that reason that I arranged for the University of Exeter to host this year’s Loebner Prize on 19 October.
The Loebner Prize is an international contest in which entrants compete to create the first computer program that can be described as ‘intelligent’. The contest follows the test Alan Turing devised over 60 years ago, based on a parlour game known as the ‘imitation game’, which requires an artificial intelligence program to convince a human judge that they are conversing with another human being rather than a machine. The test has been the subject of much debate over the years, with John Searle’s ‘Chinese Room Argument’ the most prominent objection to the imitation game. However, despite these arguments, and over 60 years of rapid technological development, the Turing Test remains as difficult to pass as it has ever been.
On the face of it, the test is very simple: 25 minutes of conversation between a human judge, a machine and a human confederate, from which the judge must decide which is which. If the judge cannot distinguish them, or selects incorrectly, then the machine can be said to be ‘thinking’. However, this simplicity belies an immensely challenging problem at the core of the test: holding a conversation with a human for a significant length of time when he or she is free to discuss any aspect of life. Encoding a lifetime’s worth of human experience into a machine is a hugely challenging task, and developers are now using the internet to record and digest millions of interactions with humans per day to improve their programs’ ability to deal with this problem of scope (Rollo Carpenter’s Cleverbot is a notable example of this – try it for yourself at http://cleverbot.com/).
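Cleverbot’s internals are not published, but the broad idea of answering by retrieving from a large log of past conversations can be sketched in a few lines. Below is a minimal, purely illustrative Python sketch – the class, the word-overlap matching rule and the sample exchanges are my own assumptions, not Carpenter’s method – in which each input is answered with the stored human reply whose original prompt shares the most words with it.

```python
# Minimal, illustrative sketch of a retrieval-style chatbot in the spirit of
# systems that learn from logged conversations (Cleverbot's real method is
# not public; the matching rule and sample data here are assumptions).
import re

def tokens(text):
    """Lowercase word set for crude bag-of-words matching."""
    return set(re.findall(r"[a-z']+", text.lower()))

class RetrievalBot:
    def __init__(self):
        self.log = []  # (human utterance, human reply) pairs seen so far

    def record(self, utterance, reply):
        """Digest one observed human exchange into the corpus."""
        self.log.append((utterance, reply))

    def respond(self, utterance):
        """Reply with the logged answer whose prompt best matches the input."""
        query = tokens(utterance)
        prompt, reply = max(
            self.log,
            key=lambda pair: len(query & tokens(pair[0])),
            default=(None, "Tell me more."),
        )
        return reply

bot = RetrievalBot()
bot.record("What is your favourite colour?", "Blue, I think. Yours?")
bot.record("Do you like music?", "I do - mostly jazz.")
print(bot.respond("What's your favourite colour then?"))  # Blue, I think. Yours?
```

Real systems operate at a vastly larger scale and with far more sophisticated matching, but even this toy version shows why more recorded conversations directly widen the range of topics a program can handle.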
An interesting development at this year’s contest is the introduction of the Junior Loebner Prize, where the adult judges will be replaced with students from a local school. With less experience to draw from, will the younger judges be less discriminating than their elder counterparts? I’m not so sure. During the selection stage, a number of the student teams were able to distinguish the machine from the human with great accuracy and speed, and it will be fascinating to see if a generation that has grown up with daily exposure to computers is able to outperform their more experienced counterparts.
The question of whether the Turing Test is still the benchmark for intelligent machines after so long is an interesting one, and one brought into sharper focus by the celebrations of Turing’s centenary due to take place next year at Bletchley Park. The (British) Society for the Study of Artificial Intelligence and the Simulation of Behaviour (AISB) has held symposia for two years running to consider whether there might be an alternative to the traditional type-written Turing Test, with a number of interesting developments. Some recent efforts have focussed on testing AI within virtual environments of varying complexity: the BotPrize requires the AI to play the game Unreal Tournament indistinguishably from a human, and MarioAI, as its name suggests, requires the same playing skill within the Super Mario Brothers game. A somewhat different approach developed at Exeter, but one that still requires the processing of a virtual environment, is the Reference Object Selection Test (again, you can try it for yourself at http://www.newscientist.com/article/dn20905-take-the-visual-turing-test.html), where the AI must determine the relationships between objects within a scene. It is clear from these more recent developments that there is a move towards tests with a stronger visual component, but these tests are also narrower in scope than the original: because they all operate in heavily constrained virtual environments, they examine only a small portion of what we might call intelligent behaviour.
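To give a flavour of the kind of reasoning such a test demands, here is a small, hypothetical Python sketch. The scene, the relations and the distance threshold below are invented for illustration and are not taken from the actual test: given coordinates for the objects in a scene, the code derives qualitative relations such as ‘left of’ and ‘near’ and selects the objects that satisfy them.

```python
# Illustrative sketch of the spatial reasoning a reference-object test probes
# (the scene, relations and threshold are invented for illustration, not
# taken from the actual Reference Object Selection Test).
from math import dist  # Python 3.8+

scene = {              # object name -> (x, y) position in the scene
    "ball": (1.0, 2.0),
    "box":  (4.0, 2.1),
    "lamp": (4.2, 6.0),
}

def left_of(a, b):
    return scene[a][0] < scene[b][0]

def below(a, b):
    return scene[a][1] < scene[b][1]

def near(a, b, threshold=3.5):
    return dist(scene[a], scene[b]) <= threshold

def related(target, relation):
    """All objects standing in `relation` to `target`, e.g. everything
    the ball is to the left of."""
    return [name for name in scene if name != target and relation(target, name)]

print(related("ball", left_of))  # ['box', 'lamp']
print(related("box", near))      # ['ball']  (the lamp is too far away)
```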
For me, the Turing Test remains the benchmark for intelligence broadly considered, but the more restricted tests have their role to play too. As constrained tests, they can drive the development of better AI by establishing milestones along the road towards true machine intelligence that might be passed in the near future, even if we are still some way off from reaching the final destination.
Posted by Dr Ed Keedwell (Mathematics and Computer Science)