Artificial intelligence (AI) is not a new concept. Although the term has surged into the public consciousness in the last few years, academics have been studying the field since the 1950s. What has changed is the technology to turn theoretical insights into practical applications. Advances in storage and computing power, and the explosion of data after the worldwide expansion of the Internet, have moved algorithms and automation out of universities and R&D departments and into the mainstream, affecting all parts of society.
This explainer is for politicians and policy professionals who want to understand more about where things currently stand. It includes a short primer on the basic terms and concepts, and a summary of the main issues and challenges in the policy debate.
The explosion of interest in AI has been accompanied by an explosion of buzzwords and jargon. Many of these terms are used interchangeably, and it’s not always clear how they relate to each other. This explainer starts with the basics: first, the broad concept of artificial intelligence; second, some of the main approaches to machine learning; and third, how deep neural networks can be used to handle even very complex problems. These three concepts can be conceived of as subsets of one another (see figure 1).
Figure 1: A Schematic Representation of AI, Machine Learning and Deep Learning
AI is a large topic, and there is no single agreed definition of what it involves, although most definitions overlap more than they differ.
Broadly speaking, AI is an umbrella term for the field in computer science dedicated to making machines simulate different aspects of human intelligence, including learning, decision-making and pattern recognition. Some of the most striking applications, in fields like speech recognition and computer vision, are things people take for granted when assessing human intelligence but have been beyond the limits of computers until relatively recently.
The term “artificial intelligence” was coined in 1956 by mathematics professor John McCarthy, who wrote:
The study is to proceed on the basis of the conjecture that every aspect of learning and any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.
The approach to achieving this ambitious goal has changed over the years. Researchers in the 1950s focused on the direct simulation of human intelligence, for example by attempting to specify the rules of learning and intelligence explicitly in computer code. Research today tends to rely on less precisely defined systems that improve with experience. The aim is to build intelligence that can benefit society and help solve human problems by understanding the principles of human intelligence, even without replicating the exact structure of the brain.
A good place to gain an intuitive understanding of the difference between these two approaches is the progress made by AI researchers on the games of chess and Go. In 1997, IBM’s Deep Blue system beat the then world chess champion, Garry Kasparov. Deep Blue was a rules-based system (sometimes described as hard coded) that took the rules provided by its programmers and used its immense computational capacities to look as far ahead as possible in the set of potential moves and countermoves to weigh up the best course of action.
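The look-ahead idea can be illustrated with a much smaller game than chess. The sketch below is a toy, not Deep Blue's actual method: it assumes a simple stone-taking game (players alternately take one, two or three stones, and whoever takes the last stone wins), and the function names `can_win` and `best_move` are invented for this illustration. The programmer supplies the rules; the program only searches through moves and countermoves.

```python
from functools import lru_cache

# Hypothetical toy game: take 1, 2 or 3 stones per turn;
# the player who takes the last stone wins.
MOVES = (1, 2, 3)


@lru_cache(maxsize=None)
def can_win(stones: int) -> bool:
    """Return True if the player about to move can force a win."""
    for take in MOVES:
        # A move is winning if it leaves the opponent in a losing position.
        if take <= stones and not can_win(stones - take):
            return True
    # No winning move exists (this includes the case of zero stones left).
    return False


def best_move(stones: int) -> int:
    """Look ahead through every line of play and pick a winning move if one exists."""
    for take in MOVES:
        if take <= stones and not can_win(stones - take):
            return take
    return 1  # every move loses against perfect play, so just take one stone


if __name__ == "__main__":
    for pile in (4, 7, 10):
        print(pile, can_win(pile), best_move(pile))
```

Nothing in this program is learned: its play is only as good as the rules written into it and the depth to which it can search.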
That approach has been far less successful in Go, a game that has significantly more potential futures and requires a higher level of strategy and intuition. In 2016, DeepMind’s AlphaGo beat the world champion, Lee Sedol. In contrast to Deep Blue, AlphaGo did not start with pre-structured knowledge of the game. It still used immense computational force but learned by developing its own structure for understanding the game, based on previous matches played by humans as well as itself. AlphaGo was subsequently defeated by a new iteration called AlphaGo Zero, which was trained entirely through playing against itself, with no data on human matches.
AlphaGo is an example of an approach to AI known as machine learning (ML). This approach was formalised in 1959 as the field of computer science dedicated to making machines learn by themselves without being explicitly programmed. ML systems progressively improve at specific tasks based on experience, usually in the form of previous or historical data. In the seminal paper that first defined this term, computer scientist Arthur L. Samuel explained the motivation behind the approach:
There is obviously a very large amount of work, now done by people, which is quite trivial in its demands on the intellect but does, nevertheless, involve some learning. We have at our command computers with adequate data-handling ability and with sufficient computational speed to make use of machine-learning techniques, but our knowledge of the basic principles of these techniques is still rudimentary. Lacking such knowledge, it is necessary to specify methods of problem solution in minute and exact detail, a time-consuming and costly procedure. Programming computers to learn from experience should eventually eliminate the need for much of this detailed programming effort.
The basic idea that guides machine learning is that, armed with enough relevant data, a programmer can train an algorithm to perform a task rather than having to spell out every rule for it. Rapid advances in data storage and computational speed, and the curation of better and larger data sets on which to train ML systems, have resulted in significant progress in applying this technique to an ever-growing range of problems.
Each ML application has three main components: representation, evaluation and optimisation. Representation involves picking the right model (for example, neural network or decision tree) to represent knowledge of the problem. Evaluation is the estimate used to tell how good a model is at a certain task while training or testing it (for instance, how many false positives or false negatives the system had). Finally, optimisation means choosing from multiple techniques to improve the model against the evaluation standard chosen in the previous step.
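For readers who find a concrete case helpful, the three components can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a recipe for a real system: the data set is invented, the model is a single multiplier `w`, the evaluation measure is mean squared error, and the optimisation technique is plain gradient descent. The function names `predict`, `evaluate` and `optimise` are chosen here only to mirror the three terms.

```python
# Invented training data: each output is exactly twice its input.
DATA = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]


def predict(w: float, x: float) -> float:
    """Representation: the model is a straight line through the origin, y = w * x."""
    return w * x


def evaluate(w: float, data) -> float:
    """Evaluation: mean squared error between predictions and known answers."""
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)


def optimise(data, steps: int = 200, learning_rate: float = 0.01) -> float:
    """Optimisation: gradient descent nudges w in the direction that lowers the error."""
    w = 0.0
    for _ in range(steps):
        # Derivative of the mean squared error with respect to w.
        gradient = sum(2 * (predict(w, x) - y) * x for x, y in data) / len(data)
        w -= learning_rate * gradient
    return w


if __name__ == "__main__":
    w = optimise(DATA)
    print("error before training:", evaluate(0.0, DATA))
    print("learned w:", w, "error after training:", evaluate(w, DATA))
```

Swapping any one component (a decision tree instead of a line, a count of false positives instead of squared error, a different search technique) changes the system without changing this overall structure.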
There are also three main approaches to ML: supervised learning, where a system is trained on examples that humans have labelled with the correct answer; unsupervised learning, where a system looks for patterns in data that has not been labelled; and reinforcement learning, where a system improves iteratively through trial and error, guided by rewards and punishments.
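The difference between the three approaches lies mainly in what the system is given to learn from. The sketch below illustrates this with deliberately tiny, invented examples; the data, the function names and the particular techniques (averaging per label, a simple two-group clustering, and an explore-or-exploit strategy with made-up reward probabilities) are assumptions chosen for brevity, not the only way to implement each approach.

```python
import random

# Supervised: every example comes with a label supplied by a person.
LABELLED = [(1.0, "small"), (1.2, "small"), (0.8, "small"),
            (5.0, "large"), (5.3, "large"), (4.7, "large")]


def fit_supervised(examples):
    """Learn the average value associated with each human-provided label."""
    sums, counts = {}, {}
    for value, label in examples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classify(centres, value):
    """Assign a new value to the label whose average is closest."""
    return min(centres, key=lambda label: abs(centres[label] - value))


# Unsupervised: the same values without labels; the system finds two groups itself.
def fit_unsupervised(values, rounds: int = 10):
    low, high = min(values), max(values)
    for _ in range(rounds):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return low, high


# Reinforcement: no examples at all, only rewards received after each action.
def fit_reinforcement(reward_prob, trials: int = 2000, explore: float = 0.1, seed: int = 0):
    rng = random.Random(seed)
    estimates = [0.0] * len(reward_prob)
    pulls = [0] * len(reward_prob)
    for _ in range(trials):
        if rng.random() < explore:
            action = rng.randrange(len(reward_prob))  # occasionally try something at random
        else:
            action = max(range(len(reward_prob)), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < reward_prob[action] else 0.0
        pulls[action] += 1
        # Update the running average reward for the chosen action.
        estimates[action] += (reward - estimates[action]) / pulls[action]
    return estimates


if __name__ == "__main__":
    centres = fit_supervised(LABELLED)
    print(classify(centres, 4.0))
    print(fit_unsupervised([value for value, _ in LABELLED]))
    print(fit_reinforcement([0.2, 0.8]))
```

In the supervised case the labels "small" and "large" come from a person; in the unsupervised case the system finds the same two groups but has no names for them; in the reinforcement case it is told only whether each action paid off.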