The artificial brain – demystifying neural networks
Nowadays, they are everywhere. Netflix uses them to determine what your favorite series might be. Facebook uses them to recognize and tag its users in images. Google uses them for translation, to interpret spoken language, and even to build a chess-playing bot that can defeat any of its human or artificial predecessors. They are even used in hospitals, for example to identify diseases from medical images, and in the financial sector they have become indispensable. In recent years, they have become the most celebrated as well as the most notorious algorithm within the realm of Artificial Intelligence: artificial neural networks. But what exactly are these ‘neural networks’?
An artificial neural network is a type of algorithm that helps computers learn and make decisions. Even though these algorithms have become immensely popular in recent years, the idea behind them is almost as old as the computer itself! All the way back in 1943, neuroscientist Warren McCulloch and mathematician Walter Pitts wrote about the idea that would develop into the neural networks we know and value today. It may come as a surprise, but their idea is as ingenious as it is simple: if we want to build intelligent computers, wouldn’t a great starting point be to try to replicate the workings of our own brains in computers?
In fact, that is exactly what neural networks do: they replicate the thinking process in our brains, to the extent that we understand it. In simplified terms, it works as follows. The human brain contains nerve cells, also called ‘neurons’. These neurons are cells that can both store and process information. What’s more, one neuron can receive signals from multiple sources. If the total strength of these signals exceeds a particular threshold, the neuron becomes activated. The neuron then ‘fires off’ its own signal, which can in turn be received by other neurons to, for example, contract certain muscles. The neurons are thus connected to each other and thereby form a network: the neural network of our brain. In a network with a vast number of neurons, these simple building blocks can jointly recognize complex patterns, solve problems and make decisions. This forms the basis of our intelligence.
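The threshold behaviour described above can be sketched in a few lines of code. The signal strengths and the threshold value below are purely illustrative choices, not anything prescribed by the brain or by the original McCulloch-Pitts paper:

```python
def neuron(signals, threshold):
    """A simplified artificial neuron: it 'fires' (returns 1) when the
    total strength of its incoming signals reaches the threshold,
    and stays silent (returns 0) otherwise."""
    return 1 if sum(signals) >= threshold else 0

# Two incoming signals that together are too weak: the neuron stays silent.
print(neuron([0.4, 0.5], threshold=1.0))  # -> 0

# A slightly stronger second signal pushes it over the threshold: it fires.
print(neuron([0.4, 0.7], threshold=1.0))  # -> 1
```

The entire ‘decision’ of such a neuron comes down to one comparison, which is exactly why it is so easy to simulate in a computer.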
Given that individual neurons work according to a relatively simple principle, McCulloch and Pitts figured that we could simulate such a network within a computer, and in doing so replicate human intelligence. Let’s take a simple neural network as an example.
Imagine that we show a computer an image of two-by-two pixels, each of which can be either black or white. The ‘brain’ of this computer, a neural network, is focused on detecting chessboard patterns. This means there are two options: a black diagonal from the upper left to the bottom right, or a black diagonal from the upper right to the bottom left. This could, for example, be a network composed of 11 neurons. See the figure below, which depicts such a neural network. We use the four neurons at the beginning of the network to examine the four pixels. They ‘fire’ if they see a black pixel, and don’t ‘fire’ if they see a white one. These neurons are connected to the four neurons in the subsequent layer of the network, which are in turn connected to the two neurons in the third layer. The final layer consists of one neuron: this one needs to ‘fire’ if the chessboard pattern is recognized, and won’t ‘fire’ if it sees an image without such a pattern.
Example of a neural network that recognizes a chessboard pattern in a four-pixel image.
In the first layer, each neuron recognizes exactly one pixel. In the second layer, the various diagonals are detected by summarizing the signals from the first layer. In the third layer, the signals of the second layer are then combined to form the two possible chessboard patterns. These then activate the final neuron if they ‘fire’. Every subsequent layer of this neural network thereby leads to an increasingly complicated pattern: from one pixel, to diagonals, to a chessboard pattern.
The figure highlights what happens if the neural network receives the pattern depicted above. In the first layer, the first two neurons ‘fire’, and the bottom two do not. After all, these neurons do not see a black pixel at the location they are examining. In the second layer, the first neuron is trained to ‘fire’ if the first two neurons do not ‘fire’: it is looking for a white diagonal. The second neuron in the second layer does ‘fire’, as it is looking for a black diagonal. Similarly, we see that the third neuron in the second layer does ‘fire’ and the fourth does not. Because the second and third neurons in the second layer ‘fire’, the first neuron in the third layer receives the indication that the chessboard pattern it is trained to identify is present in the image. The first neuron in the third layer ‘fires’, and the final neuron can make the decision: yes, we see a chessboard pattern here! By the same reasoning, an image with a black pixel at the bottom left and white pixels elsewhere would lead to only the first neuron in the first layer ‘firing’. Consequently, none of the neurons in the second layer ‘fire’, so none of the neurons in the third layer ‘fire’ either. The final neuron therefore also does not ‘fire’, and the decision is made: this is not a chessboard pattern!
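The walkthrough above can be turned into a small working program. The sketch below is one possible wiring of the 11 neurons; the exact order of the neurons in the figure may differ, but the layer-by-layer logic is the same. The weights and thresholds are hand-picked for illustration, just as they are in the figure, rather than learned from data:

```python
def neuron(inputs, weights, threshold):
    """A McCulloch-Pitts-style neuron: it 'fires' (returns 1) when the
    weighted sum of its incoming signals reaches the threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

def detects_chessboard(pixels):
    """pixels = [top_left, top_right, bottom_left, bottom_right],
    where 1 means black and 0 means white."""
    # Layer 1: one neuron per pixel, firing on a black pixel.
    tl, tr, bl, br = [neuron([p], [1], 1) for p in pixels]

    # Layer 2: the diagonal detectors. Negative weights let a neuron
    # fire precisely when its diagonal is entirely white.
    a_white = neuron([tl, br], [-1, -1], 0)  # TL-BR diagonal all white
    a_black = neuron([tl, br], [1, 1], 2)    # TL-BR diagonal all black
    b_white = neuron([tr, bl], [-1, -1], 0)  # TR-BL diagonal all white
    b_black = neuron([tr, bl], [1, 1], 2)    # TR-BL diagonal all black

    # Layer 3: the two possible chessboard patterns, each requiring
    # one black diagonal AND one white diagonal.
    pattern1 = neuron([a_black, b_white], [1, 1], 2)  # black TL-BR diagonal
    pattern2 = neuron([b_black, a_white], [1, 1], 2)  # black TR-BL diagonal

    # Final layer: fires if either pattern was recognized.
    return neuron([pattern1, pattern2], [1, 1], 1)

print(detects_chessboard([1, 0, 0, 1]))  # chessboard pattern   -> 1
print(detects_chessboard([0, 1, 1, 0]))  # the other chessboard -> 1
print(detects_chessboard([0, 0, 1, 0]))  # one black pixel      -> 0
```

Counting them up, this network indeed uses 4 + 4 + 2 + 1 = 11 neurons, and every ‘decision’ it makes is nothing more than a cascade of simple threshold comparisons.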
In this example, each neuron knows in advance what to look out for in order to ‘fire’. This is typically not predetermined in networks used in Artificial Intelligence. In those networks, part of the intelligence lies in the fact that the neurons learn for themselves which patterns they should recognize by interpreting data. The exact workings behind this are dealt with in another chapter. The current network is focused on a relatively simple problem: chessboard pattern or no chessboard pattern. It therefore consists of only a limited number of layers and neurons. By introducing many more neurons, layers or decision rules, a neural network can be greatly extended and can learn to identify highly complex patterns. Increasingly difficult decisions can then be made, making neural networks highly ‘intelligent’ algorithms.
A severe disadvantage of neural networks, however, is that they can become so complex that it is no longer feasible to understand which patterns are identified by which neurons. In the example above we could still create a simple sketch, but in practice neurons are trained automatically by looking at data, and the patterns they recognize become so intricate that such an illustration is no longer possible. Furthermore, hundreds or even thousands of neurons and layers are needed to solve a truly complicated problem. To give you an idea of the scale: the neural network within our human brain contains an estimated 100 billion neurons. The brain of an ant, by comparison, contains ‘only’ around 250 thousand neurons.
The fact that many neurons are needed to make a neural network intelligent is just one of the reasons it took so long for artificial neural networks to be widely applied. To build a neural network in a computer, the neurons within that network all need to be ‘trained’ to process the correct pieces of information and, in turn, to signal the appropriate neurons. Achieving this requires advanced computers which, with substantial computing power, can determine how exactly these neurons should work. Over the 20th and 21st centuries, computers have become significantly faster and more powerful. Techniques like neural networks are now reaping the benefits of these developments, allowing them, only recently, to be implemented at scale.
The artificial neural networks now being built in computers to replicate intelligence are extremely versatile. By varying the number and type of neurons, the distribution of neurons across the layers and the connections between the neurons, various types of networks arise, each with their own advantages and disadvantages, potential and limitations. The neural network used by Google to play chess looks nothing like the neural network which contributes to Netflix’s recommendations. Nevertheless, all neural networks have the same origin: at the end of the day, they were built with the intention of replicating the human brain. This can be a somewhat comforting thought. The most successful techniques for teaching computers how to think are built around the same principles at work within our own ‘simple’, human brain.
You have just read a chapter of our upcoming book on AI stories: A must-read for anyone interested in the exciting field of artificial intelligence, regardless of how much you may or may not already know. Over the following weeks we will continue to post the rest of this chapter as well as more exciting content from the book on all kinds of AI topics and applications, from artificial neural networks to self-driving cars to computer-generated art. Follow us on LinkedIn for more inspiring AI stories!