Demystifying Artificial Intelligence

Hi, and first of all, welcome to this blog! Here we will focus on explaining popular and brand-new artificial intelligence (AI) techniques and their applications. My name is Stefan and I am currently finishing my master's in Artificial Intelligence at the University of Amsterdam. Next to that, I am a freelance developer focusing on machine learning. With Black Deer we want to help companies find out what possibilities AI can offer them. So if you are interested in exploring the possibilities of AI for your business, please get in touch.

So, back to this blog post. In this post we will try to demystify AI. What is AI and how does it relate to data science? Although questions of the form 'what is phenomenon X?' should normally be easy to answer, this certainly does not hold for AI. According to Wikipedia, AI is 'intelligence demonstrated by machines'. But this raises even more questions: what is intelligence, and what counts as a machine? And if I write a simple script with a few if statements that makes my computer buy or sell cryptocurrency based on some indicators, is that AI?

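def is_this_ai(bitcoin, sell, buy, wait):
    # Toy rule-based trader: bitcoin, sell, buy and wait are supplied by the caller.
    if bitcoin.current_price > 10000:
        sell()  # price is above the hand-picked threshold
    elif bitcoin.sentiment == 'positive' and bitcoin.daily_change < 0.02:
        buy()  # positive sentiment and only a small daily change
    else:
        wait(10)  # otherwise do nothing for a while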

Basically, AI is a branch of computer science whose roots go back to the 1940s and 1950s. Alan Turing, known for his work on cracking the Enigma code used by the German military during the Second World War, proposed the Turing test in 1950. It tests a machine's ability to exhibit intelligent behaviour or, to be more precise, behaviour that cannot be told apart from a human's. After that the field developed, but it never really gained widespread traction until the arrival of deep learning in the 2000s. Powered by the rapid increase in computing power, mostly that of graphics processing units (GPUs), deep learning has put AI in the spotlight once again.

Diagram showing the relationship between AI, Machine Learning and Deep Learning

So what is deep learning? As you can see in the image above, deep learning is a subset of machine learning in which neural networks are used to learn to perform a task. OK, that is a lot of buzzwords in one sentence, so let's look at them one at a time. First of all, what is machine learning? This time Wikipedia gives a definition that is actually usable: machine learning is a field of AI that uses statistical techniques to give computer systems the ability to learn from data. So we are basically using math to teach a computer to perform a task by showing it a lot of data.
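To make 'learning from data' a bit more concrete, here is a minimal sketch of about the simplest statistical technique there is: fitting a straight line with least squares. The sunshine and ice cream numbers are made up for illustration, and `fit_line` is just a name chosen here. Unlike the if statements in the trading script, nobody types in the slope by hand: it is computed from the examples.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y and variance of x (both without the 1/n factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: hours of sunshine per day vs. ice creams sold.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
sold = [12.0, 19.0, 31.0, 42.0, 48.0]

slope, intercept = fit_line(hours, sold)

# Use the learned line to predict sales for a day with 6 hours of sunshine.
prediction = slope * 6.0 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={prediction:.1f}")
```

With more data the fitted line would change, which is exactly the point: the behaviour comes from the data and not from hand-written rules.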

Secondly, neural networks. You have probably come across this term on the landing page of just about any new tech startup. Neural networks or, to be more precise, artificial neural networks (ANNs) are computing systems inspired by the neurons in the human brain and the connections between them. Let's look at the example below. The neurons are the purple circles, and you can see that they are connected to each other by small arrows. We feed, for instance, an image of a dog into the network, and the flow of activations ends up at the right-hand side, where the network classifies the image as containing a dog or a cat. By feeding this network enough images of both cats and dogs and telling it the correct answer each time, the network can learn to recognize them. So, going back to the definition of machine learning, this system is learning from data.

Animation of a neural network processing data
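The idea of showing examples together with the correct answer can be sketched with a single artificial neuron. This is only an illustration under some loud assumptions: we use two made-up numeric features per animal where a real network would look at images, a sigmoid neuron, and simple per-sample gradient descent. The names `train_neuron` and `predict` are chosen for this sketch and do not come from any library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Output of one sigmoid neuron: a value between 0 (cat) and 1 (dog)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_neuron(samples, labels, epochs=1000, lr=0.5):
    """Train one sigmoid neuron with per-sample gradient descent on the log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # Difference between the neuron's answer and the correct answer.
            error = predict(weights, bias, x) - y
            # Nudge every weight a little in the direction that reduces the error.
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Made-up features: (body weight in units of 10 kg, ear pointiness from 0 to 1).
# Label 1 means dog, label 0 means cat.
samples = [(0.4, 0.9), (0.5, 0.8), (0.3, 1.0), (2.5, 0.3), (3.0, 0.2), (1.8, 0.4)]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_neuron(samples, labels)
print(predict(weights, bias, (2.8, 0.3)))  # a heavy animal with floppy ears
print(predict(weights, bias, (0.4, 0.9)))  # a light animal with pointy ears
```

After training, the first value should be close to 1 and the second close to 0. Real networks do the same thing with many more neurons and many more examples.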

Types of Neural Networks

The rise of deep learning has led to a vast arsenal of different kinds of artificial neural networks. Let's have a look at a few of them:

Multilayer Perceptrons (MLPs)

The MLP is the most basic form of an ANN, and the example above is actually an MLP. Every connection has a weight associated with it, and the input (the value in the sending neuron) gets multiplied by this weight. This is done for all incoming connections, and together the weighted inputs determine the activation of the next neuron. MLPs can be used for basically any problem, but they are often outperformed by more specialized networks such as the next two.
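In code, the weighted connections described above boil down to a few sums. The sketch below is a forward pass through a tiny MLP in plain Python. Note the assumptions: the weights are hand-picked and untrained, and the bias terms and the sigmoid activation function are common choices that we add here for completeness. The function names are our own.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Each neuron sums its weighted inputs, adds a bias and applies a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def mlp_forward(inputs, layers):
    """Pass the activations through the layers one after the other."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hand-picked (untrained) weights: 2 inputs -> 3 hidden neurons -> 1 output neuron.
# Each inner list holds the weights of the connections coming into one neuron.
hidden = ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1])
output = ([[1.0, -1.0, 0.5]], [0.2])

result = mlp_forward([1.0, 2.0], [hidden, output])
print(result)
```

Training an MLP means adjusting these weights and biases until the outputs match the correct answers; the forward pass itself stays the same.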

Convolutional Neural Networks (CNNs)

Although they were introduced well before the 2000s, CNNs are among the best-known ANNs. The core of these networks is a mathematical operation called a convolution, although what is actually computed is the closely related cross-correlation. This operation is extremely useful for finding local patterns in data, such as edges among the pixels of an image. This is also the reason why these kinds of networks are most often used on images, for instance for object detection.

Animation showing how a CNN processes an image
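To show what the cross-correlation at the heart of a CNN does, here is a minimal sketch in plain Python. The 4x4 'image' and the 2x2 kernel are made up for this example, and `cross_correlate2d` is our own helper, not a library function. In a real CNN the kernel values are learned from data; here we pick them by hand so that the kernel reacts to a vertical edge.

```python
def cross_correlate2d(image, kernel):
    """'Valid' 2D cross-correlation: slide the kernel over the image without flipping it."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel with the image patch underneath it and sum up.
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# A tiny image: dark (0) on the left half, bright (1) on the right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-made kernel that responds to a dark-to-bright step from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = cross_correlate2d(image, kernel)
for row in feature_map:
    print(row)
# Each row prints [0, 2, 0]: the kernel only responds where the edge is.
```

Because the same small kernel is reused at every position, the network can find the pattern no matter where in the image it appears.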

Recurrent Neural Networks (RNNs)

In most ANNs we assume that each input is independent of the previous one. This is a poor assumption when we, for instance, try to predict the next word in a sentence. RNNs can be thought of as having a memory, so they can use information about the previous word, and the word before that, to predict the next word. A special kind of RNN is the long short-term memory (LSTM) network, which works with so-called gates. Each cell has an input gate through which it receives information, a forget gate with which it determines what information to forget, and an output gate that determines what information to pass on. These models are often used for problems in which the sequential nature of the task is important, for example stock price prediction, part-of-speech tagging, and sentiment analysis.
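The three gates can be sketched for the smallest possible LSTM cell, one with a single number as input and a single number as memory. This follows the standard LSTM equations, which also include a 'candidate' value for the new information; that term is not mentioned in the text above. The parameters below are made up and untrained, and `lstm_step` is a name chosen for this sketch.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell (input size 1, hidden size 1)."""

    def pre_activation(name):
        # Each gate looks at the current input and the previous output.
        w, u, b = params[name]
        return w * x + u * h_prev + b

    input_gate = sigmoid(pre_activation("input"))     # how much new information to let in
    forget_gate = sigmoid(pre_activation("forget"))   # how much of the old memory to keep
    output_gate = sigmoid(pre_activation("output"))   # how much of the memory to pass on
    candidate = math.tanh(pre_activation("candidate"))  # the new information itself

    c = forget_gate * c_prev + input_gate * candidate  # updated memory (cell state)
    h = output_gate * math.tanh(c)                     # output passed to the next step
    return h, c


# Made-up, untrained parameters: (input weight, recurrent weight, bias) per gate.
params = {
    "input": (0.5, 0.1, 0.0),
    "forget": (0.3, 0.2, 1.0),
    "output": (0.4, 0.1, 0.0),
    "candidate": (0.8, 0.3, 0.0),
}

h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.2]:
    # The output and memory of one step are fed into the next step.
    h, c = lstm_step(x, h, c, params)
    print(f"x={x:+.1f}  h={h:+.3f}  c={c:+.3f}")
```

The memory `c` is what lets the cell carry information from earlier inputs to later ones. In practice you would use a framework implementation with vectors and trained weights.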

AI vs Data Science

A common misconception is that AI and data science are exactly the same thing. Although they are similar, they are definitely not the same. Data science is about extracting insights from large data sources. Let's say you work as a data scientist at a retail company and your boss wants you to find out how much of the company's revenue comes from people with a member card and whether these people also shop online. She gives you access to all of the company's databases and off you go. You first use something like SQL to filter the relevant data and join the databases of the stores and the webshop, so that you have all relevant information in one place. Then you write a script to find out what percentage of the sales comes from members and how many of those members also appear in the sales data of the webshop. Job done!
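A rough sketch of that workflow, using Python's built-in `sqlite3` module and an in-memory database, could look as follows. The table names, column names and sales records are all invented for this example, and we assume that a sale without a member card simply has no `member_id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store_sales (sale_id INTEGER PRIMARY KEY, member_id INTEGER, amount REAL);
    CREATE TABLE webshop_sales (sale_id INTEGER PRIMARY KEY, member_id INTEGER, amount REAL);
    INSERT INTO store_sales (member_id, amount) VALUES
        (1, 20.0), (2, 35.0), (NULL, 15.0), (1, 10.0), (NULL, 20.0);
    INSERT INTO webshop_sales (member_id, amount) VALUES
        (1, 50.0), (3, 25.0), (NULL, 30.0);
""")

# Share of the in-store revenue that comes from member card holders.
member_revenue, total_revenue = conn.execute("""
    SELECT SUM(CASE WHEN member_id IS NOT NULL THEN amount ELSE 0 END), SUM(amount)
    FROM store_sales
""").fetchone()
member_share = 100.0 * member_revenue / total_revenue

# Number of distinct members who bought something in a store.
(members_in_store,) = conn.execute(
    "SELECT COUNT(DISTINCT member_id) FROM store_sales WHERE member_id IS NOT NULL"
).fetchone()

# Join stores and webshop to find the members who appear in both.
(members_in_both,) = conn.execute("""
    SELECT COUNT(DISTINCT s.member_id)
    FROM store_sales AS s
    JOIN webshop_sales AS w ON w.member_id = s.member_id
""").fetchone()

print(f"{member_share:.1f}% of store revenue comes from members")
print(f"{members_in_both} of {members_in_store} store members also shop online")
conn.close()
```

Note that every step here was decided by a person: which tables to join, what to count and how to read the result. Nothing is learned from the data.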

Although this is just one simple example from the large and challenging field of data science, it illustrates the difference between data science and AI. Whereas in AI we try to teach a machine to extract patterns from data or to learn a task from data, in data science it is people who find the patterns or the relevant information.

In future posts, we will take all of these models and show you what you can do with them. We will be using popular deep learning frameworks such as PyTorch, TensorFlow and Keras, and apart from tutorials we will have showcases in which we demonstrate the use of AI in real-world applications. So the purpose of this blog is twofold. First of all, we would like to help interested readers understand deep learning models and their applications. Secondly, we want to show how complex models can help solve real-world problems you might not even be aware of. We believe that the gap between science and business is extremely large in the field of AI and we will do our best to make it smaller!