Hello guys!
Welcome to the first writeup in this series, an introduction to the field of Artificial Intelligence. AI is all around us.
Those of us who are active on social media interact with AI every day. AI recommends the videos you see in your YouTube feed and translates languages when we use Google Translate. On an Android phone or iPhone, we can access virtual assistants like Siri and Google Assistant, which are capable of recognizing speech, understanding our intent, and providing us with the right answers or actions.
This blog series was created to provide the absolute beginner with a foundation in AI and answer questions on how to get started on a journey towards becoming an AI developer.
In this blog post, I will cover some of the important concepts you need to build a basic understanding of artificial intelligence. The words Artificial Intelligence usually conjure up vivid images of robots or super-intelligent machines capable of doing amazing and unimaginable things, thanks to the numerous fictional AI movies we have been exposed to. You can probably relate to this if you have seen a movie like Ex Machina.
But what exactly does AI mean?
Back in our primary school computer science classes, our teachers usually mentioned that computers operate on the principle of ‘garbage in, garbage out’. In other words, computers were basically dumb entities.
Without step-by-step instructions on how to achieve a task, they are incapable of completing it. If a single step in a sequence of instructions is omitted or written incorrectly, the computer will most likely fail to execute or will return the wrong results. On their own, machines, unlike humans, lack the capability to reason, make decisions, or learn from mistakes. In an attempt to make computers smarter, artificial intelligence was born.
In this field, there are four foundational concepts to note: Artificial Intelligence, Machine Learning, Deep Learning, and Data Science. Although they are often confused with each other, they are not the same.
The term Artificial Intelligence was coined in the 1950s, and it refers to programs that enable machines to make decisions through reasoning and perform tasks that normally require human intelligence. A popular example is the self-driving car, such as those being developed by Tesla, which aims to display the capabilities of a seasoned driver. AI equips self-driving cars with the decision-making skills required to avoid obstructions and control the vehicle’s speed and direction with the aim of traveling from one point to another on a road.
Artificial intelligence is subdivided into narrow AI and general AI. A narrow AI program is trained to excel at only one task, and it can even outperform human beings at that task.
However, that same program cannot produce reasonable results in another task. For example, the AI program that beat the best chess grandmaster is not capable of performing language translation tasks.
On the other hand, an artificial general intelligence (AGI) system is capable of performing many different tasks excellently. Human beings are the perfect models of AGI, as we are capable of learning and excelling at different tasks, from playing board games to translating languages to driving vehicles and flying airplanes.
Next up is Machine Learning.
Inspired by the ability of young children to learn about objects through examples, machine learning refers to programs that can learn how to improve their performance on a task by studying examples.
When we were toddlers, our parents would point and tell us,
‘This person is a boy, and the other over there is a girl’
Thereafter, we were able to tell boys and girls apart by remembering the distinguishing characteristics of the first examples shown to us. The same technique applies to machine learning.
Let us consider a machine learning program that is trained to predict the prices of different types of houses.
The program is presented with data for several houses and their corresponding prices. This data could include things like the size of the house, its location, number of bedrooms, and so on.
After the AI program or model has been trained with the examples of houses and their prices, the model is tested by allowing it to predict the price of a new house given only information about its features. The trained model uses experience gathered from training to make a prediction of what this new house would cost.
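To make the house price example concrete, here is a minimal sketch in Python. It assumes a single feature (the size of the house) and fits a straight line to the examples using ordinary least squares. The house sizes, prices, and function names are all invented for illustration; a real model would use many more features and examples.

```python
# Hypothetical training examples: (size in square metres, price in thousands).
# These numbers are invented purely for illustration.
houses = [(50, 150), (80, 240), (100, 300), (120, 360)]


def fit_line(examples):
    """Training: fit price = slope * size + intercept by ordinary least squares."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    variance = sum((x - mean_x) ** 2 for x, _ in examples)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_price(size, slope, intercept):
    """Testing: predict the price of a house the model has never seen."""
    return slope * size + intercept


slope, intercept = fit_line(houses)
print(predict_price(90, slope, intercept))
```

The call to `fit_line` plays the role of training, and `predict_price` is the trained model making a prediction for a new house.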
Machine learning has three sub-categories:
- Supervised learning
- Unsupervised learning
- Reinforcement learning
Supervised learning refers to machine learning tasks that require training on labeled data, while unsupervised learning involves training a model on unlabeled data.
For example, on the right side of the image above, images of cats and dogs are labeled accordingly and can be used for supervised learning, whereas the pictures of animals on the left have no labels and can therefore be used for unsupervised learning.
A comparison between a supervised learning task and an unsupervised one reveals that the supervised task involves classifying or identifying entities according to their labels, while unsupervised learning identifies clusters of data points based on their features alone.
Supervised learning is employed in a classification task, where data points are separated into categories according to their labels or classes. In unsupervised learning, the data points are examined to reveal patterns or clusters, which usually represent useful information.
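The difference can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each animal is described by a single feature (its weight in kg), the values are invented, and the function names are hypothetical. The supervised part uses the labels to learn an average weight per class, while the unsupervised part (a one-dimensional k-means with two clusters) never sees a label.

```python
# Hypothetical data: animal weights in kg, invented for illustration.
labeled = [(3.5, "cat"), (4.2, "cat"), (5.0, "cat"),
           (20.0, "dog"), (25.0, "dog"), (30.0, "dog")]
unlabeled = [weight for weight, _ in labeled]  # same data with the labels removed


def train_centroids(examples):
    """Supervised: use the labels to compute the average weight of each class."""
    totals, counts = {}, {}
    for weight, label in examples:
        totals[label] = totals.get(label, 0.0) + weight
        counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}


def classify(weight, centroids):
    """Assign the class whose average weight is closest to the new data point."""
    return min(centroids, key=lambda label: abs(weight - centroids[label]))


def two_means(values, steps=10):
    """Unsupervised: split values into two clusters using their features alone.

    Assumes steps >= 1 and that the values are not all identical.
    """
    low, high = min(values), max(values)  # initial guesses for the cluster centres
    for _ in range(steps):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return near_low, near_high


centroids = train_centroids(labeled)
print(classify(6.0, centroids))
print(two_means(unlabeled))
```

Notice that the clustering step finds two groups but cannot name them; attaching the names ‘cat’ and ‘dog’ requires labels.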
Supervised learning is used to solve two kinds of problems: classification and regression problems.
In regression, the task of the model is to predict the value of a continuous variable, which can take any of infinitely many values within a range. For example, what will the temperature be tomorrow? The answer could be any value within a continuous range.
In classification, the model predicts if a new data point belongs to one class or another. The number of classes in the task is always finite. For example, will the weather be hot or cold tomorrow? Temperatures within the hot range will be classified as hot weather and those in the cold range as cold weather.
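The difference between the two kinds of output can be shown with a toy Python sketch. Neither function below is a trained model: the extrapolation rule and the 20 °C threshold are assumptions made up for this illustration. The point is only that the regression-style function returns a number, while the classification-style function returns one of a finite set of classes.

```python
def predict_temperature(yesterday, today):
    """Regression-style output: a continuous number.

    Uses a naive rule (assume the change from yesterday repeats itself).
    """
    return today + (today - yesterday)


def classify_weather(temperature, threshold=20.0):
    """Classification-style output: one of a finite set of classes."""
    return "hot" if temperature >= threshold else "cold"


tomorrow = predict_temperature(yesterday=18.0, today=21.5)
print(tomorrow, classify_weather(tomorrow))
```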
Moving on to reinforcement learning.
Inspired by the action and reward principle, reinforcement learning involves allowing an agent to interact with its environment and then rewarding or punishing it according to its actions so that it learns how to consistently choose actions that lead to a reward.
If you have ever traveled to an area with poor reception, and you wish to make a phone call, it is a common practice to move around waving your phone in the air until you get enough network signal for the call to go through.
In this scenario, we resort to changing positions and checking the signal to see whether each change led to better reception. Using the phone’s signal strength as feedback, we continue to move around until we are satisfied. This is the underlying principle behind reinforcement learning systems. Robots are often trained using reinforcement learning techniques.
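The phone signal scenario can be sketched as a very small reinforcement learning problem in Python. This sketch uses one simple strategy, often called epsilon-greedy: the agent mostly returns to the best spot it knows, but occasionally tries a random spot in case it is better. The spots, signal values, and function names are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical average signal strength (in bars) at each spot; invented values.
TRUE_SIGNAL = {"kitchen": 1.0, "balcony": 4.0, "garden": 2.5}


def measure_signal(spot, rng):
    """Environment: return a noisy reward (signal strength) for standing at a spot."""
    return TRUE_SIGNAL[spot] + rng.uniform(-0.5, 0.5)


def learn_best_spot(steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-known spot, sometimes explore."""
    rng = random.Random(seed)
    spots = list(TRUE_SIGNAL)
    estimates = {spot: 0.0 for spot in spots}  # running average reward per spot
    visits = {spot: 0 for spot in spots}
    for step in range(steps):
        if step < len(spots):
            spot = spots[step]                    # try every spot once first
        elif rng.random() < epsilon:
            spot = rng.choice(spots)              # explore a random spot
        else:
            spot = max(spots, key=estimates.get)  # exploit the best-known spot
        reward = measure_signal(spot, rng)
        visits[spot] += 1
        # Incremental average: nudge the estimate toward the new reward.
        estimates[spot] += (reward - estimates[spot]) / visits[spot]
    return max(spots, key=estimates.get), estimates


best, estimates = learn_best_spot()
print(best, estimates)
```

The agent is never told which spot is best. It only receives the reward after each action and gradually learns which action to prefer.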
The next concept is deep learning.
I mentioned earlier that the perfect model of artificial general intelligence is the human being.
So, researchers decided to study the human brain and attempt to imitate the brain and its processing capabilities.
Deep learning is a subset of machine learning which makes use of techniques that imitate the neural system of the human brain.
The brain is made up of interconnected neurons which receive signals from the nerves present in every part of the body.
The most basic unit of this system is the biological neuron. It receives input signals through its dendrites, processes them in the cell body, and sends out a response through the axon. The artificial neuron was developed in a similar fashion, with input nodes, a processing center where the data is processed using functions, and an output node through which the result is sent out.
To provide a higher-level understanding of the working principle of the artificial neuron, I will use the analogy of learning how to bake a good cake.
If we are going to bake a cake, we will need ingredients like flour, butter, eggs, and sugar. These ingredients are our inputs.
As amateur bakers, we don’t know the right measurements of each ingredient required to give a perfect cake. However, we are willing to try experiments and learn from experience.
In addition to our ingredients, we go ahead and purchase a nice cake from an excellent bakery. This cake will serve as a reference for what our output should look like.
We are going to perform a supervised learning experiment: the ingredients are our input features, and the cake from the bakery serves as our label or desired output. Together, the ingredients and the cake form an example we can learn from.
For our first try, we put each of the ingredients into our mixing bowl using guessed measurements, which could be random.
We decide to use the following measurements for our initial try:
- 200 g of butter
- 200 g of flour
- 150 g of eggs
- 100 g of sugar
In deep learning, these measurements are called ‘weights’ and they control how much of each input gets processed.
Next, we start mixing the cake ingredients to form a batter. In an artificial neuron, we add the weighted inputs to form a sum.
Then the batter is placed in the pan and set in an oven to bake, during which the batter is transformed into cake. Similarly, the sum of the weighted inputs is sent through a function called an activation function and is transformed into the output. Once our cake is done, we can taste it and compare it with the cake from the bakery to get feedback about the success of the process.
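The mixing and baking steps of a single artificial neuron can be sketched in Python as follows. This is a minimal illustration, not a full implementation: the sigmoid is just one common choice of activation function, and the `bias` term is an extra adjustable number that the cake analogy does not cover. The function names and example values are hypothetical.

```python
import math


def sigmoid(z):
    """One common activation function: squashes any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias=0.0):
    """Mix: add up the weighted inputs. Bake: pass the sum through the activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(weighted_sum)


# Two inputs with guessed weights, as in the first attempt at the cake.
print(neuron_output([1.0, 2.0], [0.5, -0.25]))
```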
If our cake is below standard, then we agree there was some error in the measurements of the ingredients we used. Based on the feedback, we can deduce which measurements need adjusting. After making the adjustments, we can retry the process.
In deep learning, comparing the output with the desired outcome (the label) yields an error. This error is used as a feedback signal to adjust the weights of the inputs for a better outcome. The process of passing this error back through the network to adjust the weights is called backpropagation.
The experiment can be repeated until the output is close enough to our desired outcome. Each repetition is usually referred to as an iteration.
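Here is a minimal Python sketch of this try, compare, and adjust loop for a single neuron. To keep it short, it assumes a neuron with no activation function (the output is just the weighted sum) and a single training example. For one neuron, the feedback step reduces to the simple update rule shown below; backpropagation generalises the same idea to networks with many layers. The function name, the learning rate, and the example values are assumptions made for illustration.

```python
def train_neuron(inputs, target, learning_rate=0.1, iterations=20):
    """Repeatedly adjust the weights so the neuron's output approaches the target."""
    weights = [0.0 for _ in inputs]  # initial guessed measurements
    for _ in range(iterations):
        # Forward pass: the weighted sum of the inputs (no activation function here).
        output = sum(x * w for x, w in zip(inputs, weights))
        # Feedback: how far the output is from the desired outcome.
        error = target - output
        # Adjust each weight in proportion to the error and to its own input.
        weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
    return weights


inputs = [1.0, 2.0]
weights = train_neuron(inputs, target=1.0)
final_output = sum(x * w for x, w in zip(inputs, weights))
print(weights, final_output)
```

With these particular numbers the error halves on every iteration, so after 20 iterations the output is very close to the target.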
A deep neural network, which is the algorithm used for building deep learning models, consists of a large network of interconnected artificial neurons arranged in layers. Each layer in a neural network consists of neurons that usually focus on extracting a particular type of information from the data. Many computer vision and natural language processing solutions, like Google Image Search and Google Translate, were built using deep learning models.
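The idea of neurons arranged in layers can be sketched in Python as follows. This is a toy network with hand-picked weights, not a trained model: every number and function name is made up for illustration, and ReLU is assumed as the activation function for every layer, including the last one. The output of each layer becomes the input of the next.

```python
def relu(z):
    """A common activation function: keep positive values, replace negatives with 0."""
    return max(0.0, z)


def layer(inputs, weight_rows, biases):
    """One layer: each row of weights (plus one bias) defines one neuron."""
    return [relu(sum(x * w for x, w in zip(inputs, row)) + b)
            for row, b in zip(weight_rows, biases)]


def forward(inputs, layers):
    """Pass the data through the layers in order."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(activations, weight_rows, biases)
    return activations


# A tiny network: a first layer with two neurons, then a second layer with one neuron.
network = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 2.0]], [0.5]),
]
print(forward([2.0, 1.0], network))  # prints [4.5]
```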
The final concept we will cover is Data Science.
Data science is a discipline that utilizes machine learning, programming, and statistics to derive insights from all kinds of data. In data science, AI is a means to an end, usually involving the advancement of another discipline like marketing, business, or agriculture. Data science is popularly utilized by businesses and corporations to convert their data into insights that yield profits.
It is also important to note that:
- Machine learning is a subset of Artificial Intelligence.
- Deep learning is a subset of AI and Machine learning.
- Data Science cuts across Machine learning and Deep learning.
- Both deep learning and Data Science thrive on large amounts of data, often referred to as big data.
Common applications of AI can be found in computer vision, natural language processing, and speech processing. These areas focus on enabling the machine to process and understand images & video, text, and speech or audio data.
Examples include self-driving cars, which rely heavily on computer vision solutions like object detection and localization; chatbots and language translators, which rely on natural language processing; and stock prediction problems, which are solved using regression algorithms.
I have explained these concepts at a high level of abstraction. As you start researching and learning more about machine learning, you will be able to explore the many topics I did not cover.
Thank you so much for reading. I will see you in the next post, where I will discuss careers in AI.
If you have any questions or suggestions, please leave a comment below or contact me by email.
Meanwhile, I would like to recommend a few videos that will help solidify your understanding of this post and expose you to examples of applications of artificial intelligence. The videos are listed below.
- How does artificial intelligence learn? by Briana Brownell
- Artificial intelligence and law firms
- How AI is making it easier to diagnose disease by Pratik Shah