Everyone talks about AI these days, and marketing is no different. This blog post provides a basic overview of what really drives AI-based marketing and marketing automation built on predictive algorithms and machine learning.
Understanding the information in this blog is by no means necessary to use AI in your marketing department, but it may add some meat to the bones of the topic.
Knowledge is rarely a burden.
An Introduction To AI Algorithms
When we talk about AI, for the most part, we don’t mean real artificial intelligence like we see in the movies. In most cases, we mean machine learning, or perhaps predictive analytics.
Essentially, this is just a clever use of statistics. Artificial intelligence is broken into two categories: ‘Weak’ or ‘narrow’ AI, and ‘strong’ or ‘general’ AI.
- Weak AI generally refers to a machine learning solution dedicated to one specific task only, such as optimizing the price of a product or the send time of an email. This is not real intelligence, just statistical algorithms that can adapt their behavior as the data they are trained on changes.
- Strong AI, on the other hand, is a flexible, general-purpose system that can think and reason on its own, solving problems it hasn’t been trained to solve in advance. Such a system might even become aware of its own existence, just like a human being.
Only weak AI exists today, and everything discussed here is on that side of the divide.
That doesn’t mean these systems are less effective in performing their tasks; on the contrary, they can be incredibly impressive. However, you don’t need to worry about marketing AI systems taking over the world – at least not for now.
Even so, weak AI can still cause problems, such as affecting the outcome of public elections, as we learned through the Cambridge Analytica scandal and the company’s role in the 2016 US election.
For the remainder of this blog post, we will look only at weak AI solutions that offer self-learning algorithms to solve specific problems for which they have been developed. I also use the terms ‘AI’ and ‘machine learning’ interchangeably.
Key Terms in AI
Before we get to the main topic, let’s take a crash course in what AI is and how it works. While some readers will already know this, it’s best to have a firm footing before going into more detail. This section is the mile-high overview.
Three main technologies have together enabled artificial intelligence:
- Big data and data mining
- Predictive analytics
- Machine learning
But what are they?
Big Data, Predictive Analytics and Machine Learning
Big data means the collection and analysis of huge volumes of information, sometimes trillions of pieces. When examined properly using data-mining algorithms, this enables the detection of almost invisible trends and correlations.
We can, for example, detect patterns in historical data that correlate to certain types of credit card transactions being fraudulent.
Predictive analytics is about designing algorithms that can detect these patterns in future unknown data. For example, to detect if a current credit card transaction fits such patterns, and then determine if the transaction is likely fraudulent.
Machine learning refers to predictive analytics algorithms that can adapt to changing environmental conditions. These software solutions retrain themselves and learn to make even better predictions in the future.
In effect, they become self-optimizing, or self-learning. Plain predictive systems, by contrast, can only detect anomalies based on the patterns on which they were originally trained; if something changes in their environment, they are unable to adapt.
With machine learning, the prediction system is continually retrained as new data becomes available, so its predictions reflect the latest types of fraud seen in the real world.
Think of it this way: with machine learning, the software’s behavior is derived from historical data, a process called “training” the system. If new data becomes available later, we can retrain the system so that it adapts its behavior to changing conditions.
When a prediction system is retrained often (or even continuously), it continues to adapt its predictions to reflect changes in the outside world. Thus, while they may seem like magic, predictive and machine learning systems are driven by collected, real-world data.
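The train-then-retrain cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real fraud model: the `AmountAnomalyDetector` class, its three-standard-deviation cutoff, and all transaction amounts are hypothetical.

```python
from statistics import mean, stdev


class AmountAnomalyDetector:
    """Toy detector (hypothetical): flags amounts far above those seen in training."""

    def __init__(self, n_sigmas=3.0):
        self.n_sigmas = n_sigmas  # assumed cutoff, chosen only for illustration
        self.threshold = None

    def train(self, amounts):
        # "Training" here simply derives a threshold from the data provided.
        self.threshold = mean(amounts) + self.n_sigmas * stdev(amounts)

    def is_suspicious(self, amount):
        return amount > self.threshold


# Made-up historical transaction amounts.
history = [20, 25, 22, 30, 28, 24, 26, 27]
detector = AmountAnomalyDetector()
detector.train(history)
flagged_before = detector.is_suspicious(60)  # far above the old spending pattern

# Spending habits change, so the detector is retrained on newer (made-up) data.
recent = [50, 55, 60, 58, 52, 57, 61, 54]
detector.train(recent)
flagged_after = detector.is_suspicious(60)  # now inside the normal range

print(flagged_before, flagged_after)
```

The same amount is flagged before retraining and accepted afterwards, because the threshold is always derived from whatever data the detector was last trained on.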
How Do AI Algorithms Learn?
Predictive machine learning algorithms are commonly classified by how they learn. The three main types of learning are supervised, unsupervised, and reinforcement. What’s the difference?
- Supervised learning: the algorithms are trained to do a particular task using historical data.
- Unsupervised learning: insights are found using historical data, even if we don’t know exactly what we are looking for.
- Reinforcement learning: the algorithms are trained by positive or negative experiences, or in other words, using trial and error.
Similarly, AI algorithms are often classified by what they do:
- Classification algorithms predict one of several possibilities, e.g. to determine if a customer is likely to buy a specific product or not.
- Regression algorithms predict a numeric value in any range, e.g. to determine the best price of a product.
- Clustering algorithms predict group similarity, e.g. to find segments of your leads and customers with similar attributes.
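To make the three kinds of output concrete, here is a small sketch using only the Python standard library. The customer data and the function names `classify`, `regress`, and `cluster` are made up for illustration, and the methods chosen (nearest neighbor, a least-squares line, and a one-dimensional k-means) are just simple stand-ins for each family.

```python
from statistics import mean

# Hypothetical customers: (site visits, amount spent, bought the premium plan)
customers = [
    (2, 20.0, False),
    (3, 35.0, False),
    (4, 40.0, False),
    (10, 110.0, True),
    (12, 115.0, True),
    (11, 125.0, True),
]


def classify(visits):
    """Classification: predict a label by copying the most similar customer."""
    nearest = min(customers, key=lambda c: abs(c[0] - visits))
    return nearest[2]


def regress(visits):
    """Regression: predict a number (expected spend) from a least-squares line."""
    xs = [c[0] for c in customers]
    ys = [c[1] for c in customers]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar + slope * (visits - x_bar)


def cluster(iterations=10):
    """Clustering: split customers into two segments by number of visits."""
    xs = [c[0] for c in customers]
    centers = [min(xs), max(xs)]  # start from the two extremes
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        centers = [mean(g) for g in groups]
    return groups


print(classify(9))   # a label: True or False
print(regress(8))    # a number: expected spend
print(cluster())     # groups of similar customers
```

Note the difference in what comes back: a label, a number, and a set of groups. That difference in output is what separates the three families, more than the math inside them.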
Different Algorithms
There are many different algorithms, each suited to different types of problems. This blog post will not outline them all, but it mentions a few and explains what they do. These algorithms will be covered in detail in separate blog posts.
Without further ado, here is a popular-science explanation of data science and some of the more common algorithms used in predictive marketing and machine learning.
Data Science and Technical Terms
In the field of data science, there can be an overwhelming number of technical terms. Some are hard to get your head around, even for analysts! However, most of them are easily understood after some basic concepts are explained.
The fundamental focus of all data analysis is to study how variables are related to each other and how this relationship can be leveraged to predict outcomes. Let’s start at the beginning:
- What is a variable?
- What does it mean for two variables to have a relationship?
- What is an observation in data science?
Before diving into the more data-oriented terms, let’s consider the term ‘observation’. When we collect data for analysis in marketing, we observe how our customers behave.
Observations
We will structure our collected data so the information about one customer has its own row where the characteristics (or attributes) are in different columns. Because every row in our data contains information about a single observed customer, every row is called an observation.
To put it in one sentence, one observed customer equals one observation. Imagine an Excel sheet with rows and columns to get a sense of how this is organized.
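In code, the same row-per-customer layout might look like the sketch below. The field names and values are invented for illustration; a real dataset would have its own columns.

```python
# Hypothetical customer table: each dict is one row, i.e. one observation.
observations = [
    {"customer_id": 1, "age": 34, "country": "SE", "orders": 5},
    {"customer_id": 2, "age": 51, "country": "DE", "orders": 2},
    {"customer_id": 3, "age": 27, "country": "US", "orders": 9},
]

n_observations = len(observations)   # one per observed customer
attributes = list(observations[0])   # the column names

print(n_observations, attributes)
```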
Variables
A variable contains information about one attribute of all the observations (customers) being studied, described by one column in the table. If we want to study how fast a sports car can round a track, we will collect data about many separate cars driving that track, or maybe even data about cars on other similar tracks.
Since not all sports cars are exactly the same, we cannot simply measure the time it takes for one car to do the loop. We also want to collect data about which type of tires the cars used, what kind of engine they have, and more. These are attributes of each individual car, and for our purposes, they are the variables we use when trying to explain how long it takes a car to cover the track.
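If each car is a row, then a variable is one column read across all rows. A minimal sketch, with made-up cars and a hypothetical helper named `variable`:

```python
# Made-up observations: one row per car.
cars = [
    {"car": "A", "tires": "soft", "engine_hp": 450, "lap_time_s": 92.0},
    {"car": "B", "tires": "hard", "engine_hp": 400, "lap_time_s": 95.0},
    {"car": "C", "tires": "soft", "engine_hp": 500, "lap_time_s": 90.0},
]


def variable(rows, name):
    """Return one variable: the values of a single column across all rows."""
    return [row[name] for row in rows]


engine_hp = variable(cars, "engine_hp")
tires = variable(cars, "tires")

print(engine_hp, tires)
```

Variables can be numeric (engine power) or categorical (tire type); both are simply columns in the table.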
Relationships
A relationship between variables is how they are associated with each other. A sports car’s time around the track will be dependent on which engine it uses, for example. This means there is a relationship between the time it takes to finish the track and the engine variable.
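One common way to put a number on such a relationship is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below assumes engine power in horsepower as a numeric stand-in for "which engine", and the lap times are made up.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: strength of the linear association between xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: more powerful engines, shorter lap times.
engine_hp = [300, 350, 400, 450, 500]
lap_time_s = [100, 97, 95, 92, 90]

r = pearson_r(engine_hp, lap_time_s)
print(round(r, 3))
```

A value close to -1 here means lap time falls almost in a straight line as power rises. Correlation only captures linear association, and it says nothing on its own about cause and effect.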
When analyzing the relationship between variables, we aim to set up models based on the data we have collected. We use those models to explain the relationship between variables in more detail. Simply put, a model is like a summary of how variables affect an outcome.
Consider a connect-the-dot exercise for kids.
Each dot is a variable, and the lines children draw between them are the relationships. The model, then, is the shape that begins to emerge. The more dots are connected, the clearer the image becomes, and the more easily we can predict what the drawing represents.
The model changes if dots are closer or farther apart, or grouped in different ways. Understanding the relationships between variables (dots) then helps us to explain them or even predict what will happen next.
Obviously, this is a simplification and real modeling is far more complex; however, it is sufficient for our needs here.
Models can take many forms and can be used for predicting an outcome or to provide context and understanding of current data. When making a model, the variables used are classed as either dependent or independent.
Dependent And Independent Variables
These terms are most often used when studying statistics, but you might hear other terms used for the same concepts. For example, in machine learning, we often call them output variables and input variables.
Let’s look at what some of these mean.
As the terms indicate, a dependent variable relies on or is tied to some other variable. This means that it changes if one or more other variables change. Take business revenue, for example, which is dependent on the number of units sold.
The opposite of dependent is independent. Just as it sounds, an independent variable is not tied to other variables.
Changes in an independent variable can trigger a change in a dependent variable, but the reverse is not true. For example, if wrinkles are dependent on time passing, then time is the independent variable that can be used to explain wrinkles.
However, time will not stop passing just because you don’t get wrinkles. Therefore, a dependent (or output) variable can be explained or predicted using a set of independent (or input) variables. Which variables are classed as dependent or independent is decided based on the assumptions used to create the model.
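The point that the split depends on modeling assumptions can be shown with a small sketch. The sales figures and the helper `split_inputs_output` are hypothetical; the same table is divided two different ways depending on what we decide to predict.

```python
# Made-up sales table: one row per product.
rows = [
    {"price": 10.0, "units_sold": 120, "revenue": 1200.0},
    {"price": 12.0, "units_sold": 100, "revenue": 1200.0},
    {"price": 15.0, "units_sold": 70, "revenue": 1050.0},
]


def split_inputs_output(rows, output_name):
    """Separate the dependent (output) variable from the independent (input) ones."""
    inputs = [{k: v for k, v in row.items() if k != output_name} for row in rows]
    output = [row[output_name] for row in rows]
    return inputs, output


# Assumption 1: revenue is the outcome, explained by price and units sold.
X_revenue, y_revenue = split_inputs_output(rows, "revenue")

# Assumption 2: units sold is the outcome (a demand model).
X_units, y_units = split_inputs_output(rows, "units_sold")

print(y_revenue, y_units)
```

Nothing in the data itself marks a column as dependent or independent; that choice is made by whoever builds the model.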
Outliers
Another common term in the world of data science is ‘outlier,’ which is an extreme observation.
If you score a hole in one when playing golf, it probably doesn’t reflect how you normally play. Therefore, a hole in one is an outlier to the other normal results.
We may not want to take this score into account when predicting your score for the next round, because it would pull the prediction lower than what you would normally shoot and therefore make it less accurate.
Similarly, you probably don’t want to include outliers when studying sales metrics. If a news story covers one of your products, resulting in a big but temporary jump in sales, that should be treated as an anomaly rather than what could normally be assumed in other years.
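One simple way to flag such observations is to measure how far each value sits from the median, relative to the typical distance from the median. The sketch below is one possible rule among many: the weekly sales numbers are made up, and the cutoff of 5 is an arbitrary assumption that would need tuning on real data.

```python
from statistics import mean, median


def split_outliers(values, cutoff=5.0):
    """Split values into (normal, outliers) using distance from the median."""
    centre = median(values)
    # Median absolute deviation: the typical distance from the median.
    mad = median([abs(v - centre) for v in values])
    normal = [v for v in values if abs(v - centre) <= cutoff * mad]
    outliers = [v for v in values if abs(v - centre) > cutoff * mad]
    return normal, outliers


# Hypothetical weekly sales, with one spike after a news story.
weekly_sales = [100, 105, 98, 102, 110, 97, 400, 103]
normal, outliers = split_outliers(weekly_sales)

print(outliers)
print(mean(weekly_sales), mean(normal))
```

The single spike drags the average of all weeks well above what a typical week looks like; leaving it out gives a baseline closer to normal sales.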
Overfitting
The last important term to understand is overfitting. Overfitting occurs when an analysis corresponds too closely to one particular set of data, and avoiding it is one of the hardest and most crucial tasks in machine learning.
Why would it be bad to have a model fit the data too closely?
The issue is that when a model is created to explain data, it is trained using a selected set of data points. If the model fits that data too closely, it’s unlikely to work as well when a new, larger set of data is introduced.
Errors on the training data are reduced by using a more complex model (one that can describe more aspects of the relationships between variables), but as the complexity increases, the model risks becoming less and less general. This can cause predictions based on new observations to be wrong.
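A toy comparison can make this trade-off visible. In the sketch below, all numbers are made up: the training data follows a rough trend of y = 2x with noise, and the test points are placed on that trend for clarity. A "memorizer" that simply returns the nearest training value stands in for an overly flexible model, and a straight line stands in for a simpler one.

```python
from statistics import mean

# Made-up data: noisy training points, test points on the underlying trend.
train_x = [1, 2, 3, 4, 5, 6]
train_y = [2.5, 3.6, 6.4, 7.7, 10.3, 11.8]
test_x = [1.5, 2.5, 3.5, 4.5, 5.5]
test_y = [3.0, 5.0, 7.0, 9.0, 11.0]


def memorizer(x):
    """Overly flexible model: returns the y of the closest training point."""
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[i]


def fit_line(xs, ys):
    """Simpler model: a least-squares straight line."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mean_abs_error(model, xs, ys):
    return mean([abs(model(x) - y) for x, y in zip(xs, ys)])


line = fit_line(train_x, train_y)

memorizer_train = mean_abs_error(memorizer, train_x, train_y)
memorizer_test = mean_abs_error(memorizer, test_x, test_y)
line_train = mean_abs_error(line, train_x, train_y)
line_test = mean_abs_error(line, test_x, test_y)

print("memorizer:", memorizer_train, memorizer_test)
print("line:     ", line_train, line_test)
```

The memorizer is perfect on the data it was trained on but does worse than the line on new points, because it has learned the noise along with the trend. That gap between training error and error on new data is the signature of overfitting.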
While not necessary for using AI in marketing, an overview of the subject never hurts. Marketing automation is becoming increasingly AI-powered, and the tasks of CMOs and digital marketers are becoming much more data-driven. Marketing departments need entirely new skills these days, including advanced marketing automation, AI and data science.
Specific Algorithms
In coming blog posts, the following algorithms will be explained in more detail at a popular-science level: