What is channel attribution? Channel attribution modeling

Introduction

In a typical customer journey 'from thinking to buying', a customer passes through multiple touchpoints before settling on the final product to purchase. This is even more pronounced in e-commerce sales, where it is relatively easy to track the different touchpoints a customer encountered before making the final purchase.

[Image: attribution. Source: MarTech Today]

As marketing becomes increasingly consumer-driven, identifying the right channels for reaching target customers has become critical for businesses. It helps companies optimize their marketing spend and reach the right customers in the right places.

More often than not, companies invest in the last channel that customers encounter before making the final purchase. However, this may not always be the right approach. Several channels precede that last one and eventually drive the customer's conversion. The underlying concept for studying this behavior is known as the "multi-channel attribution model".

In this article, we look at what channel attribution is and how it relates to the concept of Markov chains. We will also work through a case study of an e-commerce company to understand how this concept works, both theoretically and practically (using R).

Table of Contents

  1. What is channel attribution?
    • Markov chains
    • Removal effect
  2. Case study of an e-commerce company
  3. Implementation using R

What is channel attribution?

Google Analytics offers a standard set of rules for attribution modeling. According to Google, "An attribution model is the rule, or set of rules, that determines how credit for sales and conversions is assigned to touchpoints in conversion paths. For example, the Last Interaction model in Analytics assigns 100% credit to the final touchpoints (i.e., clicks) that immediately precede sales or conversions. In contrast, the First Interaction model assigns 100% credit to touchpoints that initiate conversion paths."

We will see the Last Interaction model and the First Interaction model later in this article. Before that, let's take a small example to understand channel attribution a little better. Say we have a transition diagram as shown below:

[Figure: transition diagram for channels C1, C2 and C3]

In the above scenario, a customer can start their journey through either channel 'C1' or channel 'C2'. The probability of starting with C1 or C2 is 50% (or 0.5) each. Let's first calculate the overall probability of conversion and then look at the effect of each channel.

P (conversion) = P (C1 -> C2 -> C3 -> Conversion) + P (C2 -> C3 -> Conversion)

= 0.5 * 0.5 * 1 * 0.6 + 0.5 * 1 * 0.6
= 0.15 + 0.3
= 0.45

Markov chains

A Markov chain is a process that maps movement between states and gives the probability of going from one state to another. A Markov chain is defined by three properties:

  • State space – the set of all states in which the process could potentially exist
  • Transition operator – the probability of moving from one state to another
  • Current state probability distribution – the probability distribution of being in any one of the states at the start of the process

We know the states we can pass through, the probability of moving from one state to the next, and the current state. This looks like a Markov chain, doesn't it?
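The three ingredients can be written down directly for the small C1/C2/C3 example. The following is a minimal sketch in base R, not part of the original analysis: the `(start)`, `(conversion)` and `(null)` state names are illustrative, and the drop-out probabilities out of C1 (0.5) and C3 (0.4) are assumptions, chosen so that every row sums to 1 and the path probabilities match the hand calculation above.

```r
# State space: the three channels plus illustrative start/end states
states <- c("(start)", "C1", "C2", "C3", "(conversion)", "(null)")

# Transition operator: P[i, j] = probability of moving from state i to state j
P <- matrix(0, nrow = length(states), ncol = length(states),
            dimnames = list(states, states))
P["(start)", c("C1", "C2")] <- 0.5
P["C1", c("C2", "(null)")] <- 0.5                    # assumption: the other half of C1 drops out
P["C2", "C3"] <- 1
P["C3", c("(conversion)", "(null)")] <- c(0.6, 0.4)  # assumption: 0.4 drops out after C3
P["(conversion)", "(conversion)"] <- 1               # absorbing state
P["(null)", "(null)"] <- 1                           # absorbing state

# Initial distribution: every journey begins in "(start)"
dist <- setNames(numeric(length(states)), states)
dist["(start)"] <- 1

# Push the distribution forward; 4 steps cover the longest path in this example
for (step in 1:4) {
  dist <- drop(dist %*% P)
}
p_conversion <- dist[["(conversion)"]]
print(p_conversion)
```

Under these assumptions `p_conversion` agrees with the 0.45 obtained by summing the two converting paths by hand.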

Removal effect

This is, in fact, an application of Markov chains. We will come back to that later; let's continue with our example for now. To find the contribution of channel C1 to the customer's journey from start to conversion, we use the principle of the removal effect. The removal effect principle says that we can measure the contribution of each channel by removing it from the model and seeing how many conversions still happen without it.

For instance, suppose we have to calculate the contribution of channel C1. We remove channel C1 from the model and compare the conversions that happen without C1 in the picture against the total conversions when all channels are intact. Let's calculate for channel C1:

[Figure: transition diagram with channel C1 removed]

P (conversion after removing C1) = P (C2 -> C3 -> Conversion)

= 0.5 * 1 * 0.6

= 0.3

So 30% of customer journeys can convert without channel C1 in place, while with C1 intact 45% of journeys convert. The removal effect is the share of conversions that is lost when the channel is removed, so the removal effect of C1 is

1 - 0.3 / 0.45 = 0.333.

The removal effect of C2 and C3 is 1 (you can try calculating it, but think about it intuitively: if we remove C2 or C3, can any conversion be completed at all?).
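The removal-effect calculation can be sketched in base R for the same small example. This is an illustrative sketch rather than the method used by any particular package: the helper names `conversion_prob` and `remove_channel` are hypothetical, the 0.5 and 0.4 drop-out probabilities are assumptions that make each row sum to 1, and the removal effect is taken to be the share of conversions lost, 1 - P(conversion without the channel) / P(conversion with all channels).

```r
states <- c("(start)", "C1", "C2", "C3", "(conversion)", "(null)")
P <- matrix(0, nrow = length(states), ncol = length(states),
            dimnames = list(states, states))
P["(start)", c("C1", "C2")] <- 0.5
P["C1", c("C2", "(null)")] <- 0.5                    # assumption: half of C1 drops out
P["C2", "C3"] <- 1
P["C3", c("(conversion)", "(null)")] <- c(0.6, 0.4)  # assumption: 0.4 drops out
P["(conversion)", "(conversion)"] <- 1
P["(null)", "(null)"] <- 1

# Probability of ending in "(conversion)" when starting from "(start)"
conversion_prob <- function(P, n_steps = 10) {
  dist <- setNames(numeric(nrow(P)), rownames(P))
  dist["(start)"] <- 1
  for (step in seq_len(n_steps)) dist <- drop(dist %*% P)
  dist[["(conversion)"]]
}

# Remove a channel by redirecting every transition into it to "(null)"
remove_channel <- function(P, channel) {
  P[, "(null)"] <- P[, "(null)"] + P[, channel]
  P[, channel] <- 0
  P
}

p_all <- conversion_prob(P)
channels <- c("C1", "C2", "C3")
p_without <- sapply(channels, function(ch) conversion_prob(remove_channel(P, ch)))

# Share of conversions lost when each channel is removed
removal_effect <- 1 - p_without / p_all
print(round(removal_effect, 3))
```

Removing C1 leaves the C2 -> C3 path intact, so only part of the conversions is lost, whereas removing C2 or C3 cuts every converting path.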

This is a very useful application of Markov chains. In the above case, all the channels – C1, C2, C3 (at different stages) – are called transition states, while the probability of moving from one channel to another is called the transition probability.

The customer journey, which is a sequence of channels, can be thought of as a chain in a directed Markov graph where each vertex is a state (channel / touchpoint) and each edge represents the transition probability of moving from one state to another. Since the probability of reaching a state depends only on the previous state, it can be considered a memoryless Markov chain.

Case study of an e-commerce company

Let's take a real-life case study and see how we can implement channel attribution modeling.

An e-commerce company conducted a survey and collected data from its customers. This can be considered a representative population. In the survey, the company collected data on the various touchpoints that customers visit before finally purchasing the product on its website.

In total, there are 19 channels where customers can encounter the product or a product advertisement. After the 19 channels, there are three more cases:

  • # 20 – the customer has decided which device to buy;
  • # 21 – the customer has made the final purchase and;
  • # 22 – the customer has not decided yet.

The general categories of channels are as follows:

Category                             Channel
Website (1, 2, 3)                    Company website or competitor website
Research reports (4, 5, 6, 7, 8)     Industry advisory research reports
Online reviews (9, 10)               Organic searches, forums
Price comparison (11)                Aggregators
Friends (12, 13)                     Social network
Expert (14)                          Online or offline expert
Shops (15, 16, 17)                   Physical stores
Misc. (18, 19)                       Others, such as promotional campaigns at various locations

Now, we need to help the e-commerce company identify the right strategy for investing in marketing channels. Which channels should it focus on? Which channels should it invest in? We will work this out using R in the next section.

Implementation using R

Let's go ahead and try the implementation in R and verify the results. You can download the dataset here and follow as we go.

#Install the libraries
install.packages("ChannelAttribution")
install.packages("ggplot2")
install.packages("reshape")
install.packages("plyr")
install.packages("dplyr")
install.packages("reshape2")
install.packages("markovchain")
install.packages("plotly")

#Load the libraries
#(plyr provides ddply(), used below; it is loaded before dplyr to avoid masking problems)
library("ChannelAttribution")
library("ggplot2")
library("reshape")
library("plyr")
library("dplyr")
library("reshape2")
library("markovchain")
library("plotly")

#Read the data into R
> channel = read.csv("Channel_attribution.csv", header = T)
> head(channel)

Output:

R05A.01  R05A.02  R05A.03  R05A.04  …  R05A.18  R05A.19  R05A.20
16       4        3        5        …  NA       NA       NA
2        1        9        10       …  NA       NA       NA
9        13       20       16       …  NA       NA       NA
8        15       20       21       …  NA       NA       NA
16       9        13       20       …  NA       NA       NA
1        11       8        4        …  NA       NA       NA

We will do some data processing to bring the data to a stage where we can use it as input to the model. Then we will identify which customer journeys have reached the final conversion (in our case, all the journeys have reached the final conversion state).

We will create a 'path' variable in a specific format that can be fed to the model as input. We will also count the total occurrences of each path using the 'plyr' package.

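> channel$convert = 0   # create the flag column up front so the loop can fill it row by row
> for(row in 1:nrow(channel))
{
  # flag journeys that contain state 21 (final purchase)
  if(21 %in% channel[row,]){channel$convert[row] = 1}
}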
> column = colnames(channel)
> channel$path = do.call(paste, c(channel[column], sep = " > "))
> head(channel$path)
[1] "16 > 4 > 3 > 5 > 10 > 8 > 6 > 8 > 13 > 20 > 21 > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > 1"     

[2] "2 > 1 > 9 > 10 > 1 > 4 > 3 > 21 > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > 1"     

[3] "9 > 13 > 20 > 16 > 15 > 21 > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > 1"

[4] "8 > 15 > 20 > 21 > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > 1"

[5] "16 > 9 > 13 > 20 > 21 > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > 1"

[6] "1 > 11 > 8 > 4 > 9 > 21 > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > NA > 1"

> for(row in 1:nrow(channel))
{
  channel$path[row] = strsplit(channel$path[row], " > 21")[[1]][1]
}
> channel_fin = channel[,c(23,22)]
> channel_fin = ddply(channel_fin,~path,summarise, conversion= sum(convert))
> head(channel_fin)

Output:

path                           conversion
1 > 1 > 1 > 20                 1
1 > 1 > 12 > 12                1
1 > 1 > 14 > 13 > 12 > 20      1
1 > 1 > 3 > 13 > 3 > 20        1
1 > 1 > 3 > 17 > 17            1
1 > 1 > 6 > 1 > 12 > 20 > 12   1

> Data = channel_fin
> head(Data)

Output:

path                           conversion
1 > 1 > 1 > 20                 1
1 > 1 > 12 > 12                1
1 > 1 > 14 > 13 > 12 > 20      1
1 > 1 > 3 > 13 > 3 > 20        1
1 > 1 > 3 > 17 > 17            1
1 > 1 > 6 > 1 > 12 > 20 > 12   1
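The path-building step can also be sketched without pasting the NA padding into the string and splitting it off afterwards. The following base R sketch is an alternative under stated assumptions, not the code used in this article: the toy data frame, its column names `t1` to `t4` and the helper `build_path` are hypothetical, and state 21 is treated as the final purchase, as in the case study.

```r
# Hypothetical toy journeys: one row per customer, NA-padded on the right
toy <- data.frame(
  t1 = c(16, 2, 9, 16),
  t2 = c(4, 1, 13, 4),
  t3 = c(21, 9, 20, 21),
  t4 = c(NA, 21, 21, NA)
)

# Keep the touchpoints before state 21 and join them with " > "
build_path <- function(touches) {
  touches <- touches[!is.na(touches)]          # drop the NA padding
  stop_at <- match(21, touches)                # position of the final purchase, if any
  if (!is.na(stop_at)) touches <- touches[seq_len(stop_at - 1)]
  paste(touches, collapse = " > ")
}

toy_paths   <- apply(toy, 1, build_path)
toy_convert <- apply(toy, 1, function(x) as.integer(21 %in% x))

# Count conversions per distinct path (the role ddply() plays in the article)
toy_fin <- data.frame(path = toy_paths, conversion = toy_convert)
toy_agg <- aggregate(conversion ~ path, data = toy_fin, FUN = sum)
print(toy_agg)
```

Here the first and last toy journeys share the path "16 > 4", so they collapse into a single row with two conversions.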

Now we will build a heuristic model and a Markov model, combine the two, and then check the final results.

> H <- heuristic_models(Data, 'path', 'conversion', var_value ="conversion")
> H

Output:

channel_name  first_touch_conversions  …  linear_touch_conversions  linear_touch_value
1             130                      …  73.773661                 73.773661
20            0                        …  473.998171                473.998171
12            75                       …  76.127863                 76.127863
14            34                       …  56.335744                 56.335744
13            320                      …  204.039552                204.039552
3             168                      …  117.609677                117.609677
17            31                       …  76.583847                 76.583847
6             50                       …  54.707124                 54.707124
8             56                       …  53.677862                 53.677862
10            547                      …  211.822393                211.822393
11            66                       …  107.109048                107.109048
16            111                      …  156.049086                156.049086
2             199                      …  94.111668                 94.111668
4             231                      …  250.784033                250.784033
7             26                       …  33.435991                 33.435991
5             62                       …  74.900402                 74.900402
9             250                      …  194.07169                 194.07169
15            22                       …  65.159225                 65.159225
18            4                        …  5.026587                  5.026587
19            10                       …  12.676375                 12.676375
> M <- markov_model(Data, 'path', 'conversion', var_value ="conversion", order = 1)
> M

Output:

channel_name  total_conversion  total_conversion_value
1             82.482961         82.482961
20            432.40615         432.40615
12            83.942587         83.942587
14            63.08676          63.08676
13            195.751556        195.751556
3             122.973752        122.973752
17            83.866724         83.866724
6             63.280828         63.280828
8             61.016115         61.016115
10            209.035208        209.035208
11            118.563707        118.563707
16            158.692238        158.692238
2             98.067199         98.067199
4             223.709091        223.709091
7             41.919248         41.919248
5             81.865473         81.865473
9             179.483376        179.483376
15            70.360777         70.360777
18            5.950827          5.950827
19            15.545424         15.545424
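To make the idea behind a first-order Markov model more concrete, the transition probabilities can be estimated from path strings by counting how often one state follows another. The base R sketch below is only an illustration of that counting step and is not how the ChannelAttribution package is implemented: the three toy paths, their conversion counts and the `(start)` / `(conversion)` state names are made up, and only converting journeys are considered, as in the case study.

```r
# Hypothetical toy input in the same "a > b > c" format used for the model
paths <- c("1 > 2 > 3", "2 > 3", "1 > 3")
conversions <- c(2, 1, 1)

# One row per observed transition, weighted by the number of conversions on that path
edges <- do.call(rbind, lapply(seq_along(paths), function(i) {
  visited <- c("(start)", strsplit(paths[i], " > ", fixed = TRUE)[[1]], "(conversion)")
  n <- length(visited)
  data.frame(from = visited[-n], to = visited[-1], weight = conversions[i])
}))

# Weighted transition counts, then row-normalise to get transition probabilities
counts <- xtabs(weight ~ from + to, data = edges)
trans_prob <- prop.table(counts, margin = 1)
print(round(trans_prob, 3))
```

Each row of `trans_prob` sums to 1 and plays the role of the transition operator; with `order = 1` the next state depends only on the current one.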

Before moving on, let's first understand what some of the terms we have seen above mean.

First touch conversion: the conversion that occurs through the channel when that channel is the first touchpoint for a customer. 100% of the credit goes to the first touchpoint.

Last touch conversion: the conversion that occurs through the channel when that channel is the last touchpoint for a customer. 100% of the credit goes to the last touchpoint.

Linear touch conversion: all the channels / touchpoints share the credit for the conversion equally.
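The three rules can be illustrated on a handful of made-up paths. The base R sketch below is only meant to show how the credit is split: the toy paths, conversion counts and the helper `credit` are hypothetical, and the linear rule here simply gives every touchpoint in a path an equal share, which may differ in detail from how `heuristic_models` treats repeated channels.

```r
# Hypothetical toy input: three paths and the conversions observed on each
paths <- c("1 > 2 > 3", "2 > 3", "1 > 3")
conversions <- c(2, 1, 1)

touches <- strsplit(paths, " > ", fixed = TRUE)
channels <- sort(unique(unlist(touches)))

# Distribute each path's conversions over its touchpoints using a weighting rule
credit <- function(weights_fun) {
  out <- setNames(numeric(length(channels)), channels)
  for (i in seq_along(touches)) {
    w <- weights_fun(length(touches[[i]]))      # one weight per touchpoint, summing to 1
    for (j in seq_along(touches[[i]])) {
      ch <- touches[[i]][j]
      out[ch] <- out[ch] + conversions[i] * w[j]
    }
  }
  out
}

first_touch  <- credit(function(n) c(1, rep(0, n - 1)))   # all credit to the first touchpoint
last_touch   <- credit(function(n) c(rep(0, n - 1), 1))   # all credit to the last touchpoint
linear_touch <- credit(function(n) rep(1 / n, n))         # equal credit to every touchpoint

print(rbind(first_touch, last_touch, linear_touch))
```

Whatever the rule, the credits add up to the total number of conversions; only their distribution across channels changes.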

Going back to R code, let's merge the two models and represent the result in a visually appealing way that is easier to understand.

# Merges the two data frames on the "channel_name" column.
R <- merge(H, M, by='channel_name')

# Select only relevant columns
R1 <- R[, (colnames(R) %in% c('channel_name', 'first_touch_conversions', 'last_touch_conversions', 'linear_touch_conversions', 'total_conversion'))]

# Transforms the dataset into a data frame that ggplot2 can use to plot the outcomes
R1 <- melt(R1, id='channel_name')
# Plot the total conversions
ggplot(R1, aes(channel_name, value, fill = variable)) +
  geom_bar(stat="identity", position='dodge') +
  ggtitle('TOTAL CONVERSIONS') +
  theme(axis.title.x = element_text(vjust = -2)) +
  theme(axis.title.y = element_text(vjust = +2)) +
  theme(title = element_text(size = 16)) +
  theme(plot.title=element_text(size = 20)) +
  ylab("")

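[Figure: bar chart 'TOTAL CONVERSIONS' comparing first touch, last touch, linear touch and Markov total conversions for each channel]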

The picture is clearly visible in the chart above. From the first touch conversion perspective, channels 10, 13, 2, 4 and 9 are quite important, while from the last touch perspective, channel 20 is the most important (in our case it should be, because it is the point at which the customer has decided which product to buy). In terms of linear touch conversion, channels 20, 4 and 9 come out as important. From the total conversions perspective, channels 10, 13, 20, 4 and 9 are quite important.

Final notes

From the chart above we have been able to work out which channels are important to focus on and which can be discarded or ignored. This case gives us a very good idea of how Markov chain models can be applied in the customer analytics space. E-commerce companies can now build their marketing strategy and distribute their marketing budget with more confidence, using data-driven insights.

Author bio:

This article was contributed by Perceptive Analytics. Chaitanya Sagar, Prudhvi Potuganti and Saneesh Veetil contributed to this article.

Perceptive Analytics provides data analytics, data visualization, business intelligence and reporting services to the e-commerce, retail, healthcare and pharmaceutical industries. Our client roster includes Fortune 500 and NYSE-listed companies in the USA and India.
