
In our latest webinar, I was joined by Jim Sterne of eMetrics Summit and the Digital Analytics Association to cover a topic that’s gaining more and more traction in marketing and analytics. Artificial intelligence and machine learning are already impacting consumers and marketers, possibly more than you know.

This session was aimed at providing context so you can better understand how these applications fit into your world, and the world around you. Jim and I shared some practical, achievable ways to lift your marketing efforts through AI & machine learning – some of which you may already be doing. It’s proving to be an important part of the toolkit in this digital economy and one that is solving real-world challenges.

If you missed the live session, check out what our audience was eager to know.

Q: How can I use machine learning for customer segmentation and how is this different than other methods?

JS: This is the thing that machines are best at: determining how specific things are alike. For example: Here is everything I know about my customers. What do my best customers have in common? They might all show up at this time of day, or they all stick around for this long, or they all live in a certain zip code. This is important to understand: the models that the machines are building are not logical, they’re mathematical. So they are segmenting and grouping based on signals that we wouldn’t even consider and might not think are important. And yet, they’re able to group people together. So if you have a segment of who your best customers are, you can go out into the world and find look-alikes for them. This is where we get into supervised and unsupervised machine learning.

CB: Unsupervised learning is where you are standing back. It’s a data-mining technique where you’re looking for patterns that already exist in the data. The way you would do that is through clustering, which is really just a calculation. For individuals, for example, the calculation could be the distance between your customers across the attributes you have on them. You then look for customers who are relatively close together and call that a cluster. That would be an unsupervised technique.
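To make the "distance between your customers" idea concrete, here is a minimal sketch of clustering in plain Python. It uses k-means, which is only one of several clustering methods and not necessarily the one we use in practice. The customer attributes (visits per month, average order value) and the numbers are made up for illustration.

```python
import math
import random

def euclidean(a, b):
    """Straight-line distance between two customers' attribute tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iterations=20, seed=0):
    """Group points into k clusters: assign each point to its nearest
    center, move each center to the mean of its members, and repeat."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen customers
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: euclidean(p, centers[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centers

# Hypothetical customers: (visits per month, average order value in dollars)
customers = [(2, 20), (3, 25), (2, 22), (12, 180), (11, 200), (13, 190)]
labels, centers = kmeans(customers, k=2)
print(labels)
```

Nobody tells the algorithm which customers belong together; the two groups fall out of the distances alone. In a real project you would also scale the attributes first, so that a dollar amount does not drown out a visit count.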

For a supervised technique, what you’re saying is “these are customers that I am certain have purchased in the past.” They are customers that you can use to train an algorithm. You then apply the trained algorithm to customers whose actions and behaviors you don’t know. It looks for the common elements between the knowns and the unknowns and uses that as the means of classification. That’s a supervised technique because you’re telling your model what the correct answer is, and then it looks at a new data set to tell you which of the two classifications (for example, purchaser or non-purchaser) each person falls into.
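As a rough illustration of the supervised case, the sketch below labels an unknown customer by looking at the most similar known customers (a k-nearest-neighbors classifier). The choice of algorithm, the attributes (sessions, minutes on site) and the data are all assumptions made for the example.

```python
import math

def knn_predict(train_x, train_y, x, k=3):
    """Classify x by majority vote among the k closest labeled customers."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)

# Hypothetical known customers: (sessions, minutes on site) and whether they purchased
known = [(1, 2), (2, 3), (1, 4), (8, 15), (9, 12), (10, 18)]
purchased = [False, False, False, True, True, True]

# Customers whose outcome we do not know yet
likely_buyer = knn_predict(known, purchased, (9, 14))
unlikely_buyer = knn_predict(known, purchased, (2, 2))
print(likely_buyer, unlikely_buyer)
```

The difference from clustering is the `purchased` list: the model is given the correct answers for the known customers and uses them to classify the unknown ones.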


Q: Do you have any recommended book reading lists?

CB: Jim Sterne’s new book Artificial Intelligence for Marketing: Practical Applications is worth a mention here.

I also really enjoyed the book, Data Science for Business. It isn’t a technical read, but gets down to some of the more precise details of how these models work and how they can best be used in a business context.


Q: How could we take traditional media mix modeling (statistical modeling) and turn it into AI?

CB: It is difficult to translate media mix modeling (MMM) into AI because, fundamentally, you need to understand the diminishing return curves for each channel and then how those curves interact to reach the point of optimal return. A better way to look at it is not how you translate MMM into AI, but whether to change your modeling technique, for example by using something like attribution. If you look at the field of attribution, there’s a range of algorithms in use, and the right one depends on the particular scenario.
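For readers who have not worked with diminishing return curves, here is a simplified sketch of what "a point of optimality" means. It assumes each channel follows a logarithmic response curve and that channels do not interact, which a real media mix model would not assume. The channel names and parameters are invented.

```python
import math

# Hypothetical channels: response(spend) = scale * log(1 + spend / saturation)
CHANNELS = {
    "search": {"scale": 120.0, "saturation": 5_000.0},
    "display": {"scale": 80.0, "saturation": 10_000.0},
    "tv": {"scale": 200.0, "saturation": 40_000.0},
}

def response(channel, spend):
    """Concave curve: every extra dollar returns a little less than the last."""
    p = CHANNELS[channel]
    return p["scale"] * math.log1p(spend / p["saturation"])

def allocate(budget, step=1_000):
    """Hand out the budget one step at a time to whichever channel
    currently offers the largest marginal return."""
    spend = {name: 0 for name in CHANNELS}
    for _ in range(int(budget // step)):
        best = max(
            CHANNELS,
            key=lambda c: response(c, spend[c] + step) - response(c, spend[c]),
        )
        spend[best] += step
    return spend

plan = allocate(50_000)
print(plan)
```

Because every curve flattens as spend grows, the step-by-step approach ends up with roughly equal marginal returns across channels, which is the optimal split under these simplified assumptions.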


Q: Could you recommend an example of model that predicts button clicks, from a user coming from our ad, on a landing page? We don’t have CRM data.

CB: Any type of machine learning model could work here. The key is to make sure that the data you have about your users actually contains information that is predictive of that action.

One thing I like to see is a rich campaign tagging taxonomy that allows you to infer demographic/CRM style information without actually having a CRM. For example, if you make a display buy that focuses on in-market male auto-purchasers in the 35 to 44 age group, make sure you are tagging your ads with that information. Then when it comes time to make a prediction, you’ve ‘inferred’ demographic characteristics about your users.
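Here is one way the tagging idea could look in practice. The naming convention inside `utm_content` (pairs such as `gender.m`, separated by underscores) is a made-up example, not a standard; the point is only that a consistent taxonomy can be parsed back into model inputs.

```python
from urllib.parse import urlparse, parse_qs

def inferred_features(landing_url):
    """Turn campaign tags on a landing-page URL into user-level features."""
    query = parse_qs(urlparse(landing_url).query)
    features = {"source": query.get("utm_source", ["unknown"])[0]}
    content = query.get("utm_content", [""])[0]
    for pair in content.split("_"):
        key, sep, value = pair.partition(".")
        if sep:  # skip anything that does not follow the key.value pattern
            features[key] = value
    return features

# Hypothetical tagged URL for the display buy described above
url = (
    "https://example.com/landing?utm_source=display"
    "&utm_campaign=spring_auto"
    "&utm_content=seg.inmarket-auto_gender.m_age.35-44"
)
features = inferred_features(url)
print(features)
```

Each visitor arriving through that ad now carries segment, gender and age-band values that a click-prediction model can use alongside on-page behavior.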


Q: My organization has a marketing database with quality data but no programmers with a mandate to exploit this data. What are the resourcing requirements or options for pursuing this?

JS: You can go out and get your data appended. Companies like Experian can help you find out more about the people you know about so you have an even richer data set. There are also smaller startups that will take your data and perform clustering activities for you to give you back answers to the questions that you have.

Q: Do you see marketing automation tools building-in machine learning in the near term or will separate software be needed for at least the next couple years?

JS: Salesforce has Einstein, IBM has Watson, Adobe has Sensei… Virtually every company that has any customer data and is doing any customer relationship management or behaviour analysis is very aware that these techniques are valuable. I predict that within 3 to 5 years, we won’t be talking about artificial intelligence anymore. Machine learning, yes. The question “Are you using machine learning?” will become like “Are you using a computer?” It is so valuable and so powerful that it’s going to be pushed down to the operating system and it will be part of how we do computing.

Do I need separate software? Today, it’s still very much being researched. And people are calling their machine learning capabilities something different just to differentiate. But given time, it will become a part of computing.

CB: To address the question about tools in the near term: We’re hearing about vendors doing it now. In the last 6 months, Marketo has been making announcements in that area, along with Adobe Sensei and Adobe Campaign. This is happening now, not in a few years.

Q: Do you have an example or case study of an organization successfully using A.I. for customer support/customer service?

JS: I showed an example called Invoca. 3 Day Blinds is one of their customers that is using their product to route phone calls, to make suggestions to customer service reps, and to figure out advertising. When they started out, they spent a fair amount of time teaching the machine and were subsequently able to cut back on the customer service reps that they hired because the machine was helping the reps. They were so well augmented by the machine that they no longer had to hire more. However, as the Invoca product started informing AdWords and running better advertising, they had to hire more customer service reps because the number of phone calls started to climb. There’s an absolute correlation. The equipment is smarter and helps people be smarter.

Q: How do you decide whether a data set is large enough and of sufficient quality to apply machine learning techniques?

CB: It depends on the scenario. One example of the smallest data set we would feel comfortable using is a monthly forecast we set up for a client, based on the past 2 years of data. That is only 24 data points. But we do have the drivers that feed into that forecast, so we can see how they are impacting the output. We would of course prefer more data but that’s about the shortest we’ve been able to cut it.

You really have to consider several factors. First, how complex is what you’re trying to forecast? The more complex, the more data you will need. Second, is it a linear or a nonlinear problem? If you’re working with a nonlinear problem, you’re going to need more data.

Going back to that 2-year forecasting problem: if a major driver of change was not captured in the previous 2 years (let’s say it last occurred in year 3), then the model isn’t going to work, because it missed learning from a very important historical example. So that’s a really important factor: does your data set capture the key drivers of change? If it doesn’t, you’re not going to have a good model.
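To give a feel for what a 24-point, driver-based forecast involves, here is a toy sketch. It fits a straight line by ordinary least squares to one driver, which is a much simpler model than a client forecast would use. The driver (monthly ad spend), the sales figures, and the `covered_by_history` check are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical 24 months of history: ad spend (the driver) and sales
spend = [10 + (m % 12) for m in range(24)]
sales = [50 + 3 * s + (1 if m % 2 else -1) for m, s in enumerate(spend)]

slope, intercept = fit_line(spend, sales)

def forecast(planned_spend):
    """Predicted sales for a planned level of the driver."""
    return intercept + slope * planned_spend

def covered_by_history(planned_spend):
    """True only if the model has seen this level of the driver before."""
    return min(spend) <= planned_spend <= max(spend)

print(round(slope, 2), round(forecast(18), 1), covered_by_history(40))
```

The last check is a crude stand-in for the point above: if next month's conditions look like nothing in the 24 months of history, the fitted line is extrapolating and should not be trusted.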


Q: You mentioned the importance of having the right data. Part of it is having clean data. Can you recommend where we can learn about data cleansing?

CB: I think there are two things to consider here: data cleaning and feature engineering. The two go hand in hand. Data cleaning is about obtaining, ‘fixing’ and organizing data in a way that machine learning algorithms can use. You may have data in the wrong format or in inconsistent formats, with null or missing values, or data that is incomplete or inaccurate. Feature engineering is about creating new variables from the existing data that machine learning algorithms can learn from.
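A small sketch of both steps, using only the Python standard library. The records, field names and cleaning rules (for example, treating a missing order count as zero) are hypothetical choices for this example; the right rules depend on your data.

```python
from datetime import date, datetime

# Hypothetical raw export: mixed date formats, blanks and missing values
RAW = [
    {"customer": "a1", "signup": "2023-01-15", "orders": "3", "revenue": "120.50"},
    {"customer": "b2", "signup": "15/02/2023", "orders": "", "revenue": "80"},
    {"customer": "c3", "signup": "2023-03-01", "orders": "2", "revenue": None},
]

def parse_date(text):
    """Accept either of the two date formats seen in the export."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None

def clean(row):
    """Data cleaning: consistent types, explicit handling of missing values."""
    return {
        "customer": row["customer"],
        "signup": parse_date(row["signup"]),
        "orders": int(row["orders"]) if row["orders"] else 0,
        "revenue": float(row["revenue"]) if row["revenue"] else 0.0,
    }

def add_features(row, as_of):
    """Feature engineering: derive new columns a model can learn from."""
    out = dict(row)
    out["tenure_days"] = (as_of - row["signup"]).days
    out["avg_order_value"] = row["revenue"] / row["orders"] if row["orders"] else 0.0
    return out

AS_OF = date(2023, 4, 1)
rows = [add_features(clean(r), AS_OF) for r in RAW]
for r in rows:
    print(r)
```

Cleaning only makes the existing columns usable; the two derived columns, `tenure_days` and `avg_order_value`, are new information created from them.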

Coursera offers a good course in data cleaning.

And there’s a good example of feature engineering in Python.