How data mining reveals a correlation between diapers and beer sales to boost sales!

Learn how a large supermarket used data mining to discover that diapers and beer sales are related, and then changed the product mix to increase sales. We explain in detail how data mining works, the different types of data mining, and what it can do for you. Data mining can be used in many fields, not just marketing, and will play an increasingly important role in the era of big data.

 

You are the manager of Store A of L-Mart, a large supermarket chain. A competitor, H-Mart, recently opened nearby, and your store’s sales have dropped significantly, so you ask headquarters for help. The employee sent by headquarters tours the store and asks you to send him three months of sales history. A few days later, he hands you a sheet of paper with instructions for rearranging the merchandise. What? Beer next to baby diapers? Putting diapers beside tissues or baby toys would make sense, but beside beer? It defies common sense. Still, you’re too timid to complain to headquarters, so you rearrange the items as directed. A month later, when you check the numbers, you find that beer sales have jumped dramatically, almost unbelievably.
This is not a made-up story; it is adapted from a real case. In a widely cited example, Walmart in the United States analyzed its sales data and found that diaper and beer sales were correlated: fathers often bought beer when they bought diapers. The company took advantage of this by placing diapers and beer together, which significantly increased beer sales.
Let’s go back to the example of Store A. How did the headquarters staff find this correlation? The answer is data mining. Data mining is the process of extracting valuable information from large amounts of data by identifying connections, similarities, patterns, and so on. The data we accumulate every day is just a pile of numbers on its own, and it needs to be “processed” to make it valuable. A tangible analogy for this is ‘junk art’. Junk art is when you make art out of junk or trash. Beautiful and stunning works of art are created from piles of waste tires, cigarette butts, bottle caps, and other trash that most people wouldn’t even look at. Junk artists sort and combine these items to achieve the color, texture, and shine they’re looking for, and the end result is a work of art. Similarly, a lot of data has no value on its own, but when it is sorted and combined with some intention, something valuable is created.
There are five main types of data mining: class description and class discrimination, cluster analysis, association analysis, outlier analysis, and sequential pattern analysis. A class is a set of data grouped together by some criterion. Finding the characteristics shared by the data items in a class is called class description, and finding the distinguishing characteristics that separate the data into two groups is called class discrimination. Cluster analysis is used to find new classes, and association analysis is used to find links between groups of data. Finding data that deviates significantly from the average is called outlier analysis, and analyzing patterns of behavior over time is called sequential pattern analysis.
The type of data mining used in our example is association analysis. In association analysis, three measures are used to determine the degree of association between two items, A and B. The first is ‘support’, which is the percentage of all transactions in which A and B are purchased together. The second is ‘confidence’, which is the percentage of transactions containing A that also contain B. The last is ‘lift’ (sometimes translated as ‘enhancement’), which is the confidence divided by the percentage of all transactions that contain B. If the lift is 1, the two items are independent of each other; if it is greater than 1, they are positively correlated; and if it is less than 1, they are negatively correlated. In practice, the support is calculated first for every pair of items, and the confidence is then calculated only for pairs whose support reaches a minimum level, which the data miner chooses. Finally, the lift is calculated for the pairs that appear to be related, to find out how they are correlated. This is how diapers and beer were found to be related and positively correlated.
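To make the three measures concrete, here is a minimal Python sketch that computes them for a handful of shopping baskets. The transactions are invented purely for illustration (they are not real sales data), and the function names `support`, `confidence`, `lift`, and `associated_pairs` are our own choices rather than part of any particular library. The third measure is the one commonly called lift in English-language references.

```python
from itertools import combinations

# Hypothetical toy baskets, made up for illustration only.
transactions = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "chips"},
    {"milk", "chips"},
    {"bread", "milk"},
]

def support(itemset, transactions):
    """Fraction of all transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(a, b, transactions):
    """Of the transactions containing a, the fraction that also contain b."""
    return support({a, b}, transactions) / support({a}, transactions)

def lift(a, b, transactions):
    """Confidence of a -> b divided by the overall support of b."""
    return confidence(a, b, transactions) / support({b}, transactions)

def associated_pairs(transactions, min_support):
    """Screen pairs by support first, then compute confidence and lift."""
    items = sorted(set().union(*transactions))
    results = []
    for a, b in combinations(items, 2):
        s = support({a, b}, transactions)
        if s >= min_support:  # threshold chosen by the data miner
            results.append(
                (a, b, s, confidence(a, b, transactions), lift(a, b, transactions))
            )
    return results

print(support({"diapers", "beer"}, transactions))    # 0.375
print(confidence("diapers", "beer", transactions))   # 0.75
print(lift("diapers", "beer", transactions))         # 1.5
print(associated_pairs(transactions, min_support=0.3))
```

In this toy data, diapers and beer appear together in 3 of 8 baskets, and the lift of 1.5 is greater than 1, which is the signal of a positive correlation described above. With real data, the minimum support threshold would need to be tuned to the size of the store's transaction history.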
While we’ve only discussed how data mining can be used in marketing, it can be used in a wide range of fields. In finance, it is used for credit scoring, credit card fraud detection, and securities price prediction; in telecommunications, for customer churn prevention, character and pattern recognition, and security management; in healthcare, for disease diagnosis and genetic analysis; in energy, for electricity demand forecasting and resource exploration; and in manufacturing, for new product and service development, defect prediction, factory automation, and inventory and demand management. More generally, it can be used to understand past or present situations, or to predict the future, from given data.
Nowadays, big data is all the rage around the world. With IT devices now ubiquitous, data is constantly being generated. According to one estimate, about 2 trillion GB of data was produced and consumed worldwide in 2011 alone, and 28 billion GB was produced and consumed in Korea. Data mining is necessary to make this data useful instead of letting it go to waste, and as our Internet-based living environment continues to develop, the importance of data mining is expected to grow even more.
Data mining can be used not only to increase sales, but also to identify customer purchasing patterns, conduct customized marketing, improve inventory management efficiency, and develop various strategies to maximize customer satisfaction. Data is the oil of the modern world, and how well you mine and process it can make or break your business, which is why it’s so important to understand data mining and take advantage of it.

 

About the author

Blogger

Hello! Welcome to Polyglottist. This blog is for anyone who loves Korean culture, whether it's K-pop, Korean movies, dramas, travel, or anything else. Let's explore and enjoy Korean culture together!
