
Market Basket Analysis

What is Analytics/Analytical CRM?


Automated extraction of interesting patterns from large databases, e.g.:
Extracts in-depth customer history, preferences and profitability information
Lets you analyze, predict and derive customer behavior and forecast demand

Lets you approach customers with relevant information and offers that are tailored to their needs

Types of Patterns
Associations
Coffee buyers usually also purchase sugar

Sequence Patterns
After seeing Superman, people usually see Star Wars

Clustering
Segments of customers requiring different promotion strategies

Classification
Customers expected to be loyal

Association Rules
Database D:

Transaction ID   Items
1                Tomato, Potato, Onions
2                Tomato, Potato, Brinjal, Pumpkin
3                Tomato, Potato, Onions, Chilly
4                Lemon, Tamarind

Rule: Tomato, Potato => Onion (confidence: 66%, support: 50%)
Support(X) = |transactions containing X| / |D|
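
As a quick check of the rule above, the following minimal Python sketch recomputes its support and confidence. It assumes the four transactions split as listed in the code comments; the function names `support` and `confidence` are illustrative choices, not part of any library.

```python
# Database D, assuming the four transactions read as follows.
D = [
    {"Tomato", "Potato", "Onions"},              # transaction 1
    {"Tomato", "Potato", "Brinjal", "Pumpkin"},  # transaction 2
    {"Tomato", "Potato", "Onions", "Chilly"},    # transaction 3
    {"Lemon", "Tamarind"},                       # transaction 4
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

rule_support = support({"Tomato", "Potato", "Onions"}, D)
rule_confidence = confidence({"Tomato", "Potato"}, {"Onions"}, D)
print(f"support = {rule_support:.0%}, confidence = {rule_confidence:.1%}")
```

Under these assumptions the rule is supported by 2 of 4 transactions (50%), and 2 of the 3 transactions containing Tomato and Potato also contain Onions, which is the roughly 66% confidence quoted with the rule.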

Support: It is a measure of how frequently the rule occurs in the database. Support (%) for A=>B is the percentage of all customers who purchased both A and B.

Confidence: Confidence (%) for A=>B is the number of customers who purchased both A and B, divided by the number of customers who purchased A, expressed as a percentage.

Ex: A supermarket database has 100,000 point-of-sale transactions. Of these transactions, 2000 include both orange juice and flu medication, and 800 of these also include soup. What are the support and confidence for the following rule?
Orange juice, Flu medication => Soup
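
One way to work the supermarket exercise is sketched below in Python, applying the two definitions directly to the counts given in the question. The variable names are illustrative only.

```python
# Counts taken from the exercise statement.
total_transactions = 100_000
juice_and_flu = 2_000        # contain both orange juice and flu medication
juice_flu_and_soup = 800     # of those, also contain soup

# Support: transactions containing antecedent and consequent, over all transactions.
rule_support = juice_flu_and_soup / total_transactions

# Confidence: the same numerator, over transactions containing the antecedent.
rule_confidence = juice_flu_and_soup / juice_and_flu

print(f"support = {rule_support:.1%}")
print(f"confidence = {rule_confidence:.0%}")
```

With these counts the rule has a support of 0.8% (800 / 100,000) and a confidence of 40% (800 / 2,000).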

Lift: Lift for A=>B is a measure of the strength of the association, expressed as a ratio rather than a percentage. If Lift = 2 for the rule A=>B, then a customer having A is twice as likely to have B as a customer chosen at random.
Benchmark Confidence = no. of transactions with consequent item set / no. of transactions in database
Lift Ratio = confidence / benchmark confidence
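
The lift ratio can be sketched in a few lines of Python. The helper `lift_ratio` and its parameter names are hypothetical. The counts in the example are for the Tomato, Potato => Onion rule from the Association Rules slide, assuming Onions appears in 2 of the 4 transactions and Tomato and Potato together in 3.

```python
def lift_ratio(n_both, n_antecedent, n_consequent, n_total):
    """Lift = confidence / benchmark confidence, computed from raw transaction counts."""
    confidence = n_both / n_antecedent              # P(consequent | antecedent)
    benchmark_confidence = n_consequent / n_total   # P(consequent) over all transactions
    return confidence / benchmark_confidence

# Tomato, Potato => Onion: 2 transactions contain all three items,
# 3 contain Tomato and Potato, 2 contain Onions, 4 transactions in total.
lift = lift_ratio(n_both=2, n_antecedent=3, n_consequent=2, n_total=4)
print(f"lift = {lift:.2f}")
```

A lift above 1 suggests that the antecedent makes the consequent more likely than it is in the database as a whole. A lift near 1 suggests little association.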

Class Assignment:

Transaction   Faceplate Colors Purchased
1             red white green
2             white orange
3             white blue
4             red white orange
5             red blue
6             white blue
7             red blue
8             red white blue green
9             red white blue
10            yellow

List the item sets with support of at least 20% (a support count of at least 2 of the 10 transactions)

Item Set               Support (Count)
{red}                  6
{white}                7
{blue}                 6
{orange}               2
{green}                2
{red, white}           4
{red, blue}            4
{red, green}           2
{white, blue}          4
{white, orange}        2
{white, green}         2
{red, white, blue}     2
{red, white, green}    2
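
The support counts above can be reproduced with a brute-force Python sketch that tests every possible item set. This is plain enumeration, not the Apriori algorithm, and it is practical only because there are six colors. The split of colors into ten transactions is an assumption, chosen because it agrees with the support counts listed.

```python
from itertools import combinations

# Assumed split of the faceplate purchases into ten transactions.
transactions = [
    {"red", "white", "green"},
    {"white", "orange"},
    {"white", "blue"},
    {"red", "white", "orange"},
    {"red", "blue"},
    {"white", "blue"},
    {"red", "blue"},
    {"red", "white", "blue", "green"},
    {"red", "white", "blue"},
    {"yellow"},
]

min_count = 2  # 20% of 10 transactions

# Every distinct color, in a fixed order so item sets are reproducible tuples.
items = sorted(set().union(*transactions))

frequent = {}
for size in range(1, len(items) + 1):
    for candidate in combinations(items, size):
        # Count the transactions that contain every item of the candidate set.
        count = sum(set(candidate) <= t for t in transactions)
        if count >= min_count:
            frequent[candidate] = count

for itemset, count in frequent.items():
    print(itemset, count)
```

Under that assumption the sketch keeps 13 item sets. {yellow} is dropped because it appears in only one transaction.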

The Process of Rule Selection

Rule 1: {red, white} => {green} with confidence = support of {red, white, green} / support of {red, white} = 2/4 = 50%

Rule 2: {red, green} => {white} with confidence = support of {red, white, green} / support of {red, green} = 2/2 = 100%

Rule 3: {white, green} => {red} with confidence = support of {red, white, green} / support of {white, green} = 2/2 = 100%

Rule 4: {red} => {white, green} with confidence = support of {red, white, green} / support of {red} = 2/6 = 33%

Rule 5: {white} => {red, green} with confidence = support of {red, white, green} / support of {white} = 2/7 = 29%

Rule 6: {green} => {red, white} with confidence = support of {red, white, green} / support of {green} = 2/2 = 100%

If the desired confidence is 70%, only the 2nd, 3rd and 6th rules are recommended.
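
The selection step can be sketched in Python as follows: take one frequent item set, try each non-empty proper subset as the antecedent, and keep the rules that meet the confidence threshold. The function `rules_from` is a hypothetical helper. The support counts are copied from the item-set table for the faceplate data.

```python
from itertools import combinations

# Support counts for {red, white, green} and its subsets.
support_count = {
    frozenset({"red"}): 6,
    frozenset({"white"}): 7,
    frozenset({"green"}): 2,
    frozenset({"red", "white"}): 4,
    frozenset({"red", "green"}): 2,
    frozenset({"white", "green"}): 2,
    frozenset({"red", "white", "green"}): 2,
}

def rules_from(itemset, counts, min_confidence):
    """Return (antecedent, consequent, confidence) for rules meeting the threshold."""
    itemset = frozenset(itemset)
    kept = []
    for size in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), size):
            antecedent = frozenset(antecedent)
            # confidence = support of the whole item set / support of the antecedent
            conf = counts[itemset] / counts[antecedent]
            if conf >= min_confidence:
                kept.append((antecedent, itemset - antecedent, conf))
    return kept

kept_rules = rules_from({"red", "white", "green"}, support_count, 0.70)
for antecedent, consequent, conf in kept_rules:
    print(sorted(antecedent), "=>", sorted(consequent), f"{conf:.0%}")
```

With a 70% threshold the sketch keeps the three rules whose antecedents are {green}, {red, green} and {white, green}, each with 100% confidence.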

Traditional Measures
Confidence: Likelihood of a rule being true
Support:
Statistical significance: Data supports rule
Applicability: Rule with high support is applicable in a large number of transactions

Applications
E-commerce
People who have bought Sundara Kandam have also bought Srimad Bhagavatham

Census analysis
Immigrants are usually male

Sports
A chess end-game configuration with white pawn on A7 and white knight dominating black rook typically results in a win for white.

Medical diagnosis
Allergy to latex rubber usually co-occurs with allergies to banana and tomato

Recommendation Systems
People who listen to the songs that you listen to have also listened to these other songs
People who have bought these books have also bought these other books
