Lets you approach customers with relevant information and offers that are tailored to their needs
Types of Patterns
Associations
Coffee buyers usually also purchase sugar
Sequence Patterns
After seeing Superman, people usually see Star Wars
Clustering
Segments of customers requiring different promotion strategies
Classification
Customers expected to be loyal
Association Rules
Transaction database D:
Transaction ID    Items
1                 Tomato, Potato, Onions
2
3
4
Rule: Tomato, Potato => Onion (confidence: 66%, support: 50%)
Support(X) = |transactions containing X| / |D|
Support: It is a measure of how frequently the rule occurs in the database. Support (%) for A=>B is the percentage of all customers who purchased both A and B.
Confidence: Confidence (%) for A=>B is the number of customers who purchased both A and B divided by the number of customers who purchased A, expressed as a percentage.
Ex: A supermarket database has 100,000 point-of-sale transactions. Of these transactions, 2000 include both orange juice and flu medications, and 800 of these include soup purchases. What are the support and confidence for the following rule?
Orange juice, Flu medication => Soup
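The supermarket question can be worked directly from the two definitions. The short Python sketch below uses only the counts given in the example; the variable names are illustrative.

```python
# Counts taken from the supermarket example above.
total_transactions = 100_000
antecedent_count = 2_000   # orange juice AND flu medication
rule_count = 800           # orange juice AND flu medication AND soup

# Support: share of ALL transactions that contain every item in the rule.
support = rule_count / total_transactions

# Confidence: share of antecedent transactions that also contain the consequent.
confidence = rule_count / antecedent_count

print(f"support = {support:.1%}")        # support = 0.8%
print(f"confidence = {confidence:.1%}")  # confidence = 40.0%
```

So the rule holds in 0.8% of all transactions, and 40% of the baskets with orange juice and flu medication also contain soup.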
Lift: Lift for A=>B is a measure of the strength of the association, expressed as a ratio rather than a percentage. If Lift = 2 for the rule A=>B, then a customer having A is twice as likely to have B as a customer chosen at random.
Benchmark Confidence = no. of transactions with consequent item set / no. of transactions in database
Lift Ratio = confidence / benchmark confidence
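The two formulas combine into a single calculation from raw counts. The Python sketch below is one way to write it; the function name `lift` and the counts in the example call are hypothetical, chosen only to reproduce the "Lift = 2" case described above.

```python
def lift(n_both, n_antecedent, n_consequent, n_total):
    """Lift ratio of A=>B computed from raw transaction counts."""
    confidence = n_both / n_antecedent            # P(B | A)
    benchmark_confidence = n_consequent / n_total  # P(B) over the whole database
    return confidence / benchmark_confidence

# Hypothetical counts (not from the slides): 1,000 transactions,
# 100 contain A, 200 contain B, 40 contain both A and B.
example = lift(n_both=40, n_antecedent=100, n_consequent=200, n_total=1000)
print(round(example, 2))  # 2.0
```

Here confidence is 40% and benchmark confidence is 20%, so a customer with A is twice as likely to have B as a customer chosen at random. A lift near 1 would mean A tells you little about B.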
Transaction    Items
1              red, white, green
2              white, orange
3              white, blue
4              red, white, orange
5              red, blue
6              white, blue
7              red, blue
8              red, white, blue, green
9              red, white, blue
10             yellow
Item Set               Support (Count)
{red}                  6
{white}                7
{blue}                 6
{orange}               2
{green}                2
{red, white}           4
{red, blue}            4
{red, green}           2
{white, blue}          4
{white, orange}        2
{white, green}         2
{red, white, blue}     2
{red, white, green}    2
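The support counts can be checked with a few lines of Python. This sketch makes two assumptions: the color list splits into the ten transactions shown in the code (a split that reproduces every count in the table), and the minimum support count is 2, which is inferred from the table leaving out item sets that occur only once. It enumerates every subset by brute force, which is fine for ten small baskets but is not the Apriori algorithm.

```python
from collections import Counter
from itertools import combinations

# Assumed split of the color list into ten transactions.
transactions = [
    {"red", "white", "green"},
    {"white", "orange"},
    {"white", "blue"},
    {"red", "white", "orange"},
    {"red", "blue"},
    {"white", "blue"},
    {"red", "blue"},
    {"red", "white", "blue", "green"},
    {"red", "white", "blue"},
    {"yellow"},
]

def support_counts(transactions, min_count=2):
    """Count every non-empty item subset and keep those seen at least min_count times."""
    counts = Counter()
    for items in transactions:
        for size in range(1, len(items) + 1):
            for subset in combinations(sorted(items), size):
                counts[frozenset(subset)] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_count}

frequent = support_counts(transactions)

# Print smallest item sets first, alphabetically within each size.
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

Under these assumptions the function returns exactly the 13 item sets listed in the table.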
The Process of Rule Selection
Rule1: {red,white}=>{green} with confidence = support of {red,white,green}/support of {red,white} = 2/4 = 50%
Rule2: {red,green}=>{white} with confidence = support of {red,white,green}/support of {red,green} = 2/2 = 100%
Rule3: {white,green}=>{red} with confidence = support of {red,white,green}/support of {white,green} = 2/2 = 100%
Rule4: {red}=>{white,green} with confidence = support of {red,white,green}/support of {red} = 2/6 = 33%
Rule5: {white}=>{red,green} with confidence = support of {red,white,green}/support of {white} = 2/7 = 29%
Rule6: {green}=>{red,white} with confidence = support of {red,white,green}/support of {green} = 2/2 = 100%
If the desired confidence is 70%, only the 2nd, 3rd and 6th rules are recommended.
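The selection step can be sketched in Python as follows. The code assumes the color list splits into the ten transactions shown; `count` and `rules_from` are illustrative helper names, not part of any library.

```python
from itertools import combinations

# Assumed split of the color list into ten transactions.
transactions = [
    {"red", "white", "green"},
    {"white", "orange"},
    {"white", "blue"},
    {"red", "white", "orange"},
    {"red", "blue"},
    {"white", "blue"},
    {"red", "blue"},
    {"red", "white", "blue", "green"},
    {"red", "white", "blue"},
    {"yellow"},
]

def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def rules_from(itemset, min_confidence):
    """Split itemset into antecedent => consequent and keep the confident rules."""
    itemset = frozenset(itemset)
    itemset_count = count(itemset)
    kept = []
    for size in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), size):
            antecedent = frozenset(antecedent)
            confidence = itemset_count / count(antecedent)
            if confidence >= min_confidence:
                kept.append((sorted(antecedent), sorted(itemset - antecedent), confidence))
    return kept

kept = rules_from({"red", "white", "green"}, min_confidence=0.70)
for antecedent, consequent, confidence in kept:
    print(antecedent, "=>", consequent, f"{confidence:.0%}")
```

With a 70% threshold, three rules survive, each with 100% confidence: {green}=>{red,white}, {red,green}=>{white} and {white,green}=>{red}.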
Traditional Measures
Confidence: Likelihood of a rule being true
Support:
Statistical significance: Data supports rule
Applicability: Rule with high support is applicable in a large number of transactions
Applications
E-commerce
People who have bought Sundara Kandam have also bought Srimad Bhagavatham
Census analysis
Immigrants are usually male
Sports
A chess end-game configuration with white pawn on A7 and white knight dominating black rook typically results in a win for white.
Medical diagnosis
Allergy to latex rubber usually co-occurs with allergies to banana and tomato
Recommendation Systems
People who listen to the songs you listen to have also listened to these other songs
People who have bought these books have also bought these other books