
📖 Association Rule Mining¶

🧭 Objective

  • 🛒 What is Association Rule Mining?
  • 📌 Use Cases

📦 Data Setup

  • 🧾 Load Dataset
  • 🧹 Preprocessing / Transaction Formatting
  • 🧮 Frequency Encoding (Optional)

📊 Exploratory Data Analysis

  • 📈 Item Frequency Plot
  • 📉 Itemset Statistics

🧰 Apriori Algorithm

  • ⚙️ Setup Parameters
  • 📜 Generate Frequent Itemsets
  • 🔗 Build Association Rules

🧪 Rule Evaluation

  • 📏 Support, Confidence, Lift
  • 📐 Conviction, Leverage (Bonus)
  • 📋 Top Rules by Metric

🧠 Interpretation

  • 🧭 Business Context for Rules
  • 📉 Redundant Rules / Filtering

📊 Visualizations

  • 🧱 Heatmaps, Matrix
  • 🕸 Network Graphs
  • 📊 Bar Chart of Top Rules

🧭 Objective¶

🛒 What is Association Rule Mining?¶


Association Rule Mining helps you discover what items are often bought together.
It looks through customer transactions to find patterns like:

  • "People who buy bread and butter also tend to buy jam"
  • "If a user buys a phone, they often buy a screen protector too"

This technique is unsupervised — there's no target variable.
You're just uncovering co-occurrence patterns from raw data.

🧠 Breakdown¶

An association rule is written as:
A → B
...which means “if A was bought, then B is likely to be bought too.”

You can control how strong or interesting a rule is using:

  • Support: How common is this combo?
  • Confidence: If A happens, how often does B follow?
  • Lift: Is this combo better than random chance?
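To make those three numbers concrete, here's a tiny worked example with made-up counts (toy values, not the dataset used later in this notebook):

# Toy example (hypothetical numbers): 100 orders, 20 contain bread,
# 30 contain jam, and 15 contain both bread and jam
n_orders_total = 100
n_bread = 20
n_jam = 30
n_both = 15

support = n_both / n_orders_total             # 0.15 -> the combo appears in 15% of orders
confidence = n_both / n_bread                 # 0.75 -> 75% of bread buyers also buy jam
lift = confidence / (n_jam / n_orders_total)  # 2.5  -> 2.5x more likely than random chance

print(f"support={support}, confidence={confidence}, lift={lift}")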

📌 Use Cases¶


These are not just grocery store tricks. They show up everywhere:

  • 🛒 Retail: Suggest product bundles ("Customers who bought this also bought...")
  • 📩 Email Marketing: Find item combos for targeted promotions
  • 💳 Banking: Detect if certain services (e.g., savings + insurance) are linked
  • 🏥 Healthcare: Identify symptom-diagnosis patterns across patients
  • 🌐 E-commerce: Power recommendation engines for cross-sells

The goal:
Find co-occurrence relationships that help you act smarter — whether you're selling, diagnosing, or personalizing.

Back to the top


📦 Data Setup¶

🧾 Load Dataset¶

In [1]:
# Create a synthetic transaction dataset (50 orders, 2–5 items each)
import pandas as pd
import numpy as np

np.random.seed(42)

# Define a pool of items and simulate 50 orders
items = ['Milk', 'Bread', 'Butter', 'Eggs', 'Cheese', 'Apples', 'Bananas', 'Juice', 'Cereal', 'Yogurt']
n_orders = 50

# Generate synthetic transactions
transactions = []
for order_id in range(1001, 1001 + n_orders):
    basket_size = np.random.randint(2, 6)  # each order has 2–5 items
    selected_items = np.random.choice(items, size=basket_size, replace=False)
    for item in selected_items:
        transactions.append({'OrderID': order_id, 'Item': item})

df = pd.DataFrame(transactions)
df.head(10)
Out[1]:
OrderID Item
0 1001 Apples
1 1001 Milk
2 1001 Yogurt
3 1001 Bananas
4 1002 Milk
5 1002 Bananas
6 1002 Cereal
7 1002 Bread
8 1002 Yogurt
9 1003 Bananas

🧹 Preprocessing / Transaction Formatting¶

In [2]:
# Convert transaction data to basket-format binary matrix
basket = df.groupby(['OrderID', 'Item']).size().unstack(fill_value=0)

# Binarize: any count > 0 becomes 1 (transactional presence)
basket = (basket > 0).astype(int)
basket.head(10)
Out[2]:
Item Apples Bananas Bread Butter Cereal Cheese Eggs Juice Milk Yogurt
OrderID
1001 1 1 0 0 0 0 0 0 1 1
1002 0 1 1 0 1 0 0 0 1 1
1003 0 1 0 1 1 0 0 0 0 0
1004 1 0 1 0 1 1 0 0 0 0
1005 1 0 0 1 0 0 0 1 0 1
1006 0 0 0 0 0 0 0 1 1 1
1007 1 0 0 0 1 1 0 0 0 1
1008 0 0 0 1 0 0 0 1 1 0
1009 0 0 0 1 1 1 0 0 0 0
1010 0 1 0 1 0 0 0 0 0 0

🧮 Frequency Encoding (Optional)¶


Sometimes it's helpful to analyze item frequency — how often each product is purchased — before mining rules.
This helps you:

  • Spot dominant or underrepresented products
  • Filter rare items (e.g., items bought < 2 times)
  • Tune support thresholds more intelligently
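A minimal sketch of that kind of pre-filtering, assuming the basket matrix built in the previous step (the cutoff of 2 purchases is just an illustrative threshold):

# Count how many orders each item appears in
item_counts = basket.sum().sort_values(ascending=False)

# Flag items bought fewer than 2 times (illustrative cutoff)
rare_items = item_counts[item_counts < 2].index.tolist()
print("Rarely purchased items:", rare_items)

# Optionally drop them so one-off purchases don't distort support thresholds
basket_filtered = basket.drop(columns=rare_items)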

Back to the top


📊 Exploratory Data Analysis¶

📈 Item Frequency Plot¶

In [3]:
import matplotlib.pyplot as plt

# Frequency of each item across all orders
item_freq = basket.sum().sort_values(ascending=False)

plt.figure(figsize=(10, 4))
item_freq.plot(kind='bar', color='skyblue')
plt.title("📈 Item Frequency Across Transactions")
plt.ylabel("Count")
plt.xticks(rotation=45, ha='right')
plt.grid(axis='y', linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()
[Figure: bar chart of item frequency across transactions]

📉 Itemset Statistics¶


This gives you a sense of basket richness — how many items customers typically buy per transaction.
Use this to evaluate if the data is too sparse or too dense before rule mining.

In [4]:
mean_basket = basket.sum(axis=1).mean()
min_basket = basket.sum(axis=1).min()
max_basket = basket.sum(axis=1).max()

print(f"🧺 Average basket size: {mean_basket:.2f} items")
print(f"📉 Smallest basket: {min_basket} items")
print(f"📈 Largest basket: {max_basket} items")
🧺 Average basket size: 3.48 items
📉 Smallest basket: 2 items
📈 Largest basket: 5 items

Back to the top


🧰 Apriori Algorithm¶


Apriori is a classic algorithm that helps you find product combinations that occur frequently in customer orders.

The core idea:

"If smaller combos are rare, bigger combos with those items will also be rare."

Apriori starts with individual items, then builds up to larger combinations only if smaller ones are frequent.
This keeps the search fast and focused.
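You can sanity-check that property directly on the basket matrix built earlier; the Bread/Butter pair below is just an arbitrary example:

# Downward-closure property: an itemset's support can never exceed the
# support of any of its subsets, so infrequent items prune whole branches
support_bread = basket['Bread'].mean()
support_bread_butter = basket[['Bread', 'Butter']].all(axis=1).mean()

print(f"support(Bread)         = {support_bread:.2f}")
print(f"support(Bread, Butter) = {support_bread_butter:.2f}  # always <= support(Bread)")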

You'll use Apriori to:

  • Extract common itemsets
  • Feed those into a rule builder
  • Surface patterns like “If A is bought, B follows”

⚙️ Setup Parameters¶


Apriori requires setting three thresholds to control what gets mined:

  • min_support: How often a product combo occurs
    → "Only show me combos bought in at least 10% of orders"

  • min_confidence: How reliable a rule is
    → "If people buy A, how often do they also buy B?"

  • min_lift: Does the combo actually matter or is it random?
    → Lift > 1 = better than chance

You’ll usually:

  • Use support to filter volume
  • Use confidence to filter risk
  • Use lift to filter noise
In [5]:
# Set thresholds for rule mining
min_support = 0.1       # Appears in at least 10% of orders
min_confidence = 0.5    # At least 50% confidence in the rule
min_lift = 1.2          # Must be better than chance

📜 Generate Frequent Itemsets¶


Frequent itemsets are combinations of products that occur together more than the minimum support.

For example:
{Bread, Butter} → Support = 0.18 → means it appeared in 18% of all orders

These are the building blocks for generating association rules.

In [6]:
from mlxtend.frequent_patterns import apriori

# Generate frequent itemsets
frequent_itemsets = apriori(basket,
                            min_support=min_support,
                            use_colnames=True)

# Sort by support to show top combos
frequent_itemsets.sort_values(by='support', ascending=False).head()
Out[6]:
support itemsets
4 0.46 (Cereal)
2 0.44 (Bread)
9 0.42 (Yogurt)
5 0.36 (Cheese)
0 0.34 (Apples)

🔗 Build Association Rules¶


Once you have frequent itemsets, you can build association rules like:

If A is bought → B is also likely
Example: {Bread} → {Butter}

Each rule is scored using:

  • Support: How often this combo appears
  • Confidence: How reliable this rule is
  • Lift: How much stronger this rule is compared to random chance

You can filter rules to focus on high-impact patterns.

In [7]:
from mlxtend.frequent_patterns import association_rules

# Build rules from itemsets
rules = association_rules(frequent_itemsets, 
                          metric="lift", 
                          min_threshold=min_lift)

# Filter by confidence
rules = rules[rules['confidence'] >= min_confidence]

# Show top rules
rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']] \
    .sort_values(by='lift', ascending=False).head()
Out[7]:
antecedents consequents support confidence lift
8 (Cereal, Apples) (Cheese) 0.12 0.857143 2.380952
5 (Eggs) (Yogurt) 0.14 0.777778 1.851852
6 (Cheese, Cereal) (Apples) 0.12 0.600000 1.764706
7 (Cheese, Apples) (Cereal) 0.12 0.750000 1.630435
2 (Cheese) (Cereal) 0.20 0.555556 1.207729

Back to the top


🧪 Rule Evaluation¶

📏 Support, Confidence, Lift¶


These are the core metrics used to evaluate association rules:

  • Support = % of total orders that contain both A and B
    → "How common is this combo?"

  • Confidence = % of orders with A that also have B
    → "If a customer buys A, how often do they buy B too?"

  • Lift = How much more likely B is, given A, compared to random
    → Lift > 1 = meaningful; Lift < 1 = negative association (A and B co-occur less often than chance)

You’ll use these to filter useful rules from noise.
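As a sanity check, all three can be recomputed for a single rule straight from the binary basket matrix. The sketch below does this for {Eggs} → {Yogurt}; the numbers should line up with the rule table above.

# Recompute support, confidence, and lift for {Eggs} -> {Yogurt} by hand
a, b = 'Eggs', 'Yogurt'

support_ab = (basket[a] & basket[b]).mean()  # P(A and B)
support_a = basket[a].mean()                 # P(A)
support_b = basket[b].mean()                 # P(B)

confidence = support_ab / support_a          # P(B | A)
lift = confidence / support_b                # P(B | A) / P(B)

print(f"support={support_ab:.2f}, confidence={confidence:.3f}, lift={lift:.3f}")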

In [8]:
rules[['support', 'confidence', 'lift']].describe()
Out[8]:
support confidence lift
count 6.000000 6.000000 6.000000
mean 0.133333 0.682672 1.673901
std 0.035024 0.128973 0.441895
min 0.100000 0.555556 1.207729
25% 0.120000 0.566667 1.313406
50% 0.120000 0.675000 1.697570
75% 0.135000 0.770833 1.830065
max 0.200000 0.857143 2.380952

📐 Conviction, Leverage (Bonus)¶


These are bonus metrics that add nuance:

  • Leverage = How far the observed support is from expected support (under independence)
    → Bigger = more surprising

  • Conviction = How much more often the rule would be wrong if A and B were independent, compared to how often it actually is wrong
    → Conviction of 1 = no info (independence); higher = stronger rule

Most people stick to Support–Confidence–Lift, but these help in fine-tuning or ranking rules.
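For reference, both follow directly from the same quantities used above. Here's a minimal sketch of the standard definitions (plain helper functions, not an mlxtend call):

# Standard definitions, written out for a single rule A -> B
def leverage(support_ab, support_a, support_b):
    # observed co-occurrence minus what independence would predict
    return support_ab - support_a * support_b

def conviction(support_b, confidence_ab):
    # how much more often the rule would fail under independence vs. in reality
    if confidence_ab == 1.0:
        return float('inf')  # a rule that is never wrong
    return (1 - support_b) / (1 - confidence_ab)

# Example with the {Eggs} -> {Yogurt} rule, using the basket matrix built earlier
s_ab = (basket['Eggs'] & basket['Yogurt']).mean()
s_a, s_b = basket['Eggs'].mean(), basket['Yogurt'].mean()
print(leverage(s_ab, s_a, s_b), conviction(s_b, s_ab / s_a))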

In [9]:
rules[['leverage', 'conviction']].describe()
Out[9]:
leverage conviction
count 6.000000 6.000000
mean 0.047333 2.221667
std 0.019417 1.233550
min 0.017200 1.215000
25% 0.037400 1.323750
50% 0.049200 1.905000
75% 0.061300 2.497500
max 0.069600 4.480000

📋 Top Rules by Metric¶


Let’s pull out the most interesting rules based on each metric:

  • High lift: Strongest associations (most meaningful)
  • High confidence: Most reliable predictors
  • High leverage: Most unexpected combos

This helps shortlist rules for action.

In [10]:
def display_top_rules(rules, metric, n=5):
    print(f"\n📌 Top {n} Rules by {metric}")
    return rules.sort_values(by=metric, ascending=False)[
        ['antecedents', 'consequents', 'support', 'confidence', 'lift', metric]
    ].head(n)

# Note: in a notebook only the last call's returned DataFrame is rendered,
# so all three headers print but only the leverage table appears below
display_top_rules(rules, 'lift')
display_top_rules(rules, 'confidence')
display_top_rules(rules, 'leverage')
📌 Top 5 Rules by lift

📌 Top 5 Rules by confidence

📌 Top 5 Rules by leverage
Out[10]:
antecedents consequents support confidence lift leverage
8 (Cereal, Apples) (Cheese) 0.12 0.857143 2.380952 0.0696
5 (Eggs) (Yogurt) 0.14 0.777778 1.851852 0.0644
6 (Cheese, Cereal) (Apples) 0.12 0.600000 1.764706 0.0520
7 (Cheese, Apples) (Cereal) 0.12 0.750000 1.630435 0.0464
2 (Cheese) (Cereal) 0.20 0.555556 1.207729 0.0344

Back to the top


🧠 Interpretation¶

🧭 Business Context for Rules¶


A rule like {Bread} → {Butter} is just math — until you attach meaning.

Here's how to read it in a business context:

  • Support = 0.18 → This combo shows up in 18% of orders. That’s fairly common.
  • Confidence = 0.6 → If someone buys Bread, they buy Butter 60% of the time.
  • Lift = 2.0 → This is twice as likely as random chance. That’s a strong relationship.

🔍 Now ask:

  • Should you bundle these items in promotions?
  • Is it worth adding cross-sell nudges in your UI?
  • Do these combos vary by store, channel, or segment?

Raw rules are just the start — context makes them useful.

In [11]:
def explain_rules(rules_df, n=5):
    top = rules_df.sort_values(by='lift', ascending=False).head(n)
    summaries = []

    for _, row in top.iterrows():
        ant = ', '.join(list(row['antecedents']))
        con = ', '.join(list(row['consequents']))
        support = row['support']
        confidence = row['confidence']
        lift = row['lift']

        sentence = (f"If a customer buys {ant}, there's a {confidence:.0%} chance they'll also buy {con} "
                    f"(Lift: {lift:.2f}, Support: {support:.0%})")

        summaries.append(sentence)

    for s in summaries:
        print("🧠", s)

# Call the function to generate plain-English summaries for top 5 rules
explain_rules(rules)
🧠 If a customer buys Cereal, Apples, there's a 86% chance they'll also buy Cheese (Lift: 2.38, Support: 12%)
🧠 If a customer buys Eggs, there's a 78% chance they'll also buy Yogurt (Lift: 1.85, Support: 14%)
🧠 If a customer buys Cheese, Cereal, there's a 60% chance they'll also buy Apples (Lift: 1.76, Support: 12%)
🧠 If a customer buys Cheese, Apples, there's a 75% chance they'll also buy Cereal (Lift: 1.63, Support: 12%)
🧠 If a customer buys Cheese, there's a 56% chance they'll also buy Cereal (Lift: 1.21, Support: 20%)

📉 Redundant Rules / Filtering¶


Many rules repeat the same signal in longer forms. For example:

  • {Bread} → {Butter}
  • {Bread, Milk} → {Butter}

The second one isn't telling you much more than the first — it’s redundant.

You can clean up noise by:

  • Filtering to rules with unique antecedents
  • Removing rules nested inside others (subset/superset logic)
  • Using metrics like conviction or leverage to prioritize surprise

This improves signal quality — especially when presenting to non-technical teams.
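The next cell applies the first two ideas (short antecedents plus de-duplication). The third, subset/superset pruning, isn't implemented there; a rough sketch of what it could look like (hypothetical helper, not an mlxtend function) is:

# Hypothetical helper: drop a rule when another rule with a strict subset of its
# antecedent and the same consequent is at least as confident
def prune_redundant(rules_df):
    keep = []
    for i, row in rules_df.iterrows():
        redundant = any(
            (other['consequents'] == row['consequents'])
            and (other['antecedents'] < row['antecedents'])   # strict subset check on frozensets
            and (other['confidence'] >= row['confidence'])
            for j, other in rules_df.iterrows() if j != i
        )
        keep.append(not redundant)
    return rules_df[keep]

# Usage (illustrative): rules_pruned = prune_redundant(rules)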

In [12]:
# Filter: keep only rules with small antecedents (e.g. 1–2 items)
rules_filtered = rules[rules['antecedents'].apply(lambda x: len(x) <= 2)]

# Optional: drop rules that are subsets of others (basic deduplication)
unique_rules = rules_filtered.drop_duplicates(subset=['antecedents', 'consequents'])

print(f"🧹 Final set: {len(unique_rules)} rules after removing long/duplicated patterns")
unique_rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']].head()
🧹 Final set: 6 rules after removing long/duplicated patterns
Out[12]:
antecedents consequents support confidence lift
2 (Cheese) (Cereal) 0.20 0.555556 1.207729
5 (Eggs) (Yogurt) 0.14 0.777778 1.851852
6 (Cheese, Cereal) (Apples) 0.12 0.600000 1.764706
7 (Cheese, Apples) (Cereal) 0.12 0.750000 1.630435
8 (Cereal, Apples) (Cheese) 0.12 0.857143 2.380952

Back to the top


📊 Visualizations¶

🧱 Heatmaps, Matrix¶

In [13]:
import seaborn as sns
import matplotlib.pyplot as plt

# Item-by-item co-occurrence counts across all orders
co_matrix = basket.T @ basket
plt.figure(figsize=(8, 6))
sns.heatmap(co_matrix, cmap="Blues", annot=False)
plt.title("🧱 Item Co-Occurrence Matrix")
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()
[Figure: heatmap of the item co-occurrence matrix]

🕸 Network Graphs¶


This network graph visualizes association rules as nodes and edges:

  • Nodes = items
  • Edges = directional rule from A to B

Stronger rules = thicker or darker edges.

In [14]:
import networkx as nx

G = nx.DiGraph()

for _, row in rules.head(10).iterrows():
    for antecedent in row['antecedents']:
        for consequent in row['consequents']:
            G.add_edge(antecedent, consequent, weight=row['lift'])

plt.figure(figsize=(8, 6))
pos = nx.spring_layout(G, k=0.5, seed=42)

edges = G.edges(data=True)
weights = [d['weight'] for _, _, d in edges]

nx.draw_networkx(G, pos,
    with_labels=True,
    node_color='lightblue',
    edge_color=weights,
    edge_cmap=plt.cm.Blues,
    width=2.0
)

plt.title("🕸 Association Rule Network")
plt.tight_layout()
plt.show()
[Figure: association rule network graph]

📊 Bar Chart of Top Rules¶

In [15]:
top_rules = rules.sort_values(by='lift', ascending=False).head(10)
labels = [f"{', '.join(ant)} → {', '.join(con)}" 
          for ant, con in zip(top_rules['antecedents'], top_rules['consequents'])]

plt.figure(figsize=(10, 4))
plt.barh(labels, top_rules['lift'], color='purple')
plt.xlabel("Lift")
plt.title("📊 Top 10 Rules by Lift")
plt.gca().invert_yaxis()
plt.tight_layout()
plt.show()
[Figure: horizontal bar chart of the top 10 rules by lift]

Back to the top