Turning Data Into $23M Revenue

A structured data analysis workflow that identified the most effective promotional strategy for a fast-food chain's new menu item launch, delivering measurable business impact through statistical rigor.

This is a Kaggle dataset exploration intended to showcase my analysis process—from scoping and assumptions to EDA, modeling, and communication. It is not a client case study. All results are for demonstration and are fully reproducible in my GitHub repo.

Role

Data Analyst

Methodology

Ask, Prepare, Process, Analyze, Share, Act

Deliverables

Analysis Report, Visualizations, Recommendations

Tools

Python, Jupyter Notebooks, Pandas

My Data Analysis Process

I follow a structured 6-step methodology that ensures every analysis starts with the right questions and ends with actionable business recommendations.

01 · Ask — Define the Problem

To avoid assumptions and keep the project focused, I worked with stakeholders to define the problem through a series of guiding questions.

What are we measuring?

Total sales, average weekly sales, or sales growth across the four-week window.

Which promotion wins?

Identify whether Promotion 1, 2, or 3 generates the greatest impact on sales performance.

Is the difference meaningful?

Determine if observed sales differences are statistically significant or likely due to chance.

Do store factors matter?

Assess how market size and store age influence promotion performance.

What does success look like?

Clarify whether the goal is immediate revenue, long-term adoption, or consistency across markets.

What timeframe guides decisions?

Define the measurement window and how results should inform rollout timing.

By clarifying these questions up front, I aligned the analysis with the company's business objectives: to identify, with confidence, which promotion strategy would yield the highest sales and provide a reliable playbook for scaling the launch across all markets.

02 · Prepare — Gathering and Organizing the Data

With the business questions defined, the next step was to identify the data needed to answer them.

MarketID & LocationID

Unique identifiers for each store (used to join and deduplicate rows).

MarketSize

Categorical: Small, Medium, Large — captures local demand and demographics.

AgeOfStore

Store maturity in years (1–28); used to check lifecycle effects on promotion uptake.

Promotion / Week / SalesInThousands

Promotion label (1–3), week index (1–4), and weekly sales for the new item (USD, thousands).

Sample Data Structure

MarketID | MarketSize | LocationID | AgeOfStore | Promotion | Week | Sales (k USD)
1 | Medium | 1 | 4 | 3 | 1 | 33.73
1 | Medium | 1 | 4 | 3 | 2 | 35.67
1 | Medium | 1 | 4 | 3 | 3 | 29.03
1 | Medium | 1 | 4 | 3 | 4 | 39.25
1 | Medium | 2 | 5 | 2 | 1 | 27.81
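
As a quick illustration of this step, a minimal loading-and-inspection sketch in pandas (the filename is an assumption; point it at wherever the Kaggle CSV lives locally):

    import pandas as pd

    # Filename is an assumption; adjust to your local copy of the Kaggle CSV.
    df = pd.read_csv("WA_Marketing-Campaign.csv")

    print(df.shape)    # one row per store per week: 549 rows x 7 columns
    print(df.dtypes)   # MarketSize is a string category; Promotion and Week are integer codes
    print(df.head())   # should match the sample rows above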

By preparing the data in this way, I created a reliable foundation for analysis. This step ensured that any differences in sales performance identified later would be attributed to the promotions themselves rather than inconsistencies in the dataset.

03 · Process — Cleaning and Validating the Data

Before diving into analysis, I confirmed the dataset was accurate, consistent, and ready to support valid results.

Validation Checklist

  • No missing values across columns
  • No duplicate rows detected
  • Categoricals standardized (MarketSize, Promotion)

Key Fixes

No substantial cleaning was required. The dataset was well-structured and ready for analysis.
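
For illustration, the checklist above maps to a few lines of pandas; this is a sketch of the checks, not a verbatim excerpt from the notebook:

    # Validation checks mirroring the checklist above
    assert df.isna().sum().sum() == 0, "missing values found"
    assert not df.duplicated().any(), "duplicate rows found"

    # Standardize categorical labels so grouping and tests stay consistent
    df["MarketSize"] = df["MarketSize"].str.strip().str.title()
    df["Promotion"] = df["Promotion"].astype("category")

    # Range and type sanity checks
    assert df["Week"].between(1, 4).all()
    assert df["AgeOfStore"].between(1, 28).all()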

Quick Stats

549 rows · 3 markets · 4 weeks

Even though the dataset was relatively clean, this step is critical. Dirty data can lead to biased or misleading results, especially in statistical testing. By validating the structure upfront, I built confidence that any patterns or differences uncovered later would reflect real trends rather than errors in the data.

04 · Analyze — Exploring and Testing Hypotheses

With the data prepared, I moved into analysis. The first step was exploratory: visualizing sales trends across markets, promotions, and store characteristics.

Exploratory Analysis

Visualizations make complex data scannable, surfacing patterns, trends, and outliers at a glance. These charts reveal key insights about store characteristics and promotion performance.

Store Age Distribution

[Bar chart: store age distribution; Frequency vs. Store Age (Years)]

The majority of stores are relatively new, with a significant proportion falling within the first few years of operation (a right-skewed distribution).

Average Sales by Market Size

[Bar chart: average sales by market size; Small $57K, Medium $44K, Large $70K]

Small and Large markets show higher mean sales, suggesting potential market saturation or other factors influencing sales in Medium markets.

Average Promotion Performance

[Line chart: average sales by week across all promotions; Weeks 1–4: 53.8, 53.4, 53.5, 53.2 K USD]

Average sales across all promotions were essentially flat over the four-week window, establishing a stable baseline before comparing promotions directly.

Weekly Sales by Promotion

[Line chart: weekly sales by promotion; Promotions 1–3 over Weeks 1–4, Sales in K USD]

Promotion 2 consistently underperformed across the promotional period, while Promotions 1 and 3 showed stronger performance.
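
The summaries and charts above come from straightforward pandas/matplotlib calls; a representative sketch (column names follow the dataset, plot styling is illustrative):

    import matplotlib.pyplot as plt

    # Mean sales by market size (the values behind the bar chart above)
    print(df.groupby("MarketSize")["SalesInThousands"].mean().round(1))

    # Weekly mean sales per promotion, reproducing the line chart above
    weekly = (
        df.groupby(["Week", "Promotion"], observed=True)["SalesInThousands"]
          .mean()
          .unstack("Promotion")
    )
    ax = weekly.plot(marker="o", title="Weekly Sales by Promotion")
    ax.set_xlabel("Week")
    ax.set_ylabel("Sales (K USD)")
    plt.show()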

Statistical Testing

Visual patterns alone aren't enough — they could be due to random variation. To confirm whether the differences were real and meaningful, I applied several statistical tests.

Chi-Square Test

Purpose:

Test independence between market size and promotion assignment

Result:

p > 0.05 (no significant association)

Interpretation:

Promotions were fairly distributed across markets

ANOVA

Purpose:

Compare means across three promotion groups

Result:

F ≈ 21.95, p < 0.0001

Interpretation:

At least one promotion performed significantly differently

Tukey's HSD

Purpose:

Identify which specific promotions differ

Result:

P1 & P3 > P2 (significant)

Interpretation:

Promotions 1 and 3 both outperformed Promotion 2

The combination of ANOVA, Tukey's HSD, and Chi-Square testing provided statistical confidence that Promotions 1 and 3 consistently outperformed Promotion 2, with no significant difference between the top two performers.
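
All three tests are available in scipy and statsmodels; a minimal sketch of how they fit together, with df the prepared DataFrame from earlier:

    import pandas as pd
    from scipy.stats import chi2_contingency, f_oneway
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    # Chi-square: was promotion assignment independent of market size?
    table = pd.crosstab(df["MarketSize"], df["Promotion"])
    chi2, p_chi, dof, _ = chi2_contingency(table)
    print(f"chi-square p = {p_chi:.3f}")           # p > 0.05: balanced assignment

    # One-way ANOVA: do mean weekly sales differ across the three promotions?
    groups = [g["SalesInThousands"].to_numpy()
              for _, g in df.groupby("Promotion", observed=True)]
    f_stat, p_anova = f_oneway(*groups)
    print(f"F = {f_stat:.2f}, p = {p_anova:.2e}")  # F = 21.95, p < 0.0001

    # Tukey's HSD: which specific promotion pairs differ?
    print(pairwise_tukeyhsd(df["SalesInThousands"], df["Promotion"].astype(str)))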

05 · Share — Communicating Findings

Numbers alone don't drive decisions — clarity does. After completing the analysis, my goal was to translate statistical results into a narrative that executives, marketers, and non-technical stakeholders could easily grasp.

Executive Summary

  • Promotions 1 and 3 both significantly outperformed Promotion 2
  • No significant difference between Promotions 1 and 3
  • Market size and store age did not materially affect performance
Audience: C-Suite

Statistical Significance

  • p < 0.0001: under a 0.01% chance of differences this large if the promotions were truly equal
  • Results are likely to replicate across similar markets
  • Confidence level enables decisive action
Audience: Analysts

Tailored Messaging

  • Executives: Financial impact and bottom line
  • Marketing: Detailed visuals and performance trends
  • Analysts: Statistical outputs and methodology
Audience: All Teams

Visualization Strategy

  • Bar charts: Compare average weekly sales
  • Line charts: Show trends over four weeks
Audience: Stakeholders

By packaging the results in clear, audience-specific formats, I bridged the gap between rigorous statistical testing and actionable business decisions.

06 · Act — Turning Insights into Action

Analysis is only valuable if it drives decisions. With the findings in hand, I translated statistical evidence into clear recommendations and projected business impact.

Scale Promotions 1 & 3

Both significantly outperformed Promotion 2 with no meaningful difference between them; either campaign can be rolled out chain-wide with confidence.

Reevaluate Promotion 2

Given its consistent underperformance, redesign or discontinue Promotion 2. Conduct brief qualitative research to understand why it failed to resonate.

Targeted Testing of Variations

Run follow-up A/B tests on creative, messaging, and timing within Promotions 1 & 3 to optimize before full-scale rollout.
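
To size such follow-up tests, a power-analysis sketch with statsmodels; the effect size, significance level, and power target here are illustrative assumptions, not values derived from this analysis:

    from statsmodels.stats.power import TTestIndPower

    # Assumed inputs: small-to-medium effect (d = 0.3), 5% significance, 80% power
    n_per_variant = TTestIndPower().solve_power(
        effect_size=0.3, alpha=0.05, power=0.8
    )
    print(f"~{n_per_variant:.0f} stores per variant")  # roughly 175 per arm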

Monitor Performance Post-Launch

Track weekly sales lift, customer adoption, and repeat purchases; re-run statistical tests periodically to validate sustained performance.

Final Takeaway for Stakeholders

Promotions 1 and 3 are both proven winners. By scaling them and eliminating Promotion 2, the chain can unlock tens of millions in additional revenue while ensuring the new menu item launches with maximum impact.

Business Impact Estimates

$46K

Average lift per store (4-week window)

$23M

Potential month 1 impact (500 stores)

Long-term gains through accelerated adoption
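
The headline figures follow from simple arithmetic; the 500-store rollout is the stated assumption:

    # Back-of-envelope check of the headline figures
    avg_lift_per_store = 46_000   # observed 4-week lift per store (USD)
    n_stores = 500                # assumed chain-wide rollout size
    print(f"${avg_lift_per_store * n_stores / 1e6:.0f}M")  # -> $23M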

Conclusion

Beyond showing which promotion performed best, this case study demonstrates how a structured workflow can scale from one dataset to broader business decisions. The real takeaway is the process itself: staying adaptable. Tools will evolve, datasets will change, and LLMs are reshaping how we analyze and communicate insights. What doesn't change is the discipline of asking the right questions, validating carefully, and translating results into clear actions that work across contexts.

Technical Implementation Process

This case study is a narrative demonstration of my data analysis process. Below is a more technical, step‑by‑step breakdown of how I would approach analysis when working in a professional team setting.

01

Problem Scoping

Meet stakeholders, translate business questions into measurable hypotheses, and define success metrics.

02

Data Discovery

Inventory available data sources (databases, logs, CSVs, APIs), assess schema, freshness, and access requirements.

03

Data Extraction

Write repeatable extraction queries or pipelines (SQL, Python scripts, or ETL jobs) to obtain raw data.
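
A sketch of what a repeatable extraction might look like; the database, table, and column names here are hypothetical placeholders, not a real schema:

    import sqlite3

    import pandas as pd

    # Table and column names are placeholders for illustration only.
    QUERY = """
    SELECT market_id, market_size, location_id, age_of_store,
           promotion, week, sales_in_thousands
    FROM weekly_promo_sales
    WHERE week BETWEEN 1 AND 4
    """

    with sqlite3.connect("warehouse.db") as conn:    # hypothetical warehouse
        raw = pd.read_sql(QUERY, conn)
    raw.to_csv("promo_sales_raw.csv", index=False)   # versioned raw extract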

04

Data Cleaning

Deduplicate, normalize categorical values, handle missingness (imputation or removal), and validate ranges/types.

05

Data Modeling

Create derived features, aggregate at the correct grain, and engineer variables needed for analysis or models.
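
For example, aggregating this dataset from the weekly grain to one row per store, plus a derived feature; column names follow the dataset above, and the growth feature is an illustrative assumption:

    # Aggregate from the weekly grain to one row per store
    store_level = (
        df.groupby(["MarketID", "LocationID", "Promotion"], observed=True)
          .agg(total_sales=("SalesInThousands", "sum"),
               avg_weekly_sales=("SalesInThousands", "mean"),
               market_size=("MarketSize", "first"),
               store_age=("AgeOfStore", "first"))
          .reset_index()
    )

    # Illustrative derived feature: week-1 to week-4 sales growth per store
    wk = df.pivot_table(index="LocationID", columns="Week",
                        values="SalesInThousands")
    store_level["growth"] = store_level["LocationID"].map((wk[4] - wk[1]) / wk[1])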

06

Exploratory Analysis

Visualize distributions, correlations, and outliers; compute summary statistics and sanity checks.

07

Formal Testing & Modeling

Choose appropriate statistical tests (t-test, ANOVA, chi-square), build predictive or causal models if needed, and validate assumptions.

08

Interpretation & Synthesis

Translate statistical results into business-relevant conclusions, quantify uncertainty (confidence intervals, effect sizes), and highlight limitations.

09

Actionable Recommendations

Prioritize interventions, design experiments or rollouts, and provide concrete next steps with expected impact.

10

Productionization

If required, operationalize models/pipelines (CI/CD, monitoring, scheduled jobs) and implement dashboards or reports for stakeholders.

11

Communication & Documentation

Produce clear documentation, reproducible notebooks/scripts, and a concise executive summary for decision-makers.