Turning Data Into
$23M Revenue
A structured data analysis workflow that identified the most effective promotional strategy for a fast-food chain's new menu item launch, delivering measurable business impact through statistical rigor.
This is a Kaggle dataset exploration intended to showcase my analysis process—from scoping and assumptions to EDA, modeling, and communication. It is not a client case study. All results are for demonstration and are fully reproducible in my GitHub repo.
Role
Data Analyst
Methodology
Ask, Prepare, Process, Analyze, Share, Act
Deliverables
Analysis Report, Visualizations, Recommendations
Tools
Python, Jupyter Notebooks, Pandas
My Data Analysis Process
I follow a structured 6-step methodology that ensures every analysis starts with the right questions and ends with actionable business recommendations. You can use the cards to navigate to each step.
Define clear business questions and success metrics
Gather and organize the necessary data sources
Clean and validate data for accurate analysis
Apply statistical methods to test hypotheses
Communicate findings clearly to stakeholders
Turn insights into actionable recommendations
01 · Ask — Define the Problem
To avoid assumptions and keep the project focused, I worked with stakeholders to define the problem through a series of guiding questions.
What are we measuring?
Total sales, average weekly sales, or sales growth across the four-week window.
Which promotion wins?
Identify whether Promotion 1, 2, or 3 generates the greatest impact on sales performance.
Is the difference meaningful?
Determine if observed sales differences are statistically significant or likely due to chance.
Do store factors matter?
Assess how market size or store age influence promotion performance.
What does success look like?
Clarify whether the goal is immediate revenue, long-term adoption, or consistency across markets.
What timeframe guides decisions?
Define the measurement window and how results should inform rollout timing.
By clarifying these questions up front, I aligned the analysis with the company's business objectives: to identify, with confidence, which promotion strategy would yield the highest sales and provide a reliable playbook for scaling the launch across all markets.
02 · Prepare — Gathering and Organizing the Data
With the business questions defined, the next step was to identify the data needed to answer them.
MarketID & LocationID
Unique identifiers for each store (used to join and deduplicate rows).
MarketSize
Categorical: Small, Medium, Large — captures local demand and demographics.
AgeOfStore
Store maturity in years (1–28); used to check lifecycle effects on promotion uptake.
Promotion / Week / SalesInThousands
Promotion label (1–3), week index (1–4), and weekly sales for the new item (USD, thousands).
Sample Data Structure
| MarketID | MarketSize | LocationID | AgeOfStore | Promotion | Week | Sales (k USD) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Medium | 1 | 4 | 3 | 1 | 33.73 |
| 1 | Medium | 1 | 4 | 3 | 2 | 35.67 |
| 1 | Medium | 1 | 4 | 3 | 3 | 29.03 |
| 1 | Medium | 1 | 4 | 3 | 4 | 39.25 |
| 1 | Medium | 2 | 5 | 2 | 1 | 27.81 |
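As a minimal sketch of this step, here is how the table above can be loaded and inspected with pandas. The file name and exact column casing are assumptions based on the schema described above.

```python
import pandas as pd

# Load the campaign data (file name is an assumption; adjust to match the repo).
df = pd.read_csv("WA_Marketing-Campaign.csv")

# Confirm the expected columns, types, and grain: one row per store per week.
print(df.dtypes)
print(df.head())

# Each store should appear exactly once per week of the four-week window.
dupes = df.duplicated(subset=["LocationID", "Week"]).sum()
print(f"Duplicate store-week rows: {dupes}")
```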
By preparing the data in this way, I created a reliable foundation for analysis. This step ensured that any differences in sales performance identified later would be attributed to the promotions themselves rather than inconsistencies in the dataset.
03 · Process — Cleaning and Validating the Data
Before diving into analysis, I confirmed the dataset was accurate, consistent, and ready to support valid results.
Validation Checklist
- No missing values across columns
- No duplicate rows detected
- Categoricals standardized (MarketSize, Promotion)
Key Fixes
No substantial cleaning was required. The dataset was well-structured and ready for analysis.
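For reference, a sketch of the validation pass behind this checklist, assuming the DataFrame `df` from the loading step above:

```python
# Missing values per column: expect all zeros.
print(df.isna().sum())

# Exact duplicate rows: expect zero.
print("Duplicate rows:", df.duplicated().sum())

# Standardize and verify the categorical levels.
df["MarketSize"] = df["MarketSize"].str.strip().str.title()
print(sorted(df["MarketSize"].unique()))  # expect: Large, Medium, Small
print(sorted(df["Promotion"].unique()))   # expect: 1, 2, 3

# Range checks against the documented schema.
assert df["AgeOfStore"].between(1, 28).all()
assert df["Week"].between(1, 4).all()
```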
Even though the dataset was relatively clean, this step is critical. Dirty data can lead to biased or misleading results, especially in statistical testing. By validating the structure upfront, I built confidence that any patterns or differences uncovered later would reflect real trends rather than errors in the data.
04 · Analyze — Exploring and Testing Hypotheses
With the data prepared, I moved into analysis. The first step was exploratory: visualizing sales trends across markets, promotions, and store characteristics.
Exploratory Analysis
Visualizations make complex data scannable, surfacing patterns, trends, and outliers at a glance. These charts reveal key insights about store characteristics and promotion performance.
Store Age Distribution
The majority of stores are relatively new, with a significant proportion falling within the first few years of operation (skewed right distribution).
Average Sales by Market Size
Small and Large markets show higher mean sales, suggesting potential market saturation or other factors influencing sales in Medium markets.
Average Promotion Performance
Sales performance trends across the promotional period, showing baseline patterns before detailed analysis.
Weekly Sales by Promotion
Promotion 2 consistently underperformed across the promotional period, while Promotions 1 and 3 showed stronger performance.
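A condensed sketch of how charts like these can be produced with seaborn and matplotlib; the styling and binning choices here are illustrative, not the exact settings used:

```python
import matplotlib.pyplot as plt
import seaborn as sns

fig, axes = plt.subplots(2, 2, figsize=(12, 8))

# Store age distribution (right-skewed: most stores are only a few years old).
sns.histplot(df["AgeOfStore"], bins=28, ax=axes[0, 0])
axes[0, 0].set_title("Store Age Distribution")

# Average sales by market size.
sns.barplot(data=df, x="MarketSize", y="SalesInThousands",
            order=["Small", "Medium", "Large"], ax=axes[0, 1])
axes[0, 1].set_title("Average Sales by Market Size")

# Average sales by promotion.
sns.barplot(data=df, x="Promotion", y="SalesInThousands", ax=axes[1, 0])
axes[1, 0].set_title("Average Promotion Performance")

# Mean weekly sales by promotion across the four-week window.
sns.lineplot(data=df, x="Week", y="SalesInThousands", hue="Promotion",
             ax=axes[1, 1])
axes[1, 1].set_title("Weekly Sales by Promotion")

plt.tight_layout()
plt.show()
```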
Statistical Testing
Visual patterns alone aren't enough — they could be due to random variation. To confirm whether the differences were real and meaningful, I applied several statistical tests.
Chi-Square Test
Purpose:
Test independence between market size and promotion assignment
Result:
p > 0.05 (no significant association)
Interpretation:
Promotions were fairly distributed across markets
ANOVA
Purpose:
Compare means across three promotion groups
Result:
F ≈ 21.95, p < 0.0001
Interpretation:
At least one promotion performed significantly differently
Tukey's HSD
Purpose:
Identify which specific promotions differ
Result:
P1 & P3 > P2 (significant)
Interpretation:
Promotions 1 and 3 both outperformed Promotion 2
The combination of ANOVA, Tukey's HSD, and Chi-Square testing provided statistical confidence that Promotions 1 and 3 consistently outperformed Promotion 2, with no significant difference between the top two performers.
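For the technically inclined, a condensed sketch of these three tests using scipy and statsmodels, assuming the DataFrame `df` from earlier (column names follow the schema above):

```python
import pandas as pd
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Chi-square: is promotion assignment independent of market size?
contingency = pd.crosstab(df["MarketSize"], df["Promotion"])
chi2, p_chi, dof, _ = stats.chi2_contingency(contingency)
print(f"Chi-square p-value: {p_chi:.4f}")  # > 0.05 suggests balanced assignment

# One-way ANOVA: do mean sales differ across the three promotions?
groups = [g["SalesInThousands"].values for _, g in df.groupby("Promotion")]
f_stat, p_anova = stats.f_oneway(*groups)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_anova:.2e}")

# Tukey's HSD: which specific promotion pairs differ?
tukey = pairwise_tukeyhsd(endog=df["SalesInThousands"],
                          groups=df["Promotion"], alpha=0.05)
print(tukey.summary())
```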
06 · Act — Turning Insights into Action
Analysis is only valuable if it drives decisions. With the findings in hand, I translated statistical evidence into clear recommendations and projected business impact.
Scale Promotions 1 & 3
Both significantly outperformed Promotion 2 with no meaningful difference between them; either campaign can be rolled out chain-wide with confidence.
Reevaluate Promotion 2
Given its consistent underperformance, redesign or discontinue Promotion 2. Conduct brief qualitative research to understand why it failed to resonate.
Targeted Testing of Variations
Run follow-up A/B tests on creative, messaging, and timing within Promotions 1 & 3 to optimize before full-scale rollout.
Monitor Performance Post-Launch
Track weekly sales lift, customer adoption, and repeat purchases; re-run statistical tests periodically to validate sustained performance.
Final Takeaway for Stakeholders
Promotions 1 and 3 are both proven winners. By scaling them and eliminating Promotion 2, the chain can unlock tens of millions in additional revenue while ensuring the new menu item launches with maximum impact.
Business Impact Estimates
Average lift per store (4-week window)
Potential month 1 impact (500 stores)
Long-term gains through accelerated adoption
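As a sketch of how estimates like these can be derived from the test data, here is an illustrative calculation; the 500-store rollout figure comes from the card above, and the logic is an assumption rather than the exact method behind the original numbers:

```python
# Average 4-week sales per store under each promotion (thousands of USD).
per_store = (df.groupby(["Promotion", "LocationID"])["SalesInThousands"]
               .sum()
               .groupby("Promotion")
               .mean())

# Lift of the winning promotions (1 and 3) over Promotion 2, per store.
winner_avg = per_store.loc[[1, 3]].mean()
lift_per_store = winner_avg - per_store.loc[2]
print(f"Average 4-week lift per store: ${lift_per_store:.1f}k")

# Extrapolate to a chain-wide rollout (500 stores, as noted above).
print(f"Estimated month-1 impact: ${lift_per_store * 500 / 1000:.1f}M")
```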
Conclusion
Beyond showing which promotion performed best, this case study demonstrates how a structured workflow can scale from one dataset to broader business decisions. The real takeaway is the process itself and the adaptability it requires. Tools will evolve, datasets will change, and LLMs are already reshaping how we analyze and communicate insights. What doesn't change is the discipline of asking the right questions, validating carefully, and translating results into clear actions that work across contexts.
Technical Implementation Process
This case study is a narrative demonstration of my data analysis process. Below is a more technical, step‑by‑step breakdown of how I would approach analysis when working in a professional team setting.
Problem Scoping
Meet stakeholders, translate business questions into measurable hypotheses, and define success metrics.
Data Discovery
Inventory available data sources (databases, logs, CSVs, APIs), assess schema, freshness, and access requirements.
Data Extraction
Write repeatable extraction queries or pipelines (SQL, Python scripts, or ETL jobs) to obtain raw data.
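As an illustration, a repeatable extraction step might look like the following; the connection string, schema, and column names are placeholders, not a real system:

```python
import pandas as pd
from sqlalchemy import create_engine

# Connection string, schema, and table are placeholders for illustration.
engine = create_engine("postgresql://user:password@host:5432/analytics")

QUERY = """
    SELECT market_id, location_id, market_size, age_of_store,
           promotion, week, sales_in_thousands
    FROM promotions.weekly_sales
    WHERE campaign = 'new_menu_item_launch'
"""

# Pull the raw data in one repeatable, auditable step and snapshot it.
raw = pd.read_sql(QUERY, engine)
raw.to_parquet("data/raw/weekly_sales.parquet", index=False)
```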
Data Cleaning
Deduplicate, normalize categorical values, handle missingness (imputation or removal), and validate ranges/types.
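A sketch of a cleaning pass in that spirit, assuming the hypothetical `raw` frame from the extraction sketch above:

```python
# Remove exact duplicates from the extracted frame.
clean = raw.drop_duplicates()

# Normalize categorical labels (e.g., "large" vs "Large").
clean["market_size"] = clean["market_size"].str.strip().str.title()

# Handle missingness: impute a numeric gap with the median, and drop rows
# that are missing the outcome variable entirely.
clean["age_of_store"] = clean["age_of_store"].fillna(clean["age_of_store"].median())
clean = clean.dropna(subset=["sales_in_thousands"])

# Validate ranges and types before any aggregation.
assert clean["week"].between(1, 4).all()
clean["promotion"] = clean["promotion"].astype(int)
```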
Data Modeling
Create derived features, aggregate at the correct grain, and engineer variables needed for analysis or models.
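For example, aggregating from the store-week grain to one row per store and adding a simple derived feature; the names are hypothetical and build on the cleaned frame above:

```python
import pandas as pd

# Aggregate from the store-week grain to one row per store for modeling.
store_level = (clean.groupby(["location_id", "market_size",
                              "age_of_store", "promotion"], as_index=False)
                    .agg(total_sales=("sales_in_thousands", "sum"),
                         mean_weekly_sales=("sales_in_thousands", "mean")))

# Example derived feature: a coarse store-maturity bucket.
store_level["store_maturity"] = pd.cut(store_level["age_of_store"],
                                       bins=[0, 3, 10, 30],
                                       labels=["new", "established", "mature"])
```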
Exploratory Analysis
Visualize distributions, correlations, and outliers; compute summary statistics and sanity checks.
Formal Testing & Modeling
Choose appropriate statistical tests (t-test, ANOVA, chi-square), build predictive or causal models if needed, and validate assumptions.
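Alongside the tests themselves, I also check their assumptions; a brief sketch using the hypothetical store-level frame above:

```python
from scipy import stats

groups = [g["total_sales"].values
          for _, g in store_level.groupby("promotion")]

# Levene's test: homogeneity of variance across promotion groups.
print("Levene p-value:", stats.levene(*groups).pvalue)

# Shapiro-Wilk: approximate normality within each group.
for promo, g in store_level.groupby("promotion"):
    print(promo, "Shapiro p-value:", stats.shapiro(g["total_sales"]).pvalue)
```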
Interpretation & Synthesis
Translate statistical results into business-relevant conclusions, quantify uncertainty (confidence intervals, effect sizes), and highlight limitations.
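For instance, quantifying uncertainty with a confidence interval and an effect size (Cohen's d) for a single pairwise comparison, again using the hypothetical store-level frame; the confidence_interval call assumes SciPy 1.10 or newer:

```python
import numpy as np
from scipy import stats

p1 = store_level.loc[store_level["promotion"] == 1, "total_sales"]
p2 = store_level.loc[store_level["promotion"] == 2, "total_sales"]

# 95% confidence interval for the difference in means (Welch's t-test;
# the confidence_interval method requires SciPy >= 1.10).
res = stats.ttest_ind(p1, p2, equal_var=False)
ci = res.confidence_interval(confidence_level=0.95)
print(f"Mean difference 95% CI: ({ci.low:.1f}, {ci.high:.1f})")

# Effect size: Cohen's d with a pooled standard deviation.
pooled_sd = np.sqrt((p1.var(ddof=1) + p2.var(ddof=1)) / 2)
d = (p1.mean() - p2.mean()) / pooled_sd
print(f"Cohen's d: {d:.2f}")
```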
Actionable Recommendations
Prioritize interventions, design experiments or rollouts, and provide concrete next steps with expected impact.
Productionization
If required, operationalize models/pipelines (CI/CD, monitoring, scheduled jobs) and implement dashboards or reports for stakeholders.
Communication & Documentation
Produce clear documentation, reproducible notebooks/scripts, and a concise executive summary for decision-makers.