Walmart Sales Forecasting


Overview

This project focuses on predicting Walmart's weekly sales using machine learning models. By leveraging historical data and exploring key features such as seasonality, promotions, and economic indicators, the project aims to improve inventory management, optimize supply chains, and support data-driven promotional strategies.


Dataset

The dataset covers weekly sales across multiple Walmart stores, with over 400,000 training records. Key files:

- features.csv: Economic and holiday-related indicators.

- train.csv: Historical sales records for training the models.

- stores.csv: Metadata about store types and sizes.

Challenges: seasonality effects, missing values, and highly imbalanced data.
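
As a rough illustration of how the three files might be combined into one modeling table, here is a minimal sketch using pandas. The column names (`Store`, `Date`, `IsHoliday`) and the helper name `merge_walmart_frames` are assumptions for illustration; they are not confirmed by this README, so adjust them to match the actual files.

```python
import pandas as pd


def merge_walmart_frames(train, features, stores):
    """Join sales rows with weekly indicators and store metadata.

    Assumed (hypothetical) columns: train and features share
    Store, Date, and IsHoliday; stores is keyed by Store.
    """
    # Left joins keep every sales record, even when an indicator is missing.
    merged = train.merge(features, on=["Store", "Date", "IsHoliday"], how="left")
    merged = merged.merge(stores, on="Store", how="left")
    # Parse dates so week, month, and year features can be derived later.
    merged["Date"] = pd.to_datetime(merged["Date"])
    return merged
```

Calling it as `merge_walmart_frames(pd.read_csv("train.csv"), pd.read_csv("features.csv"), pd.read_csv("stores.csv"))` should return one row per training record, provided the key columns match the assumptions above.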


Methodology

- Data Preprocessing: Handled missing values, engineered features such as holiday-week flags, and aggregated sales by store and department.

- Models: Trained Linear Regression, Random Forest, and XGBoost models to compare performance across models of increasing complexity.

- Metrics: Used R², RMSE, and MAE to evaluate the models’ predictive capabilities.
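
The model comparison described above can be sketched roughly as follows. This is not the project's actual training code: the function names, hyperparameters, and the optional XGBoost import are all assumptions chosen to keep the example small and runnable. The three metrics are computed directly with NumPy so their definitions are visible.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression


def regression_metrics(y_true, y_pred):
    """Return R², RMSE, and MAE for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "r2": 1.0 - ss_res / ss_tot,  # undefined if y_true is constant
        "rmse": float(np.sqrt(np.mean(residuals ** 2))),
        "mae": float(np.mean(np.abs(residuals))),
    }


def compare_models(X_train, y_train, X_test, y_test, seed=0):
    """Fit each candidate model and score it on the held-out split."""
    # Hyperparameters here are illustrative, not the project's settings.
    models = {
        "linear_regression": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=50, random_state=seed),
    }
    try:
        # XGBoost is treated as optional so the sketch runs without it.
        from xgboost import XGBRegressor
        models["xgboost"] = XGBRegressor(n_estimators=200, random_state=seed)
    except ImportError:
        pass

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = regression_metrics(y_test, model.predict(X_test))
    return scores
```

For weekly sales data, a chronological train/test split (earlier weeks for training, later weeks for testing) is usually a safer choice than a random split, since it avoids scoring the model on weeks that precede its training data.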


Key Findings

- XGBoost: Achieved the best results with R² = 0.949 and RMSE = 5,161.36, capturing complex patterns and interactions.

- Insights: Holidays significantly impact sales, with seasonal peaks during Thanksgiving and Christmas.

- Impact: Improved sales forecasting enables better inventory planning and reduces stockouts and overstocking.
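
One simple way to check the holiday effect noted above is to compare average weekly sales in holiday and non-holiday weeks. The sketch below assumes the data has `IsHoliday` and `Weekly_Sales` columns; both names, and the helper `holiday_lift`, are hypothetical.

```python
import pandas as pd


def holiday_lift(df):
    """Ratio of mean weekly sales in holiday weeks to non-holiday weeks."""
    is_holiday = df["IsHoliday"].astype(bool)
    holiday_mean = df.loc[is_holiday, "Weekly_Sales"].mean()
    regular_mean = df.loc[~is_holiday, "Weekly_Sales"].mean()
    # A value above 1.0 means holiday weeks sell more on average.
    return holiday_mean / regular_mean
```

A ratio like this is only a coarse summary; it does not separate Thanksgiving from Christmas or control for store size.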

Contact Me

Let's chat! Your data, my brain - together we can be unstoppable.