Lean Six Sigma with Python — Chi-Squared Test

Lean Six Sigma with Python — Chi-Squared Test for Driver Allocation Problem
Solve a Driver Allocation Problem with Chi-Squared Test — (Image by Author)

Lean Six Sigma is a method that can be defined as a stepwise approach to process improvements.

In a previous article, we used the Kruskal-Wallis Test to verify the hypothesis that a specific training positively impacts operators' Inbound VAS productivity. (Link)

In this article, we will implement the Chi-Squared Test with Python to understand if transportation delays are due to a bad allocation of drivers.

I. Problem Statement
Are transportation delays due to driver allocation issues?
II. Data Analysis
1. Exploratory Data Analysis
Analyse with Python a sample of data from historical records
2. Perform Cross Tabulation
Summarise the relationship between several categorical variables
3. Pearson’s Chi-Squared Test
Validate that your results are significant and not due to random fluctuation
III. Conclusion

I. Problem Statement

1. Scenario

You are the Inbound Transportation Manager of a small factory in the United States.

Your transportation network is simple, with two routes:

  • Route 1: coming from your northern regional hub (with difficult road conditions and busy traffic)
  • Route 2: coming from your southern regional hub (with no traffic and a beautiful modern road)

Transportation is managed by an external service provider with a fleet of three trucks (with three different drivers: D1, D2, D3).

Replenishment order process from the request of the factory to driver allocation — (Image by Author)

Replenishment Process

  1. The Factory sends a replenishment order to your ERP
  2. The Southern regional hub receives the order first
  3. If the stock in the southern hub is too low then the order is transferred to the northern hub
  4. ERP sends a pick-up request to the transportation service provider (From Selected Hub to Factory)
  5. The first driver to accept the request delivers the raw materials to the factory

P.S: As a customer, we do not have any visibility on the process of driver allocation.

When an order is allocated to the northern regional hub, the lead time to get the request accepted is 35% higher than for the southern hub.

Are some drivers avoiding the northern route as much as possible?

We have analyzed the shipments of the last 18 months to build a sample of 269 records.

II. Data Analysis

1. Exploratory Data Analysis

Stacked Bar Charts — (Image by Author)

2. Perform Cross Tabulation

A cross-tabulation of the data can provide some insights and help us discover a potential pattern in how shipments are distributed across drivers and hubs.
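As a minimal sketch of this step, the snippet below builds a cross-tabulation with pandas. The records, the column names `driver` and `hub`, and the counts are hypothetical placeholders for illustration; they are not the 269-record sample used in this article.

```python
import pandas as pd

# Hypothetical shipment records (NOT the article's dataset).
# Each row is one shipment: which driver took it and from which hub.
records = pd.DataFrame(
    {
        "driver": ["D1", "D1", "D1", "D2", "D2", "D3", "D3", "D3"],
        "hub": ["SOUTH", "SOUTH", "NORTH", "SOUTH", "NORTH",
                "SOUTH", "SOUTH", "NORTH"],
    }
)

# Count shipments for each (driver, hub) pair: rows are drivers, columns are hubs.
contingency = pd.crosstab(records["driver"], records["hub"])
print(contingency)
```

With real data, `records` would be loaded from the historical shipment file, one row per shipment.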

Split of shipments by HUB for each driver
Split of shipments (%) per Driver for each HUB

82.65 % of shipments handled by Driver 1 are from SOUTH HUB

Split of shipments (%) per HUB for each Driver

38.89 % of shipments from SOUTH HUB are handled by Driver 1
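The two percentage views above differ only in the total used as the denominator. The sketch below shows both normalisations on a hypothetical count table (the numbers are illustrative assumptions, not the article's figures): dividing by row totals gives the split across hubs for each driver, and dividing by column totals gives the split across drivers for each hub.

```python
import pandas as pd

# Hypothetical shipment counts per driver and hub (NOT the article's data).
counts = pd.DataFrame(
    {"NORTH": [10, 20, 10], "SOUTH": [40, 30, 30]},
    index=pd.Index(["D1", "D2", "D3"], name="driver"),
)

# Share of each hub within a driver's shipments: every row sums to 100.
pct_by_driver = counts.div(counts.sum(axis=1), axis=0) * 100

# Share of each driver within a hub's shipments: every column sums to 100.
pct_by_hub = counts.div(counts.sum(axis=0), axis=1) * 100

print(pct_by_driver.round(2))
print(pct_by_hub.round(2))
```

In this toy table, 80% of the shipments handled by D1 come from SOUTH, while D1 handles 40% of all SOUTH shipments. The same table therefore yields two different percentages depending on the question asked.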

Menu: Stats > Tables > Cross Tabulation and Chi-Square

3. Pearson’s Chi-Squared Test

The first table is also called a contingency table. It is used in statistics to summarise the relationship between several categorical variables.

We’ll use the Chi-Squared Test to compute a p-value and determine whether the relationship between the two variables (driver and hub) is statistically significant.
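One way to run this test in Python is `scipy.stats.chi2_contingency`, which takes a table of observed counts and returns the test statistic, the p-value, the degrees of freedom and the expected counts under independence. The sketch below uses a hypothetical 3x2 table (three drivers, two hubs); the counts are assumptions for illustration and will not reproduce the article's p-value.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical observed counts (NOT the article's data).
# Rows: drivers D1, D2, D3. Columns: NORTH hub, SOUTH hub.
observed = np.array(
    [
        [10, 40],
        [20, 30],
        [10, 30],
    ]
)

# H0: driver and hub are independent.
chi2, p_value, dof, expected = chi2_contingency(observed)

print(f"chi2 = {chi2:.3f}, dof = {dof}, p-value = {p_value:.3f}")
print("Expected counts under independence:")
print(expected.round(2))
```

For a table with r rows and c columns, the degrees of freedom are (r - 1) x (c - 1), so 2 here. Each expected count is the row total times the column total divided by the grand total.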

The p-value is 0.410.

Because the p-value is greater than 0.05, we cannot reject the null hypothesis of independence: there is no statistically significant evidence that driver allocation is linked to the hub.
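The decision rule applied here can be written as a small helper. This is only a sketch: the function name `interpret` and the wording of the messages are illustrative, while the 0.05 threshold and the p-value of 0.410 come from the text above.

```python
ALPHA = 0.05  # significance level used in this article


def interpret(p_value: float, alpha: float = ALPHA) -> str:
    """Return the test decision for a chi-squared p-value."""
    if p_value < alpha:
        return "reject H0: driver and hub appear to be associated"
    return "fail to reject H0: no significant evidence of association"


decision = interpret(0.410)  # p-value reported in the article
print(decision)
```

Note that failing to reject the null hypothesis does not prove that driver and hub are independent; it only means that this sample gives no significant evidence of a link.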


III. Conclusion


This analysis does not support our initial feeling that some drivers deliberately avoid the northern hub: the data show no significant link between driver and hub.

Therefore, we need to perform a deeper root cause analysis to understand why we have a longer lead time to find a driver for replenishment from this hub.

Samir SACI — Data Science for Supply Chain Portfolio

Please feel free to contact me. I am happy to share and exchange ideas on topics related to Data Science and Supply Chain.
My Portfolio: https://samirsaci.com


[1] Pearson’s Chi-Squared Test, geeks for geeks, link
[2] Scheduling of Luxury Goods Final Assembly Lines with Python, Samir Saci, link
[3] Lean Six Sigma Data Analytics with Python — Kruskal Wallis Test, Samir Saci, link


