Business Intelligence – Solve a Critical Quality Problem

This is a continuation of Predictive Analytics – Solve a Critical Quality Problem. A biopharmaceutical manufacturing company was scrapping about 30% of its batches, which is very expensive. The company’s engineers tried to solve the problem with various techniques.

But it was not until they started using predictive analytics (also known as data mining) that they uncovered actionable process improvements. These improvements are predicted to lower the scrap rate from around 30% to 5%.

How were these improvements discovered?

The Data Mining Approach for Root Cause Analysis: Data mining is a broad term used in a variety of ways, alongside related terms such as “predictive modeling” or “advanced analytics.”

Here, it means the application of the latest data-driven analytics to build models of a phenomenon, such as a manufacturing process, based on historical data. In a nutshell, in the last 10-15 years, there has been a great leap forward in terms of the flexibility and ease of building models and the amount of data that can be utilized efficiently due to advances in computing hardware.

Data mining has changed the world of analytics

… in a good way.

… forever.

Companies that embrace these changes and learn to apply them will benefit.

Data mining begins with the definition and aggregation of the relevant data. In this case, it was the last 12 months of all the data from the manufacturing process, including:

  • raw materials characteristics
  • process parameters across the unit operation for each batch
  • product quality outcomes on the critical-to-quality responses on which they based their judgment about whether to release the batch or scrap it

Once the relevant data were gathered, StatSoft consultants sat down with the engineering team before we began the model building process. This is a critical step and one that you should consider as you adopt data mining.

We asked the engineers questions such as:

  • Which factors can you control, and which ones can you not control?
  • Which factors are easy to control, and which ones are difficult or expensive to control?

The rationale is that data mining is not an academic exercise when applied to manufacturing. It is being done to improve the process, and that requires action as the end result. A model that is accurate but based solely on parameters that are impossible or expensive to tweak is impractical (which is a nice way of saying “useless”).
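To make this concrete, here is a minimal sketch of how the engineers’ answers could be recorded and used to screen candidate predictors before modeling. The factor names, cost labels, and the `actionable` helper are all hypothetical and chosen only for illustration; they are not taken from the actual project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Factor:
    name: str
    controllable: bool
    adjustment_cost: str  # "low", "medium", or "high" (hypothetical scale)


# Hypothetical answers collected from the engineering team.
factors = [
    Factor("fermentation_temperature", True, "low"),
    Factor("buffer_ph", True, "low"),
    Factor("raw_material_lot_purity", False, "high"),
    Factor("ambient_humidity", False, "high"),
    Factor("filtration_pressure", True, "high"),
]

COST_ORDER = {"low": 0, "medium": 1, "high": 2}


def actionable(candidates, max_cost="medium"):
    """Return names of factors that can be controlled at or below max_cost."""
    limit = COST_ORDER[max_cost]
    return [
        f.name
        for f in candidates
        if f.controllable and COST_ORDER[f.adjustment_cost] <= limit
    ]


actionable_names = actionable(factors)
print(actionable_names)
```

Factors that fail the screen can still be kept in the model as context, but the improvement plan should rest on the ones that pass.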

With this information in hand, the next step in the data mining process is model building. In short, many data mining model types are applied to the data to determine which one gives the best goodness of fit, such as the smallest residuals between predicted and actual values.

Various methods are employed to ensure that the best models are selected. For example, a random hold-out sample of the historical data is set aside, and each model is asked to predict it. This helps protect against overfitting, where a model gets so good at predicting one set of historical data that it becomes really bad at predicting the outcomes for other batches.
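The idea of fitting several model types and judging them on a hold-out sample can be sketched in a few lines. This is an illustration only, not the tooling used in the project: the batch data are made up, the three model types (a mean baseline, a least squares line, and a nearest-neighbor "memorizer") are stand-ins for the much richer set a data mining package would try, and the fixed random seed is an assumption added for reproducibility.

```python
import math
import random

# Made-up history: one process parameter x and one quality outcome y per batch.
batches = [(float(i), 2.0 * i + ((i * 7) % 5 - 2) * 0.5) for i in range(40)]

# Random hold-out split: 10 batches are set aside and never used for fitting.
rng = random.Random(0)  # fixed seed so the split is reproducible
shuffled = batches[:]
rng.shuffle(shuffled)
holdout, train = shuffled[:10], shuffled[10:]


def fit_mean(data):
    """Baseline: always predict the average training outcome."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y


def fit_linear(data):
    """Least squares straight line through the training batches."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_nearest(data):
    """Memorizer: predict the outcome of the closest training batch."""
    return lambda x: min(data, key=lambda point: abs(point[0] - x))[1]


def rmse(model, data):
    """Root mean squared residual between predicted and actual values."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in data) / len(data))


candidates = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}
scores = {}
for name, fit in candidates.items():
    model = fit(train)
    scores[name] = (rmse(model, train), rmse(model, holdout))
    print(f"{name:8s} train RMSE={scores[name][0]:.2f}  hold-out RMSE={scores[name][1]:.2f}")

# Select on hold-out error, not training error.
best = min(scores, key=lambda name: scores[name][1])
print("selected:", best)
```

The memorizer reproduces every training batch exactly, so its training error is zero by construction. That is precisely why the selection is made on the hold-out column: a perfect fit to history says nothing about the next batch.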

A major advantage of data mining is that you don’t need to make assumptions ahead of time about the nature of the data and the nature of the relationships between the predictors and the responses. Traditional least squares linear modeling, such as what is taught in Six Sigma classes on the analytic tools, does require those assumptions up front.

For Root Cause Analysis, most data mining techniques provide importance plots or similar ways to see very quickly which raw materials and process parameters are the major predictors of the outcomes, and, as valuable, which factors don’t matter.
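As a rough illustration of what sits behind an importance plot, the sketch below scores each factor by how much of the outcome’s variance a single threshold split on that factor can remove. This is a simplified, hand-rolled stand-in for the importance measures that tree-based data mining tools report, not the method used in the project, and the batch data and factor names are made up.

```python
from statistics import pvariance

# Made-up batch history: three process factors and one quality outcome.
batches = []
for i in range(16):
    temperature = 30 + i % 8
    batches.append({
        "temperature": temperature,
        "ph": round(6.5 + 0.1 * (i % 3), 1),
        "stir_rate": 100 + 10 * (i % 5),
        # Quality drops sharply above 33 degrees, plus a small wiggle.
        "quality": (90 if temperature < 34 else 60) + i % 2,
    })


def split_importance(rows, factor, outcome="quality"):
    """Largest share of outcome variance removed by one split on factor."""
    total = pvariance([r[outcome] for r in rows])
    values = sorted({r[factor] for r in rows})
    best = 0.0
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r[outcome] for r in rows if r[factor] <= threshold]
        right = [r[outcome] for r in rows if r[factor] > threshold]
        within = (len(left) * pvariance(left) + len(right) * pvariance(right)) / len(rows)
        best = max(best, (total - within) / total)
    return best


importance = {f: split_importance(batches, f) for f in ("temperature", "ph", "stir_rate")}
ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)

# Text version of an importance plot.
for name, score in ranked:
    print(f"{name:12s} {'#' * round(score * 40):40s} {score:.3f}")
```

In this toy data set the temperature bar dominates and the other two are close to zero, which is the "these factors don’t matter" half of the picture.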

[Image: root cause analysis BI]

At this point in the data mining process, StatSoft consultants sat down with the engineering team to review the most important parameters. Typically, there is an active discussion with comments from the engineers such as:

  • That can’t be
  • I don’t see how that parameter would be relevant

The conversation gradually transforms over the course of an hour to:

  • I could see how those parameters could interact with the ones later in the process to impact the product quality

Data mining methods are really good at modeling large amounts of data from lots of parameters, a typical situation in manufacturing. Humans are good at thinking about a few factors at a time and interpreting a limited time window of data.

As shown above, the two approaches complement each other. Data mining supplies important insights about the manufacturing process, and the engineering team then evaluates, validates, and uses them to determine:

  • Now, what do we do to improve the process? What are the priorities?

The company then planned to implement process improvements that are predicted to lower the scrap rate of batches from ~30% to ~5%!

Note: To get from root cause analysis to process improvements, the models were used for optimization (another data mining technique).
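A minimal sketch of that optimization step, under loud assumptions: `predicted_quality` is a hypothetical formula standing in for a fitted model, the parameter names and ranges are invented, and a plain grid search replaces whatever optimizer a data mining package would actually use. Only the controllable parameters are searched; the uncontrollable one is held at an assumed typical value.

```python
from itertools import product


def predicted_quality(temperature, ph, raw_purity):
    """Hypothetical stand-in for a fitted model of batch quality."""
    return (
        95.0
        - 0.5 * (temperature - 33) ** 2
        - 40.0 * (ph - 6.8) ** 2
        + 0.2 * (raw_purity - 98.0)
    )


# Settings the engineers can actually change (hypothetical ranges).
controllable_grid = {
    "temperature": [30, 31, 32, 33, 34, 35, 36, 37],
    "ph": [6.5, 6.6, 6.7, 6.8, 6.9, 7.0],
}

# Not controllable: held at an assumed typical value during the search.
raw_purity = 97.5

# Exhaustive search over every combination of controllable settings.
best_settings = max(
    product(controllable_grid["temperature"], controllable_grid["ph"]),
    key=lambda s: predicted_quality(s[0], s[1], raw_purity),
)
best_value = predicted_quality(best_settings[0], best_settings[1], raw_purity)

print("best settings:", dict(zip(controllable_grid, best_settings)))
print("predicted quality:", round(best_value, 2))
```

The recommended settings are only as good as the model behind them, so they are a starting point for validation runs rather than a final answer.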

Next blog: Considerations for the Application of Data Mining.

Article was first published in Z Consulting’s Elevate Manufacturing newsletter for January 2011.


About statsoftsa

StatSoft, Inc. was founded in 1984 and is now one of the largest global providers of analytic software worldwide. StatSoft is also the largest manufacturer of enterprise-wide quality control and improvement software systems in the world, and the only company capable of supporting its QC products worldwide, with wholly owned subsidiaries in all major markets (StatSoft has 23 full-service offices, on all continents), and its software is available in more than 10 languages.

Posted on March 11, 2013.
