Data Collection Plan: Learn to Create It In 8 Steps

In the DMAIC framework of the Six Sigma method, a Data Collection Plan is created during the Measure phase. A project manager with Six Sigma Green Belt training will recognize it as a useful tool for focusing data collection efforts. It is a detailed document that describes the exact steps, and their sequence, to be followed in gathering the data for a given Six Sigma project.

Green Belt training is a good way to learn how a Data Collection Plan fits into the DMAIC process beyond what this article covers.

A Data Collection Plan ensures that everyone working on a Lean Six Sigma project is on the same page with regard to the data plan. It also ensures that this information reaches the stakeholders in the organization who are going to help us with our data needs. The purpose of the plan is to make sure that the data collected are meaningful and valid, and that all relevant data, both the Xs and the Ys, are collected concurrently.

We need a Data Collection Plan because we want to be efficient and not waste resources collecting data that are irrelevant to the project or unusable. By creating a Data Collection Plan, we can focus our efforts on answering specific questions that have business value. This directed approach helps us avoid locating and measuring data just for the sake of doing so.

How to Create a Six Sigma Data Collection Plan

1) Identify the questions that you want to answer

The first step in creating a Data Collection Plan is to identify the questions we want to answer. Our data must be relevant to the project. The entire reason to have a DMAIC project is to improve a process. Hence, these questions should be centered on what the reality of our process is under the current state of affairs or status quo. The best practice is to use the SIPOC diagram as a guide for data collection. We also need to figure out the type of measurements or metrics we want to include.

2) Determine the kind of data that is available

The second step in creating a Data Collection Plan is to find out what kind of data is available to collect. What data exists that can give us all the required answers? Sometimes, a particular piece of data can give us multiple answers. Make sure that you make a list of all of the data points that are needed to answer the questions the project is centered on.

3) Determine how much data is needed

The third step in creating a Data Collection Plan is to decide how much data we need. We want to get enough data so that we can see patterns and trends. For each data element on the list, write down how much data is actually needed.

4) Determine how to measure the data

The fourth step in creating a data collection plan is to see how we are going to measure the data. As we all know, data can be measured in different ways: check sheets, survey answers, etc. The way we measure will be dependent upon the type of data we seek.
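As a minimal sketch of the check sheet idea mentioned above, the tally below counts how often each category is observed. The defect names and observations are hypothetical and only illustrate the mechanics; a real check sheet would use the categories fixed in the plan.

```python
from collections import Counter

# Hypothetical observations recorded during one shift (illustrative data only).
observations = ["scratch", "dent", "scratch", "misalignment", "scratch", "dent"]


def tally_check_sheet(events):
    """Count how often each category was observed, most frequent first."""
    return Counter(events).most_common()


check_sheet = tally_check_sheet(observations)

# Print one tally row per category, similar to marks on a paper check sheet.
for category, count in check_sheet:
    print(f"{category:<14}{'|' * count} ({count})")
```

The same tally can later feed a Pareto analysis, since it is already sorted by frequency.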

5) Decide who is going to gather data

The fifth step in creating a data collection plan is to decide who is going to collect the data. Nowadays, the data can also be collected through automated software. We may be required to liaise with the person in charge of the software to ensure the data is available and in the correct format.

6) Determine where the data will be collected from

The sixth step is to decide where to collect the data from, that is, the location and/or source of the data. Location here does not mean a physical location. It is the location within the process. The Data Collection Plan must explicitly specify where in the process the data must be collected.

7) Decide whether to measure a sample or the whole population

The seventh step is to decide whether to sample the data or not. Sometimes it is impractical to measure an entire population of data. In such a case, we take a sample instead. The question that the project team needs to look into could be: What should our sampling method and sample size be to make statistically sound judgments?
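One common way to answer the sample size question for a continuous measurement is the textbook formula n = (z·σ / E)², where σ is an estimate of the process standard deviation and E is the acceptable margin of error. The sketch below assumes σ is known from a pilot study and that simple random sampling is appropriate; the function name, the numbers, and the population of 500 units are all hypothetical.

```python
import math
import random
from statistics import NormalDist


def required_sample_size(sigma, margin_of_error, confidence=0.95):
    """Sample size needed to estimate a mean within +/- margin_of_error.

    Assumes sigma (the standard deviation) is known or estimated from a
    pilot study, and that the normal approximation is acceptable.
    """
    if sigma <= 0 or margin_of_error <= 0:
        raise ValueError("sigma and margin_of_error must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin_of_error) ** 2)


# Hypothetical inputs: sigma of 2.0 units, margin of error of 0.5 units.
n = required_sample_size(sigma=2.0, margin_of_error=0.5)

# Simple random sample of unit IDs from a hypothetical population of 500 units.
rng = random.Random(42)  # fixed seed so the draw is repeatable
sample_ids = rng.sample(range(1, 501), n)
print(n, "units to measure")
```

Other sampling methods (systematic, stratified) need different selection logic, so the method chosen should be written into the plan alongside the sample size.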

8) Determine in what format the data will be displayed

The eighth step is to decide the format for displaying the data. We can display data in many ways, such as Pareto diagrams, scatter diagrams, etc.
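To illustrate what sits behind a Pareto diagram, the sketch below sorts categories by count and computes the cumulative percentage that the diagram's line would show. The defect counts are hypothetical, and plotting is left out so the example stays self-contained.

```python
def pareto_table(counts):
    """Return (category, count, cumulative %) rows, largest category first."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("counts must contain at least one observation")
    rows = []
    cumulative = 0
    for category, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        cumulative += count
        rows.append((category, count, round(100 * cumulative / total, 1)))
    return rows


# Hypothetical defect counts (illustrative data only).
defects = {"dent": 30, "scratch": 50, "other": 5, "misalignment": 15}

table = pareto_table(defects)
for category, count, cum_pct in table:
    print(f"{category:<14}{count:>4}{cum_pct:>8}%")
```

The bars of the diagram come from the counts and the line from the cumulative percentages, which makes it easy to see which few categories account for most of the total.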

A Sample Plan Template

The typical components of a Data Collection Plan are described below.

The first thing to clarify before any data is collected is the purpose of the collection. The most common purposes are to find out whether a process is stable and whether it is capable.
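For the capability question, a commonly used summary is the pair of indices Cp = (USL − LSL) / 6σ and Cpk = min(USL − μ, μ − LSL) / 3σ, where USL and LSL are the upper and lower specification limits. The sketch below is a simplification: it assumes the process is already stable, uses the overall sample standard deviation as the estimate of σ, and works on hypothetical measurements and limits.

```python
from statistics import mean, stdev


def capability_indices(measurements, lsl, usl):
    """Return (Cp, Cpk) for a list of measurements and specification limits.

    Simplification: sigma is estimated with the overall sample standard
    deviation, and the process is assumed to be stable.
    """
    if usl <= lsl:
        raise ValueError("usl must be greater than lsl")
    mu = mean(measurements)
    sigma = stdev(measurements)
    if sigma == 0:
        raise ValueError("measurements show no variation")
    cp = (usl - lsl) / (6 * sigma)
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)
    return cp, cpk


# Hypothetical measurements and specification limits (illustrative only).
data = [9.8, 10.0, 10.2, 10.0, 9.9, 10.1]
cp, cpk = capability_indices(data, lsl=9.4, usl=10.6)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```

Stability is usually judged separately with a control chart, and capability indices are only meaningful once the process is stable.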

Data Collection Plan/Matrix Introduction

Have a look at the example of a Data Collection Plan or Matrix in the figure below. Let’s try to understand the different elements of the Data Collection Plan here. First, we start with capturing the process name. The name of the process owner is filled in, accompanied by their contact information, location and area. The stakeholder who prepares the Data Collection Plan has to sign it off, and then the first-level, second-level, and third-level authorized stakeholders sign off on the Data Collection Plan. If the Data Collection Plan runs to multiple pages, you can add the page number. The document number and revision data need to be mentioned for audit purposes.

Data Collection Plan


  • Process Steps: We have now reached the main section of the Data Collection Plan. The first column of the Data Collection Plan captures all process steps for which we plan to collect data. Remember that we are currently in the Six Sigma Measure phase. The objective of the Measure phase is to assess the status quo of a business problem and/or process improvement opportunity with the help of data collection and analysis. In Lean Six Sigma projects, the business problem and/or process improvement opportunity typically relates to the process under consideration. That is why the Data Collection Plan requires us to note the relevant process steps at the beginning.
  • Critical to Quality: The second column of the Data Collection Plan covers two types of CTQs (Critical to Quality). The first is Key Process Input Variables (KPIV) and the second is Key Process Output Variables (KPOV). A KPIV is a process input that has a significant impact on the output variation of a process or a system, or on the Key Process Output Variable (KPOV) of a product. This means that the KPOV is determined by the KPIV. If the KPIV is held constant, it yields a predictable and consistent output. Let’s consider the simple process of preparing a cup of tea. The ingredients for a cup of tea are hot water, sugar, milk, etc. The quality and quantity of each of these ingredients are the KPIVs. The KPOV here would be the taste of the tea and/or customer satisfaction after drinking that cup of tea. The CTQ column notes the KPIVs and KPOVs applicable to the respective process step.
  • Metrics and data types: The third column captures all metrics applicable to the respective process step. The fourth column indicates the data type of the data points collected, or to be collected, for that process step.
  • Operational Definition: The fifth column records an Operational Definition for each measurement or metric. An Operational Definition is a document that details exactly how a specific metric will be measured.
  • Specification limits: The sixth column helps us make a note of specification limits for each process step. The quantified specification limits, in terms of upper and lower specification limits (USL and LSL), are noted in the Data Collection Plan for all relevant process steps.
  • The method of data collection: The seventh column records the method of collecting and measuring the different pieces of data. We may also need to state the Unit of Measurement (UOM) for each data type in the relevant process steps.

  • Sample size and frequency: The eighth column records the sample size for collecting different data points. Sample data are meant to represent population data. For that to hold in practice, we have to choose the right sampling method and then the right sample size. If it is practical to collect data on the whole population, we should do so. If we plan to use only sample data, then we must determine both the sampling method and the sample size for each kind of data needed. In such a case, we might need to sort and even prioritize our data collection requirements. The ninth column of the Data Collection Plan records the frequency of data collection, for instance daily, weekly, bi-monthly, or monthly.
  • Process step deliverables: The tenth column has to mention the deliverables for each process step under consideration. It is also important to know who owns those deliverables.
  • Source and location of data: The eleventh column of the Plan records the source for the availability of data accompanied by the location of data. The location of data does not refer to the physical location. It mainly refers to the location within the process. The Data Collection Plan must explicitly specify where in the process data must be collected from.
  • Reporting format: The twelfth column of the Plan covers the method or format of reporting to internal and/or external stakeholders. The Data Collection Plan explains the format in which the collected data needs to be displayed to them. A graphical method is usually chosen because it is easier to interpret.
  • Standard Operating Procedure: The final column of the Plan incorporates the Standard Operating Procedure (SOP) document for everybody’s reference.
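The columns described above can be sketched as a simple record type, which makes it easy to check that no column is left blank before data collection starts. This is only one possible representation: the class name, field names, and the tea-making values are hypothetical, and a spreadsheet with the same columns would serve equally well.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class PlanRow:
    """One row of a Data Collection Plan (field names are illustrative)."""
    process_step: str
    ctq: str                      # KPIV or KPOV for this step
    metric: str
    data_type: str
    operational_definition: str
    lsl: Optional[float]          # lower specification limit, if any
    usl: Optional[float]          # upper specification limit, if any
    collection_method: str
    unit_of_measurement: str
    sample_size: int
    frequency: str
    deliverable: str
    source_and_location: str
    reporting_format: str
    sop_reference: str


def missing_columns(row):
    """Return the names of text columns that were left blank."""
    return [f.name for f in fields(row) if getattr(row, f.name) == ""]


# Hypothetical row based on the cup-of-tea example (values are made up).
row = PlanRow(
    process_step="Boil water",
    ctq="KPIV: water temperature",
    metric="Water temperature at pour",
    data_type="continuous",
    operational_definition="Thermometer reading taken within 5 seconds of pouring",
    lsl=90.0,
    usl=100.0,
    collection_method="Manual check sheet",
    unit_of_measurement="deg C",
    sample_size=30,
    frequency="daily",
    deliverable="",
    source_and_location="Kettle, before steeping",
    reporting_format="Run chart",
    sop_reference="",
)

print("Columns still to fill in:", missing_columns(row))
```

A check like this gives stakeholders a concrete list of gaps to resolve when they review the plan.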

A solid Data Collection Plan will help Lean Six Sigma teams to collect data in the Measure phase of the DMAIC cycle with accuracy, precision, and transparency. With a Data Collection Plan, all the stakeholders will be informed and there will be an opportunity to question some of the Data Collection Plan ideas before the actual data collection begins. Creating a Data Collection Plan avoids teams just jumping in and collecting data at random. A Data Collection Plan is a structured way of stipulating exactly how the project’s data will be collected.