p-charts Don’t Work for Arbitrary Per Cent Data
A colleague shared a chart of patient responses to a question about patient experience:
The dashed red lines are supposed to be control limits, based on p-chart calculations; arithmetic details are in the notes at the end of this post.
If you believe the control limits are correctly derived, there are six months in an “out of control” state, which might kick off detective work to verify causes. The points above the upper control limit in January and February 2016, combined with the points below the lower control limit in 2015, might be interpreted as a strong signal of success, as higher per cents correspond to better patient experience.
Do these control limits provide a useful guide to people trying to make sense of the patient experience data?
At best, the control limits suggest a stronger signal of improvement than seems warranted and so may mislead managers and staff in their efforts to improve patient experience.
A Core Issue: Not All Data that are Per Cents Should be Tracked or Analyzed Using a p-chart
p-charts use the binomial probability model to estimate control limits. The limits depend on the counts of observations and an estimate of the probability that patients will respond in one of two ways.
We need to believe that four conditions hold before we can apply the binomial probability model and develop a p-chart in our example:
1. We can count individual patients.
2. We can classify the response of each patient as one of two distinct answers (in our example, “Always” or “Not always”).
3. Each month, the probability of answering “Always” must be the same for each patient.
4. Each month, the response of one patient as “Always” does not affect the likelihood that any other patient will or will not answer “Always”.
As Wheeler (1995) points out, conditions (1) and (2) are typically easy to check; conditions (3) and (4) are usually more difficult to verify in practice. While conditions (1) and (2) appear to hold in the patient experience example, it is not clear that we can, in fact, verify conditions (3) and (4).
It’s easy to think up ways to violate condition (3). A mixture of processes with different properties in any specific month would do it: for example, different work units could treat patients in ways that deliver different service experiences. Or different groups of patients may have different tendencies to answer “Always” to the experience question.
Unless all four conditions hold, control limits based on the binomial probability model fail to give a sound basis to distinguish signals of special (assignable) causes from common causes of variation. We’re likely to waste time or start believing in causal relations that don’t hold.
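To make the risk concrete, here is a minimal simulation sketch in Python. Everything in it is made up for illustration: two hypothetical work units with “Always” probabilities of 0.7 and 0.4, 700 patients a month, and a patient mix between the units that drifts from month to month. It compares the month-to-month spread of the simulated per cents with the spread the binomial model would predict.

```python
import math
import random
import statistics

def simulate_monthly_fractions(months=24, n=700, p_a=0.7, p_b=0.4, seed=1):
    """Simulate monthly fractions of "Always" answers for a hypothetical mixture.

    Assumption (not from the post): each patient is seen by unit A or unit B,
    the units have different "Always" probabilities, and unit A's share of
    patients changes from month to month.
    """
    rng = random.Random(seed)
    fractions = []
    for _ in range(months):
        share_a = 0.2 + 0.6 * rng.random()  # unit A's share this month, 20% to 80%
        always = 0
        for _ in range(n):
            p = p_a if rng.random() < share_a else p_b
            if rng.random() < p:
                always += 1
        fractions.append(always / n)
    return fractions

n_per_month = 700
fractions = simulate_monthly_fractions(n=n_per_month)
p_bar = sum(fractions) / len(fractions)

# Spread the binomial model expects if every patient shared one probability.
binomial_sigma = math.sqrt(p_bar * (1 - p_bar) / n_per_month)
# Spread actually seen across the simulated months.
observed_sigma = statistics.stdev(fractions)

print(f"binomial sigma: {binomial_sigma:.4f}")
print(f"observed sigma: {observed_sigma:.4f}")
```

In this invented scenario the observed spread is well above the binomial figure, so p-chart limits would sit too close to the center line and flag ordinary months as signals. The drifting mix is what drives the extra spread here; other violations of conditions (3) and (4) can behave differently.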
What is an Appropriate Analysis of these Patient Experience Data?
“If you cannot be confident about these prerequisite assumptions for the use of a binomial model, then you should avoid the use of theoretical control limits based on the binomial model…In contrast to the theoretical limits of the np-chart and the p-chart, one may still obtain empirical limits by directly measuring the dispersion of the data and constructing limits accordingly. The correct way of doing this with count data is to use an XmR [individuals-moving range] chart. The only restriction on the use of this chart with count data is the restriction … that the average count per ‘sample’ will exceed 1.0.” (Wheeler, p. 259).
In the next plot, I’ve added horizontal dashed lines that show the control limits for an individuals control chart. The estimate of σ is the average of the point-to-point absolute differences (the moving ranges) divided by 1.128. The control limits are the average plus and minus three times this estimate (Wheeler, p. 108).
On the chart, the individuals control chart limits are centered on the average per cent (weighted by numbers of observations) used in the first chart, rather than the typical unweighted average.
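The individuals chart arithmetic can be sketched in a few lines of Python. This is an illustration, not the R code behind the plots: the function name `xmr_limits`, the optional `center` argument (to allow a weighted average as the center line), and the six monthly per cents are all made up for the example.

```python
def xmr_limits(values, center=None):
    """Return (lower, center, upper) limits for an individuals (XmR) chart.

    sigma is estimated as the average moving range divided by 1.128.
    If center is None, the unweighted mean of the values is used; pass a
    weighted average explicitly to use that as the center line instead.
    """
    if len(values) < 2:
        raise ValueError("need at least two points to form a moving range")
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    sigma_hat = mr_bar / 1.128
    if center is None:
        center = sum(values) / len(values)
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat

# Hypothetical monthly per cents, not the data from the chart.
monthly_percents = [55.0, 58.0, 52.0, 57.0, 60.0, 56.0]
lower, center, upper = xmr_limits(monthly_percents)
print(f"LCL {lower:.1f}, center {center:.1f}, UCL {upper:.1f}")
```

Because the limits come from the observed point-to-point variation rather than from a probability model, they do not depend on the binomial conditions holding.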
Now there are no points outside the individuals chart control limits. At this level of aggregation, we don’t yet have evidence that the system has improved or gotten worse as measured by the patient question.
A run chart provides a simpler alternative to the individuals control chart. The center line is now the median of the monthly per cents, just a bit lower than the average per cent used in the control chart plots.
The last five months above the center line in both the control chart and the run chart, while consistent with an improvement in the patient experience system, don’t exceed the cut-off level for unusual patterns (see this blog post).
I conclude that the data plots alone do not yet yield strong evidence that the per cent of patients who respond “Always” to the listening question has improved.
Of course, there may be information missing from these simple plots that can help shape our understanding.
Suppose the organization had changed the way staff interacted with patients starting in October 2015: for example, staff now ask an open-ended question at each encounter and then listen, without interruption, to the patient’s response. Suppose also that the managers had predicted that this change would increase the per cent of “Always” answers by 10% from baseline. I would then consider that prediction, together with the observed data, as strong evidence that the managers had indeed shifted the system.
But the prediction has to come before November 2015, not after looking at the series and thinking about what might account for better performance. Otherwise, we are likely to fall victim to hindsight bias, discussed here.
Note on p-control Chart Calculations
Here’s the table of numbers that gives the chart at the beginning of this post.
To make the discussion clear, consider the data for September 2015: 710 patients were asked the question, “Have you felt listened to by our staff?”; 372 responded “Always”. That gives a rate of 52%. We can plot a dot at 52% for September 2015 on a chart, with a center line given by the average of the per cents, 56.2%. If we believe that the binomial assumptions hold, we apply the control limit formulas and determine the upper control limit as 63.6% and the lower control limit as 52.5%.
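The September arithmetic can be checked with a short Python sketch of the standard three-sigma p-chart formula, center ± 3·sqrt(center·(1 − center)/n). The function name `p_chart_limits` is made up for the example. One caution: with a center of 56.2% and n = 710 the formula gives roughly 50.6% and 61.8%, while the quoted limits of 52.5% and 63.6% are symmetric about 58.05%. It is not clear from the numbers quoted here which center the chart used, so the sketch evaluates both.

```python
import math

def p_chart_limits(center, n):
    """Three-sigma binomial limits for one subgroup of size n, as fractions."""
    sigma = math.sqrt(center * (1.0 - center) / n)
    return max(0.0, center - 3.0 * sigma), min(1.0, center + 3.0 * sigma)

n_sept, always_sept = 710, 372           # September 2015 counts from the note
rate_sept = always_sept / n_sept

stated_center = 0.562                     # center line as stated in the note
implied_center = (0.636 + 0.525) / 2      # midpoint of the quoted limits (an inference)

for label, center in (("stated", stated_center), ("implied", implied_center)):
    lower, upper = p_chart_limits(center, n_sept)
    print(f"{label} center {center:.4f}: LCL {lower:.3f}, UCL {upper:.3f}, "
          f"September rate {rate_sept:.3f}")
```

The limits depend on n, so on a real p-chart they are recomputed for each month’s number of respondents.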
R code for the plots is in this GitHub repository: https://github.com/klittle314/p_chart_blog_Oct_2016.git
Basic discussion of the binomial distribution may be found in many places, e.g. Jonathan Marchini (2004), “Lecture 4: The Binomial Distribution”, October 25, 2004, http://www.stats.ox.ac.uk/~marchini/teaching/L4/L4.notes.pdf (accessed 20 October 2016).
D.J. Wheeler (1995), Advanced Topics in Statistical Process Control, SPC Press, Knoxville, TN.