Two weeks ago at the IHI annual Forum, I helped Brian Maskell present the basics of value stream management. The half-day session provided the context and walked through the details of a value stream management tool that Brian calls a Box Score.
Brian shared a picture that shows how to apply value streams.
What’s a value stream? It’s a family of patient care processes with similar flows that includes all the people, equipment, and facilities that support patient care within that family of care processes. The value stream has few or no resources (machines, equipment, people, and departments) that need to be shared with other value streams. The value stream starts where the patient first enters the care process and extends through to when the patient exits the care process.
Once a value stream is defined and planned, you have to operate and improve it. How do you do this? A value stream manager is appointed and is accountable for cost and quality performance. Weekly review of performance measures and a disciplined approach to standard work, problem-solving, and targeted improvement projects provide the means to control and improve performance.
In Brian’s view, the Box Score is the fundamental measurement tool for value stream management. The table at left is an excerpt from a Box Score developed for an elective hip and knee replacement value stream.
Typically, it has three batches of measures: operational measures (five to seven quality and efficiency measures); measures of capacity for key resources; and a value stream profit and loss statement. Brian has used the Box Score in six ways with his manufacturing and service clients:
Michael Porter’s work on improvement of health care value starts with a key proposition: care should be organized around conditions or patient characteristics in primary care. See, for example, Porter and Lee, “The Strategy That Will Fix Health Care,” Harvard Business Review, October 2013 (available here).
Porter refers to the core organization of care as an integrated practice unit (IPU). An IPU has specific attributes, as described on the Institute for Strategy and Competitiveness’s website, accessed 18 December 2016:
IPU attributes 1, 4, and 5 describe attributes of a complete value stream, with attention to comprehensive care. Attributes 2, 3, and 7 align with the principle that resources should be dedicated to the value stream and not shared with other value streams. Attributes 6, 8, 10, and 11 address management practice and accountability. Only attribute 9, a specific recommendation about care management, is not immediately apparent in the value stream organization proposed by Maskell.
I conclude that IPUs look a lot like value stream organizations. If so, the methods of value stream management apply directly to IPUs, providing specific ways to operate and improve IPUs.
John Beasley, MD, and his colleagues at the University of Wisconsin Department of Family Medicine and Community Health and the Department of Industrial and Systems Engineering recently completed a research project investigating the burden of electronic medical record (EMR) systems on family practice physicians at the University of Wisconsin. This burden is one factor driving physician burn-out and dissatisfaction with medicine as a profession.
They used logs from the EMR, validated by direct observation of physician computer use. The researchers found a consistent pattern of “work after clinic”: time spent on the EMR during evenings and weekends.
Physicians averaged about 10 hours a week in EMR work after clinic over the three-year period of the study.
This research is notable for two reasons. First, it made novel use of EMR logs to quantify the extent of the work after clinic phenomenon.
Second, it described three specific ideas that together could reduce primary care physician EMR work by as much as, or more than, the 10 hours per week of work after clinic:
• Transcription with human assistance (save 6+ hours each week)
• Paper/verbal order entry (save 3+ hours each week)
• Automatic log-in (save 1+ hour each week)
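The log-based measurement behind these numbers can be sketched in a few lines. The Python below is a hypothetical reconstruction, not the researchers’ actual method: it assumes a simple log format (one timestamp per EMR action) and counts distinct active minutes outside assumed weekday clinic hours of 8:00 to 18:00.

```python
from datetime import datetime

def after_clinic_minutes(event_times, start_hour=8, end_hour=18):
    """Estimate after-clinic EMR time from event timestamps.

    Counts each distinct minute with at least one EMR event that falls
    outside weekday clinic hours (start_hour to end_hour) or on a weekend.
    The log format here (a list of datetimes, one per logged EMR action)
    is an assumption for illustration.
    """
    active_minutes = set()
    for t in event_times:
        weekend = t.weekday() >= 5            # Saturday=5, Sunday=6
        off_hours = not (start_hour <= t.hour < end_hour)
        if weekend or off_hours:
            # Key by minute so repeated clicks in one minute count once
            active_minutes.add(t.replace(second=0, microsecond=0))
    return len(active_minutes)

# Example: three events on a Tuesday evening, two in the same minute
events = [
    datetime(2017, 1, 3, 20, 15, 10),
    datetime(2017, 1, 3, 20, 15, 40),
    datetime(2017, 1, 3, 21, 5, 0),
]
print(after_clinic_minutes(events))  # 2 distinct after-clinic minutes
```

A real analysis would also need to sessionize gaps between events; this sketch just shows how raw logs can be turned into an after-clinic time estimate.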
Dr. Chris Hayes has made the case that changes in health care practice are more likely to be adopted if they have relatively high perceived value to patients and providers and at the same time don’t add workload (see Chris’s web site and this 2015 article in BMJ Quality and Safety).
The perceived value and the impact on time depend on each other: changes that reduce work seem likely to have more value to providers than changes that are work neutral or, worse, add work.
Chris summarizes the situation with this picture:
By Hayes’ theory, the change ideas proposed in the UW research appear to be highly adoptable and sustainable once adopted. However, the two changes with the biggest impact require other people to do more work and be paid to do so.
In the current situation, physicians are working after clinic “for free.” Longer-term effects like fatigue and burn-out, which lead physicians to seek less than full-time positions or leave the profession altogether, are diffuse and don’t show up in a regular bi-weekly cost statement. Adding cost for support staff, on the other hand, is easy to recognize and resist in a world of cost management.
So administrators and physician leaders will have to convince themselves of the business case for the package of changes.
How to make the business case?
Use the Model for Improvement: Run tests, starting on a small scale—involve one physician, over one or two days, to iron out logistics. Then test the changes for longer periods of time, with more providers. Measure the impact on physician time using the EMR logs, costs for support staff, and physician perception. Have the administrators and physician leaders observe the tests themselves to inform their decisions.
In the previous post I described an exercise that uses Galton’s Quincunx.
The challenge: maximize the number of results in a five-value range, over three rounds of 20 drops of the Quincunx. Based on the original exercise devised by Dr. Rob Stiratelli, the exercise starts with the “aim” of the Quincunx at the low end of the range. And at the beginning of the second and third rounds, the device has a calibration problem that shifts the aim three or four units.
Last week, we ran the exercise with four teams in a workshop. Groups A, B and C used a simulation I wrote in R and Group D used the modern Quincunx shown at left.
The exercise's participant instruction sheet is available here.
As I stated in the first post, if you can aim the funnel at the center of the desired range, you usually can get at least 16 of 20 values to fall in that range.
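You can check this claim with a quick simulation of a Quincunx aimed at the center of the range. The sketch below assumes an 8-pin board (my assumption; the physical device’s pin count may differ), so each drop is a Binomial(8, 0.5) outcome and the 5-value range 2 to 6 is centered on the mean of 4:

```python
import random

random.seed(1)
PINS = 8  # assumed pin count, not taken from the physical device

def drop():
    # One ball: number of rightward bounces off the pins
    return sum(random.randint(0, 1) for _ in range(PINS))

def hits_in_round(lo=2, hi=6):
    # Count drops landing in the 5-value range centered on the mean
    return sum(lo <= drop() <= hi for _ in range(20))

rounds = [hits_in_round() for _ in range(1000)]
frac = sum(r >= 16 for r in rounds) / len(rounds)
print(frac)  # most simulated rounds reach 16 or more of 20 in range
```

With these assumptions a single drop lands in the range about 93% of the time, so 16-of-20 rounds are the norm, consistent with the rule of thumb above.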
All four teams realized on Round 1 that the system was running on the low side of the range and made adjustments to move the aim closer to the center of the range.
However, the hidden change in the aim at the start of Round 2 caused confusion and uncertainty. Teams that had been getting almost all values in the desired range now were getting values outside the range, even though the nominal position of the meter setting was the same.
With some testing and debate, teams recovered and by the end of the second round were again getting results mostly in the desired range.
The change at the start of Round 3 caused the same challenges: initial hesitation and lack of confidence in the relationship between meter setting and output gave way to better performance after adjustments to the aim.
Team B had the best results, but this seems to be related to a problem in the simulation (see below).
1. No team made a run chart of the results and the meter values. They looked at the table of numbers and did their best to judge average results and the impact of the meter setting. Our management exercise followed a 90-minute presentation and practice with run charts. The failure to make run charts is a sobering reminder that it takes repetition and presence of mind to apply improvement methods and tools under pressure to perform, even in a training-room exercise.
2. No team got perfect results, because the Quincunx system as designed is not capable of regularly producing all values in a five-value range. Team best efforts, management incentives, and public scorecards don’t change the underlying structure.
3. No team systematically tested the relationship between meter setting and output. Experimentation comes at the cost of possibly poor results. I did not hear any clear discussion of how to test in the face of uncertain outcomes.
4. The teams did not cooperate; no one sent any representatives to other tables to try to learn what strategies seemed promising.
1. The Quincunx device shows that
a. Variation in results arises from variation in funnel position (an input measured by the meter setting) and system structure, represented by the pins.
b. In other words, variation in input and system structure causes the results to vary.
c. If we can study and identify how the changes in inputs and system conditions drive the variation in results, we can work “upstream” to reduce this variation in a cost-effective way.
2. A Model for Common Causes
a. The structure of the pins provides a physical model for what we call “common causes of variation.”
b. The built-in variation of the pins drives variation in results.
c. Each pin contributes a small but meaningful amount of variation to the results.
d. We can describe the variation that results from the pattern of pins: we expect to see a range of plausible values, without systematic patterns.
3. A Model for Special Causes
a. We can assign changes in results to the specific movement of the funnel, so this movement will serve as our model for “special causes of variation.” Walter Shewhart, the inventor of control charts, used the term “assignable causes” to indicate that we may be able to assign a specific cause to this class of variation.
b. Movement of the funnel causes variation in results on top of the variation that arises from the pins.
c. The movement of the funnel can be relatively large or small; the larger the movement, the easier it is to detect.
d. As a useful approximation, the total variation in results is composed of variation from movement of the funnel plus the variation contributed by the pins.
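This approximation is just the familiar fact that variances of independent sources add. A small Python check, with assumed distributions for the pins (8 fair bounces) and for an occasional funnel shift; these distributions are my choices for illustration, not the exercise’s exact parameters:

```python
import random
import statistics

random.seed(2)
N = 100_000

# Pin (common-cause) variation: sum of 8 fair left/right bounces
pins = [sum(random.randint(0, 1) for _ in range(8)) for _ in range(N)]

# Funnel movement (special cause): an occasional shift of 3 units
shift = [random.choice([0, 0, 0, 3]) for _ in range(N)]

total = [p + s for p, s in zip(pins, shift)]

v_pins = statistics.pvariance(pins)    # expect ~ 8 * 0.25 = 2.0
v_shift = statistics.pvariance(shift)  # expect ~ 9 * (1/4) * (3/4) = 1.6875
v_total = statistics.pvariance(total)

# The sum of the component variances closely matches the total variance
print(round(v_pins + v_shift, 2), round(v_total, 2))
```

The agreement holds because the two sources are independent; correlated sources would add a covariance term.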
4. General Description of “Common Causes of Variation”
Common causes of variation are system conditions and inputs with the following properties:
a. Variation of the common causes drives variation in system results.
b. Variation in the common causes (and hence variation in the results) is built into the current structure of the system.
c. Each common cause contributes a small portion of the variation in the results.
d. Common causes of variation combine to give a range of plausible values in the results, without systematic patterns or unusual values.
e. In practice, we can mimic “variation without systematic patterns or unusual values” by models of randomness. Also, we define system variation (with respect to a particular type of result) in terms of common causes of variation.
5. General Description of “Special Causes of Variation”
Special causes of variation are system conditions and inputs with the following properties:
a. Variation of special causes leads to variation in system results.
b. The variation in results from special causes is added to the variation that comes from common causes.
c. If the variation in special causes crosses a threshold, we will see patterns of variation or unusually large or small values in the results.
d. In practice, we say we have “evidence of special causes” when we detect patterns of variation or unusual values in system results. Then, using system knowledge, we often can match the variation in the results with variation in system conditions or inputs. In such a case, we say we have identified one or more special causes of variation.
A control chart is the primary tool to help to distinguish common causes of variation from special causes. See for example L.P. Provost and S.K. Murray (2011), The Health Care Data Guide: Learning from Data for Improvement, Jossey-Bass: San Francisco, especially chapters 4-8.
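For individual measurements like the Quincunx drops, a common chart in the Health Care Data Guide is the individuals (I) chart, with limits set from the mean moving range. A minimal sketch in Python, using the standard I-chart constant 2.66 (that is, 3 divided by the bias factor d2 = 1.128); the example data are made up:

```python
def i_chart_limits(values):
    """Center line and 3-sigma limits for an individuals (I) chart.

    Sigma is estimated from the mean moving range (average absolute
    difference of consecutive values) scaled by the I-chart constant
    2.66 = 3 / d2, with d2 = 1.128 for moving ranges of size 2.
    """
    mean = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return mean - 2.66 * mr_bar, mean, mean + 2.66 * mr_bar

# Example: drop results hovering near 50
data = [49, 51, 50, 48, 52, 50, 49, 51]
lcl, cl, ucl = i_chart_limits(data)
print(round(lcl, 2), cl, round(ucl, 2))  # 44.68 50.0 55.32
```

A point outside the limits, or a systematic pattern such as a run of points on one side of the center line, is the signal of a special cause; within-limit scatter is attributed to common causes.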
If you don’t have a physical Quincunx device, you can use the quincunx function in the animation package in R, which shows the pin variation.
My R simulation is a Shiny web app that mimics the physical Quincunx, available at https://iecodesign.shinyapps.io/Quincunx_shiny/.
The code for the simulation is available at https://github.com/klittle314/Stiratelli_quincunx
The Admin-T and Admin-P tabs show a table and plot, respectively, for the simulation results. I do not show these tabs to participants during the simulation.
To generate a value for a given Meter Setting, the system manager clicks the “Tell System to Get Ready!” button and then clicks the “Get value” button.
The meter setting of 30 corresponds to an output value of 48, on the low end of the desired range 48-52.
After generation of 20 values, the meter setting “slips”: if the average of the first 20 values is less than 50, the meter value is offset three units lower; otherwise, it is offset three units higher. Similarly, after the next 20 values, the meter slips again: if the average of those 20 values is less than 50, the meter value is offset four units lower; otherwise, four units higher.
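The slip rule can be sketched as a small function. This is my Python reconstruction of the logic described above, not the app’s actual R code, and it assumes the second slip adds to the first:

```python
def meter_offset(round_averages):
    """Cumulative meter slip after each completed round of 20 drops.

    Hypothetical reconstruction of the slip rule: after round 1 the
    meter slips 3 units (down if that round's average was below 50,
    up otherwise); after round 2 it slips a further 4 units by the
    same rule. The function name and signature are illustrative.
    """
    offset = 0
    for slip, avg in zip([3, 4], round_averages):
        offset += -slip if avg < 50 else slip
    return offset

print(meter_offset([48.2]))        # after round 1: -3
print(meter_offset([48.2, 51.0]))  # after round 2: -3 + 4 = 1
```

So a team that had found a good setting in one round inherits a hidden 3- or 4-unit aim error at the start of the next, which is exactly the confusion the exercise is designed to produce.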
Here’s a graph of 60 results from the Admin-P tab of the web app, with no adjustment to the meter (hence, no adjustment to the aim of the Quincunx funnel). You can see how the system “center” changes during each set of 20.
If you lose connection to the server in the middle of a simulation, the web app restarts; there is no persistent memory in the version I used last week.
This restart phenomenon accounts for Team B’s good performance on rounds 2 and 3: the laptop used with this group repeatedly lost the connection to the server, so they worked with a system that never experienced a “slip” in the meter. Once they had learned that a meter value of 32 was about right, they could just keep that setting and get pretty good results.
To avoid the reset problem, you can run the simulation locally or edit the code to allow for persistent storage; see https://shiny.rstudio.com/articles/persistent-data-storage.html.