For years Dr. W. Edwards Deming ran his famous four-day seminars, the highlight of which was the Red Bead experiment. He would invite volunteers from the audience onto the stage and, playing the role of the manager himself, lead them through a role-play exercise, adapting the narrative to whatever results emerged. It was a humorously oversimplified experiment that used statistical theory to demonstrate how easy it is to blame workers for faults that belong to the system in which they work.
Preface
Your company has acquired a new customer, and as a result is building out a new business unit to service them. You have been promoted to run this division, and budget has been approved for you to hire 6 new FTEs.
White beads
Your new customer needs white beads, and so a shipment is ordered from your supplier. On inspection, it seems the shipment you receive contains 80% white beads, and 20% red beads. Again, the new customer wants white beads only, and will not accept any red beads whatsoever.
The terms
You gather your new team and give them the rundown. You inform them that everyone is expected to produce 50 beads per day. They are all on probation for the first three days, but that’s just standard practice for everyone as directed by HR. You tell them not to worry, as you are confident in your hiring, and although the team is more junior than you would have hoped, you still feel good about who you have.
The Job
Step 1: The Mix
The shipment of beads is poured into a box and mixed around.
Step 2: Production
Each worker is given a paddle which contains 50 holes that will scoop up 50 beads, the prescribed work load. One worker at a time, they will use the paddle to scoop up 50 beads in a single scoop.
Step 3: Inspection
Each worker will then carry their work over to you for inspection, and you will record the number of red beads contained.
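The three steps above can be sketched as a small simulation. This is only a minimal sketch: the box size of 4,000 beads is an assumption (the story gives only the 80/20 split), and the function names `make_box`, `scoop`, and `inspect` are hypothetical labels for the three steps.

```python
import random

PADDLE_HOLES = 50    # from the story: one scoop holds 50 beads
RED_FRACTION = 0.20  # from the story: 20% of the shipment is red
BOX_SIZE = 4000      # assumption: the story does not state the box size


def make_box(size=BOX_SIZE, red_fraction=RED_FRACTION):
    """Step 1: build the mixed box as a list of 'red' and 'white' beads."""
    n_red = round(size * red_fraction)
    return ["red"] * n_red + ["white"] * (size - n_red)


def scoop(box, rng, holes=PADDLE_HOLES):
    """Step 2: draw one paddle of beads at random, without replacement."""
    return rng.sample(box, holes)


def inspect(paddle):
    """Step 3: count the red beads in one workload."""
    return paddle.count("red")


rng = random.Random(1)  # fixed seed so the run is repeatable
box = make_box()
for worker in range(1, 7):  # six workers, one scoop each
    red = inspect(scoop(box, rng))
    print(f"Worker {worker}: {red} red beads out of {PADDLE_HOLES}")
```

Nothing in `scoop` depends on who is holding the paddle, which is the point of the exercise: the worker has no lever to pull.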
Getting started
Day #1
It’s Monday, and time to kick-start this new project. One by one, each of the workers takes the paddle, scoops up 50 beads, and presents the beads to you for inspection.
Things get off to a rocky start. While everyone produced their 50 beads’ worth, they weren’t all white. In fact there were quite a lot of red beads in everyone’s work. You take note of the number of red beads each person produced.
You’re a little surprised, but you remember that this is only the first day and they are still being on-boarded.
Besides, it’s not all bad news. David’s work only included 4 red beads, which isn’t bad for a first day. He’s clearly a rising star, and you give him a small bonus. Tim, however, brought in 14 red beads. He seems like a nice guy, but you start to wonder if he might have been a bad hire. You try to remember how many references you checked for him.
To ensure expectations are clear, you announce that a maximum of 3 red beads per work load will be permitted going forward.
Day #2
The second day is worse than the first. This isn’t going well and you are starting to get a little worried. Costs are growing faster than revenue. There doesn’t seem to be a culture of execution. People seem too cosy. Too many millennials perhaps.
But why the variation from yesterday? David’s bonus obviously went to his head because he turned in 11 red beads today after his 4 yesterday. Larry, however, seems to be hitting his stride: 7 red beads, down from 12 the first day. He was your best worker today, and so you give him the bonus this time. Scott has got off to a bad start, producing 9 and then 11 red beads. You put him on a Performance Improvement Plan, and insist that you have a 1:1 together twice daily.
Day #3
Frustratingly, day three shows no improvement again. Scott has responded well to his PIP and put in his lowest count of red beads yet. Larry continues to improve each day, and the two of you go out to lunch together to discuss the bright future you see for him in the company. But 3 of the other 4 just put in their worst day to date.
Now it’s senior management who are getting worried, and they start to pay closer attention to your team. You’re getting pinged on Slack with more questions from them than you had been previously. In an effort to improve the culture across the whole company, they kick off a few new initiatives. They agree on a vision and six core values which get printed onto posters and displayed around the office. They also start offering free lunches in the office, and give everyone a free hoodie with the company logo.
But you fear that unless the fourth day shows substantial improvement, management might close you down. The team also seem worried, so you take them out for pints after work. You tell them you’re confident they have what it takes to knock it out of the park tomorrow.
Day #4
Pretty much no improvement whatsoever. It’s terrible news. Your nurturing of Larry and tight management of Scott are both paying off, but the team overall is really underperforming.
You’re called into a meeting with the CEO but are relieved to discover it’s just because he has an idea he wants to share with you. He suggests managing out your three worst performers, and getting the three remaining top performers to work double shifts. This should mean the overall results will improve enough to get things going in the right direction.
While you are disappointed to have to let go of three people you hired and have grown quite close to, you are also glad to have an opportunity to show the CEO that you have the backbone to do this job. The strength to make the tough calls when required.
You restructure the team. Scott, Spencer, and Larry are given updated roles, and the other three’s roles are made redundant as a result of the re-org. The CEO seems impressed with how you handled it all.
Day #5
The worst day yet. You’re shocked. And you’re also out of time. Management have decided to pivot in a new direction and reallocate your resources, and so your business unit is wound down. It was a good strategy, but your team’s execution wasn’t good enough, the CEO concludes. The plan to keep it open using only the best people didn’t work.
You want to blame the CEO; after all, the restructuring plan he pressured you into really killed the mood in the team for those who remained. But you also accept the possibility that maybe you just don’t have what it takes to get the best out of a team.
The Results
Ok, so let’s take a look at how things went. If you were to plot the individual results on a control chart, it would look like this:
Control Limits
Using the average number of red beads per workload and the overall average proportion of red beads, you can calculate the upper and lower control limits, which sit three standard deviations above and below the average. The control limits tell you how much variation the system itself produces. In this case, when rounded up, the upper control limit (red line) was 18, and the lower (yellow line) was 1.
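For readers who want to see the arithmetic, here is a minimal sketch of one common way to compute such limits, the np-chart formula (average plus or minus three times the square root of average × (1 − proportion)). The formula choice and the function name `control_limits` are assumptions for illustration. The input is also an assumption: it uses the nominal 20% red proportion (an average of 10 red beads per paddle of 50) rather than the counts recorded in the story, which are not all listed.

```python
import math


def control_limits(mean_red, paddle_size):
    """Return (lower, upper) np-chart control limits for red-bead counts."""
    p_bar = mean_red / paddle_size              # average proportion of red beads
    sigma = math.sqrt(mean_red * (1 - p_bar))   # standard deviation of the count
    upper = mean_red + 3 * sigma
    lower = max(0.0, mean_red - 3 * sigma)      # a count cannot go below zero
    return lower, upper


# Illustrative input only: the nominal 20% red in a 50-hole paddle.
lcl, ucl = control_limits(mean_red=10, paddle_size=50)
print(f"LCL = {lcl:.2f}, UCL = {ucl:.2f}")  # prints "LCL = 1.51, UCL = 18.49"
```

Any count that falls between the two limits is consistent with the system behaving as usual, so it gives no grounds for singling out the worker who produced it.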
If you repeated this experiment 50 times, and plotted the number of times each count of red beads occurred, you would reliably get something very close to what’s called a binomial distribution.
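The shape of that distribution can be checked with a short simulation. This is a sketch under stated assumptions: each of the 50 holes is treated as independently red with probability 0.20 (the binomial model, a close approximation to scooping from a large box), and 10,000 scoops are simulated rather than 50 so that the pattern is easier to see. The function names are hypothetical.

```python
import math
import random
from collections import Counter

N_HOLES = 50   # from the story: beads per scoop
P_RED = 0.20   # from the story: proportion of red beads
N_SCOOPS = 10_000  # assumption: many more scoops than the 50 mentioned above


def binomial_pmf(k, n=N_HOLES, p=P_RED):
    """Probability of exactly k red beads in n holes under the binomial model."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def simulate_counts(n_scoops, rng, n=N_HOLES, p=P_RED):
    """Simulate red-bead counts, treating each hole as an independent draw."""
    return [sum(rng.random() < p for _ in range(n)) for _ in range(n_scoops)]


rng = random.Random(42)  # fixed seed so the run is repeatable
counts = Counter(simulate_counts(N_SCOOPS, rng))
for k in range(2, 20):
    simulated = counts[k] / N_SCOOPS
    print(f"{k:2d} red: simulated {simulated:.3f}, binomial {binomial_pmf(k):.3f}")
```

The two columns should track each other closely, with most scoops landing near 10 red beads and very few landing outside the control limits.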
Lessons
The key message here, of course, is that everything that happened came from the process itself, not from the workers. They had nothing to do with it; they just did their job as best they could.
Let’s look at a few specific lessons contained within:
- The system turned out to be stable, and therefore the level of output and variation between workers was predictable.
- All the variation came from the process itself. Differences between the workers, and differences between days for each worker, were all attributable to the system.
- There was no evidence that any one worker was better than another.
- Each worker could, under the circumstances, do no better than they did. They had put into the job all that they had to offer.
- Attempts to rank and appraise individual workers were misguided, as they were merely ranking the effect of the process on people.
- Attempts to use financial incentives to improve performance failed because their performance was governed by the process in which they worked.
- Had the manager been more open to seeking out suggestions for improvement from the workers, changes could have been made to improve the system and thus improve everyone’s performance.
- There was no basis for senior management’s theory that the three best workers of the past would continue to be the three best in the future.
- You, as the manager of the team, were also a product of the system, as you were aligned with senior management’s philosophy. Your goals were handed down to you, and your rewards depended on the output of your team.
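The lesson about keeping the "three best" workers can be tested with a simulation. The following is a minimal sketch, not a reproduction of the story's data: it assumes every scoop is a pure chance draw (each hole red with probability 0.20), ranks six workers on four days of results, and then compares how the kept three and the dismissed three would each have done on a hypothetical fifth day. All names are hypothetical.

```python
import random

N_WORKERS = 6   # from the story
N_HOLES = 50    # from the story
P_RED = 0.20    # from the story


def red_count(rng):
    """One scoop: each hole is red with probability P_RED (binomial model)."""
    return sum(rng.random() < P_RED for _ in range(N_HOLES))


def one_trial(rng):
    """Rank workers on four days, then score both groups on a fifth day."""
    history = [[red_count(rng) for _ in range(4)] for _ in range(N_WORKERS)]
    ranked = sorted(range(N_WORKERS), key=lambda w: sum(history[w]))
    kept, let_go = ranked[:3], ranked[3:]  # fewest red beads count as "best"
    day5 = [red_count(rng) for _ in range(N_WORKERS)]
    kept_mean = sum(day5[w] for w in kept) / 3
    let_go_mean = sum(day5[w] for w in let_go) / 3
    return kept_mean, let_go_mean


def compare(n_trials=2000, seed=7):
    """Average day-five red counts for each group over many repeated trials."""
    rng = random.Random(seed)
    results = [one_trial(rng) for _ in range(n_trials)]
    kept_avg = sum(k for k, _ in results) / n_trials
    let_go_avg = sum(g for _, g in results) / n_trials
    return kept_avg, let_go_avg


kept_avg, let_go_avg = compare()
print(f"Day 5, kept three:   {kept_avg:.2f} red beads on average")
print(f"Day 5, let-go three: {let_go_avg:.2f} red beads on average")
```

Under these assumptions both groups should average close to 10 red beads on the fifth day, because the earlier ranking captured nothing but chance.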
Conclusion
Dr. Deming famously said that 94% of a company’s troubles, and possibilities for improvement, belong to the system, while only 6% are attributable to the people in it. And the system is the responsibility of management.
But to be clear, that’s not to say nothing can be done about it. Even in such an exaggerated example as this, there were lots of things management could have done to improve the system and improve performance.
For starters, they could have worked with the supplier of beads to try to reduce the proportion of red beads in the incoming material. But they could have also changed the paddle, changed the manager, allowed workers to take a second scoop, found a customer for red beads, or many other things.
But they didn’t. For Dr. Deming, these workers represent so many people all over the world who are handicapped by their system, unable to make improvements. In his book The New Economics he explained how everyone’s performance is a sum of both their individual contribution, and the effect of the system on their performance.
And so even if you are able to accurately assign a value to a worker’s apparent performance (as in the red beads experiment), you would still be left with two unknowns and would thus be unable to solve the equation.
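Written out, the problem looks something like the equation below. The symbols are chosen here for illustration and are an assumption, not necessarily the notation used in the book: $P$ is the worker's apparent performance (the only number you can observe), $x$ is the individual's own contribution, and $s$ is the effect of the system on that performance.

```latex
\[
  P = x + s
\]
% One equation, two unknowns: for an observed P (say, 9 red beads),
% any pair (x, s) with x + s = P fits the data equally well,
% so x cannot be isolated from the observation alone.
```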
In Deming’s view, most of what we observe comes from the system, and it’s management who are responsible for that system.
Dr. Deming began using the experiment in the early 1980s, and he credits a William Boller of Hewlett-Packard for introducing it to him.
He would do the experiment live, and then just work through the arithmetic on stage with everyone. No matter what results emerged, a believable story could be told to supposedly explain the variations between workers and between days.