What’s WIP and why is it so important to limit it?

Today a short post about what WIP is and why it is so important to limit it.

WIP = Work-in-Progress

WIP means “work in progress”. A more production-oriented definition is that WIP consists of partly finished products that are currently in the production process.

So basically WIP just means something you’ve started and not yet finished. In a very simplified process, you’d have three things: a backlog containing products, requirements and ideas on which no work has been done yet; your WIP, i.e. the items you have taken from the backlog and started working on; and the finished tasks or products.

Limit your WIP

Just like computers we are able to multi-task. Or rather we can task switch. This means we can work a little bit on something and before it’s finished switch to something else. Doing this we do not lose all the hard work done on the previous task. We can switch back once the other task is finished. Or once someone calls and asks about the progress of the first task.

So what’s the problem? Why should we limit the work in progress?

Well, just like computers or machines, when we move from one task to another we need to switch contexts. This means that, to resume a task, we first have to figure out where we were when we left it.

It’s like a robot having to work on two production lines. If it is placed in the middle, it can work on them switching from one line to the other. But it needs to move its arm from one line to the other. It costs time (movement time). And the more it switches the more time is spent on moving and the less time on the actual production lines.

For computers, it’s the same. A computer can process e.g. many database requests in parallel. But each time it switches tasks it costs time to reload the previous context.

OK. So basically, the less we switch contexts, the less time is lost and the more productive we are. So why do we switch contexts at all?

Well, there are two main reasons.

First, let’s consider a very simplistic example:

Let’s assume you start in the morning with a task A which should require about 2 days of your time. You have just started working on task A when you get an email saying that you need to take care of task B as well, and that it’d be great if you could finish it by noon (which shouldn’t be a problem since it should take you only 2 hours).

So of course you could decide to first finish task A and then handle task B. But you’d then come back to your colleague 2 days later than expected, and nobody would understand why it took you over 2 days to handle a task even my grandma could have finished in half a day. So in some cases, you actually need to have some parallel work in progress… So is it good or is it bad?

Well, it definitely depends! This is a classic throughput vs. response time question. It is the same question IT guys have to handle when tuning a system: is response time more important, or throughput? For example, take a system where the user can search for some data in a huge database and wants to get all matching results. Response time might be more important, because the user can already have a look at the first few results if they come fast (even if it means that the rest of the results will take longer, because all other users also get a fast response time). On the other hand, throughput might be more important if the user actually needs to know how many entries were found, or needs all entries before she can start her work.

In most cases, you can’t completely ignore response time just to optimize throughput. In the above example, assuming switching between task A and B costs you about 15 minutes, you’d save 15 minutes by finishing A first before starting to work on B. But I doubt anybody will give you a medal for saving these 15 minutes at the price of having a colleague unable to work for 2 days because he was waiting for your output.
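The trade-off can be put into numbers. The following is only a minimal sketch: it assumes 8-hour working days (so task A is 16 hours of work), that the email about B arrives right as A starts, and that the 15 minutes of switching are paid once, when returning to A. All names are illustrative.

```python
HOURS_PER_DAY = 8            # assumption: an 8-hour working day
TASK_A = 2 * HOURS_PER_DAY   # task A: about 2 days of work
TASK_B = 2                   # task B: about 2 hours of work
SWITCH_COST = 0.25           # 15 minutes lost to context switching


def finish_a_first():
    """Optimize throughput: no switching, B waits until A is done."""
    a_done = TASK_A
    b_done = a_done + TASK_B
    return {"A": a_done, "B": b_done}


def interrupt_a_for_b():
    """Optimize response time for B: pause A, do B, then resume A."""
    b_done = TASK_B
    a_done = b_done + SWITCH_COST + TASK_A
    return {"A": a_done, "B": b_done}


sequential = finish_a_first()
interrupted = interrupt_a_for_b()
extra = max(interrupted.values()) - max(sequential.values())

print("A first:", sequential)
print("B first:", interrupted)
print("extra working hours spent on switching:", extra)
```

Under these assumptions, handling B first makes the colleague wait 2 working hours instead of 18, and the price for that is a quarter of an hour in total.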

On the other hand, if you do not get such emails only once a day but 10 times a day and get the second email before you even finished task B, you’ll keep switching tasks. In this case, you do not sacrifice 15 minutes of your time to improve response time but you sacrifice two and a half hours every day just switching tasks. Also the task which was supposed to take 2 days of work effectively took a whole week, because of the interruptions and task switching.
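The two and a half hours are simple arithmetic. As a small sketch (the function name and the 8-hour day are assumptions, the 15 minutes per switch come from the example above):

```python
def switching_overhead(interruptions_per_day, switch_minutes=15, workday_hours=8):
    """Return (hours lost per day, share of the working day lost)."""
    lost_minutes = interruptions_per_day * switch_minutes
    return lost_minutes / 60, lost_minutes / (workday_hours * 60)


for interruptions in (1, 5, 10):
    hours, share = switching_overhead(interruptions)
    print(f"{interruptions:2d} interruptions: {hours:.2f} h lost ({share:.0%} of the day)")
```

With 10 interruptions a day, almost a third of an 8-hour day goes into switching alone, before counting the interrupting tasks themselves.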

Now you understand why multi-tasking is required in many environments, but if you do not limit it, your production will go south and reach freezing temperatures at around 0%.

Another reason for WIP is that it is sometimes difficult to prioritize things. Or it is difficult to figure out what the consequences of a delay could be. So you just hope that by finishing everything a little bit late, you won’t get slapped as badly as if you deliver a few things right on time but a few others very late. Who would be mad at you for needing a few more hours to complete something? It’s definitely not as bad as finishing something 2 days late…

Well, the problem here is that you’re probably so focused on trying to avoid the big slap that you’re not even trying to figure out how to avoid the situation altogether. It’s like knowing there will be banana peels on the floor when you get into the office and coming in every day with a helmet… Maybe it’d make sense to move the cage of the chimpanzee to another location…

That’s what lean is all about. It’s not about avoiding negative consequences. It’s about making the problem apparent (by showing that the WIP limit keeps being exceeded), finding the root cause and improving the process to eradicate the cause.

How much WIP is good?

That’s probably the next question which pops up. The standard definitive answer for such a question is: Well… it depends.

Task switching is an issue because you need to get back to the previous context. But it’s of course not as bad if you still approximately remember where you were as it is if you have to reread that long email you started writing because you don’t even remember what you were writing about.

There are people who are better at this than others. I guess it’s also something you learn. After working on many customer issues (crashed systems and such) in parallel for a few months, I felt it wasn’t so difficult to switch for a short time (to answer a quick question) and come back. Then I spent a few years doing the opposite, i.e. working on topics which were planned in advance and on which I worked for 1 or 2 days in a row without being interrupted, before getting back to a position where I had to handle all those customer issues. And it was really difficult in the beginning to be switching tasks again every 5 minutes. Now, after a few years, it’s better again, but I still find it not so easy to recall what I was doing after briefly switching to another task. The difference is maybe that I’m five years older and my brain is starting to think about retirement…

So it depends on your ability to switch. Basically, nobody can tell you what a good maximum number of parallel tasks is for you. You have to figure it out based on your experience. You can also set some limit and see how it works. Then you can adapt it to a value you feel gives you enough flexibility and still allows you to work in a productive way.

Why do I keep exceeding the limit?

So ideally, if you have set a WIP limit of 3, you should not get to the limit and still need to quickly take up an additional task. If you keep exceeding your WIP limit, it can only mean either of these:

  • Your WIP limit is too low (setting it to one does send a clear message but is not that realistic…).
  • Your process needs some fixing and improvement because it keeps driving you back to this place where everybody knows you’ll fail.

If you already have a WIP limit that actually matches what you can do in parallel and still do a good job, you’ll have to perform some root cause analysis. This means it’s time to reflect on the way you and your organization work, i.e. your process (drinking a cup of tea or a beer depending on which continent you live on). There are many methods to do this in a structured and efficient way. If you feed Google some of these terms, you should be able to get a few good ideas:

  • Five Whys method
  • Pareto principle
  • Fishbone Diagram
  • Change Analysis
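As an illustration of one of these methods, here is a minimal sketch of a Pareto analysis: count why the WIP limit was exceeded and keep the few causes that explain most of the occurrences. The log entries, the 80% threshold and the function name are made up for the example.

```python
from collections import Counter


def pareto(causes, threshold=0.8):
    """Return the most frequent causes that together cover `threshold` of all occurrences."""
    counts = Counter(causes)
    total = sum(counts.values())
    selected, covered = [], 0
    for cause, count in counts.most_common():
        if covered / total >= threshold:
            break
        selected.append((cause, count))
        covered += count
    return selected


# Hypothetical log: one entry per occasion on which the WIP limit was exceeded.
breaches = (
    ["urgent customer escalation"] * 9
    + ["waiting for review"] * 6
    + ["unclear requirements"] * 3
    + ["build server down"] * 1
    + ["holiday cover"] * 1
)

for cause, count in pareto(breaches):
    print(f"{count:2d}  {cause}")
```

The causes that come out on top are the ones worth a deeper look, for example with the Five Whys method.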

Kanban board design

Kanban boards are by nature very different and have to be adaptable (an introduction to Kanban can be found in this previous post). You will find many examples of board designs. And that is exactly what those are. Just examples. What your board looks like will most probably be unique to you. Since the board is supposed to visualize the way you work now, there is no standard, no best practice for the board itself. If the process you live is closely aligned with a publicly documented process, then your board will look similar to boards used by others working with the same process. But even then, they won’t be 100% the same. And since Lean and Kanban are all about improving your process, the board design will evolve along with it. What you see on the wall today might not be the same as what you will see tomorrow, and most probably (and hopefully) not the same as what you will see a year from now.

So it doesn’t really make sense to take over a board defined by someone else. But seeing other boards can give you some ideas of how to improve a board (especially if it is a virtual one).

The classical Kanban board you’ll find in any introduction to Kanban, looks like this:

To Do | In Progress (0/5) | Done

It has three columns:

  1. the first column contains the tasks which haven’t been started yet. Often, this column has no WIP limit.
  2. the second column contains the tasks that are being worked on. This column contains a WIP limit (shown above: currently 0 tasks in progress and a WIP limit of 5).
  3. the last column contains all tasks which are finished. Here you usually also do not have a WIP limit.

This is only the simplest board you could have, and a good basis to learn Kanban. In many cases it will not be sufficient, since your process might be more complex or might not map onto it.

When using Kanban in the context of a software development project, it might look more like this:
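The three-column board described above can be modelled in a few lines. This is only a sketch to make the WIP limit concrete; the class and method names are made up and do not come from any Kanban tool.

```python
class WipLimitExceeded(Exception):
    """Raised when a card would push a column over its WIP limit."""


class Column:
    def __init__(self, name, wip_limit=None):
        self.name = name
        self.wip_limit = wip_limit   # None means "no limit"
        self.cards = []

    def add(self, card):
        if self.wip_limit is not None and len(self.cards) >= self.wip_limit:
            raise WipLimitExceeded(f"{self.label()} is full")
        self.cards.append(card)

    def label(self):
        if self.wip_limit is None:
            return self.name
        return f"{self.name} ({len(self.cards)}/{self.wip_limit})"


class Board:
    def __init__(self, *columns):
        self.columns = {column.name: column for column in columns}

    def move(self, card, source, target):
        # Add first, so the card stays where it is if the target is full.
        self.columns[target].add(card)
        self.columns[source].cards.remove(card)

    def header(self):
        return " | ".join(column.label() for column in self.columns.values())


board = Board(Column("To Do"), Column("In Progress", wip_limit=5), Column("Done"))
for number in range(1, 8):
    board.columns["To Do"].add(f"task {number}")

for number in range(1, 6):
    board.move(f"task {number}", "To Do", "In Progress")
print(board.header())

try:
    board.move("task 6", "To Do", "In Progress")
except WipLimitExceeded as error:
    print("refused:", error)
```

The sixth task is refused and stays in To Do until one of the five tasks in progress moves on to Done.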

Define | Develop (0/10) | Code Review (0/5) | Test (0/10) | Communicate

So the number of columns and which columns you have are strictly defined by the process you run, and you cannot copy them from somewhere else. Kanban doesn’t force you to use a specific board layout. But to use Kanban you need to move to a Pull system. Of course, if your current process uses a push model, you have a conflict. So does it mean you have to change your process before doing Kanban? Well, that’s not entirely true. Here is a simple example:

  • Whenever John gets a new feature request, he puts a card in the Define column of the board above.
  • Whenever John has analyzed such a feature and thinks it’s well specified enough that it can be implemented, he pushes it to the Develop column.

We have here a typical Push system. You do not know whether things in the Develop column are being worked on or are just ready to be worked on. It’s difficult to see whether the number of features analyzed by John is too high and they cannot be developed as fast as they are moved from Define to Develop or whether developers can still handle it.

So what we need here is a separation of two states in the Develop column:

  1. Things which are defined and can be developed.
  2. Things which are being developed.

By also setting a WIP limit on these two parts, you can notice whether John is defining features faster than they are being developed, whether developers are bored, or whether development is currently keeping pace with feature definition.

The board could look like this:

Define | Develop                          | Code Review (0/5) | Test (0/10) | Communicate
       | Queue (0/10) | In Progress (0/5) |                   |             |

Whether the two sub-columns in Develop are represented as columns or as swim-lanes (as shown above) doesn’t matter.
Of course you could also do the opposite: Add a “Done” sub-column in the Define column instead:

Define                    | Develop (0/5) | Code Review (0/5) | Test (0/10) | Communicate
In Progress | Done (0/10) |               |                   |             |

The difference is that if you define an overall WIP limit on the split column, it applies to John in the latter board and to the developers in the former.

So like this we’ve converted our Push system to a Pull system. But you may ask what’s the point of all this, if the Pull system is just the same as the Push system with an additional column. Well, the point is that it is now easy to see where our bottlenecks are. This result alone is worth spending some time adding extra columns.
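What the extra Queue column makes visible can be sketched with a tiny simulation. The daily rates are invented, the limit of 10 is the one on the Queue sub-column above, and the function name is illustrative.

```python
def simulate(days, defined_per_day, developed_per_day, queue_limit=10):
    """Return the size of the Queue sub-column at the end of each day."""
    queue, history = 0, []
    for _ in range(days):
        # John only defines what still fits into the Queue (its WIP limit).
        queue += min(defined_per_day, queue_limit - queue)
        # Developers pull at most what they can handle per day.
        queue -= min(developed_per_day, queue)
        history.append(queue)
    return history


print("John defines faster:   ", simulate(8, defined_per_day=3, developed_per_day=2))
print("developers pull faster:", simulate(8, defined_per_day=2, developed_per_day=3))
```

A Queue that keeps growing towards its limit points to development as the bottleneck, while a Queue that is always empty means the developers are waiting for John.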

Now that we’ve talked about swim-lanes, it might also make sense to have horizontal swim-lanes just to make a particular column of the board more readable. Here are a few examples of when these horizontal swim-lanes are useful:

  • Swim-lanes based on priority: This can e.g. make it easier to see which tasks should be moved from the queue to In Progress. It also helps visualize whether we’re working on the important features first or got lost in prio 3 topics…
  • Swim-lanes per developer/tester: This just makes it easier to see which are “your” cards. It also shows whether one person is working on too many things in parallel and should maybe transfer some tasks to another person.
  • Swim-lanes for bugs and features: Depending on how you handle known bugs, it might be the same as the priority-based lanes. But it also helps visualize how much you do for stabilizing the software and how much for getting new customers.

Here is an example:

      | Define | Develop (0/10) | Code Review (0/5) | Test (0/10) | Communicate
Henri |        |                |                   |             |
Alex  |        |                |                   |             |
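Swim-lanes per person are essentially a grouping of the cards by owner. A minimal sketch, with invented cards and the columns of the board above:

```python
from collections import defaultdict

# Hypothetical cards: (title, column, owner)
cards = [
    ("Login page", "Develop", "Henri"),
    ("Export to CSV", "Develop", "Henri"),
    ("Fix crash on save", "Code Review", "Henri"),
    ("Search filter", "Test", "Alex"),
]

ACTIVE_COLUMNS = ("Develop", "Code Review", "Test")


def swim_lanes(cards):
    """Group cards into one lane per owner, each lane split by column."""
    lanes = defaultdict(lambda: defaultdict(list))
    for title, column, owner in cards:
        lanes[owner][column].append(title)
    return lanes


def work_in_progress(lanes):
    """Count the cards each owner has in the active columns."""
    return {
        owner: sum(len(columns.get(name, [])) for name in ACTIVE_COLUMNS)
        for owner, columns in lanes.items()
    }


lanes = swim_lanes(cards)
print(work_in_progress(lanes))
```

In this made-up example Henri has three cards in flight and Alex one, which is exactly the kind of imbalance the lanes make visible on the board.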

Another thing you might consider is changing the direction of the flow. In most Kanban Boards, the flow is like this:

To Do | In Progress (0/5) | Done
Flow: from left (To Do) to right (Done)

But you might choose to have a vertical flow on the board and the priority horizontally.

            | In Production | In Release | Development | Support
Done        |               |            |             |
In Progress |               |            |             |
To Do       |               |            |             |

Flow: vertical (To Do, In Progress, Done)
Priority: horizontal

So even though the process you describe is the same, there are still a few tweaks you can perform on the board which can be helpful. Of course, you have even more possibilities with the cards and what you visualize on them. You just need to keep in mind that Kanban is supposed to show you your process now and help you trigger improvements. So the last thing which should be static and cast in concrete is your Kanban board!

What is Kanban?

Kanban is a system for controlling material flow and production using a pull model. It was originally developed in 1947 by Taiichi Ohno of Toyota Motor Corporation. Kanban cards are used to visually control the production process.

Kanban is itself not an inventory management method but the introduction of a pull model and the evolutionary improvements do have a positive influence on inventory and storage.

Kanban is also not a new process you introduce. It is a method to make bottlenecks in your existing process apparent and to introduce regular improvement in small steps. So it is, for example, perfectly possible to introduce Kanban in an environment running Scrum.

Main principles of Kanban

Visualization

The first principle of Kanban is to visualize the existing value chain. It is basically the first step in introducing improvement: know what the current process looks like. It is then of course very important to show the current process as it is lived, not as it is described somewhere or as we wish it would be. Once the process steps are defined and the rules of the process are explicit, we have a starting point.

Defining the process can be done using any tool or method. You can perform a value stream mapping, create a UML activity diagram or use any other means to define the current process. Of course making it in a visual way makes it easier to understand and also easier to map to a board later on.

Usually a large whiteboard or cork board accessible to all is used (see my next post about board design) for visualizing the process, with each process step shown in a separate column. The individual units of production are represented by cards or stickers.

Whenever you are exposed to lean methods, you will notice that a non-filtered visualization of the current process is always the basis for any method leading to improvement.

Pull instead of Push

Traditional production systems are based on push models. In such models, the production decisions are based on long-term forecasts or schedules which are themselves based on experience from the past, e.g. past orders.
A push model works pretty well if you can predict the production needs in advance, i.e. when the fluctuation in demand is very low.
Push systems have two main problems:

  • They do not adapt to changes well.
  • They require larger inventories.

A pull system basically means that the production is demand driven. This prevents over-production of specific goods by only producing what has been consumed. This is important because you do not spend money and resources on something you cannot distribute or use. Pull systems also effectively prevent the overloading of staff and provide a sustainable pace of work.

It is essential to stick to a true pull system where work is never passed to subsequent process steps when finished (which would be a push model) but where new work is first fetched from the upstream process step when the previous work in this process step is completed.
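The difference between passing work on and fetching it can be sketched as follows. This is a simplified model with made-up class and method names, in which each step works on one item at a time.

```python
from collections import deque


class Step:
    """A process step that pulls work from its upstream step."""

    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream
        self.current = None       # the item being worked on
        self.finished = deque()   # finished items waiting to be pulled downstream

    def complete(self):
        """Finish the current item; it stays here until downstream pulls it."""
        if self.current is not None:
            self.finished.append(self.current)
            self.current = None

    def pull(self):
        """Fetch the next item from upstream, but only when this step is free."""
        if self.current is None and self.upstream and self.upstream.finished:
            self.current = self.upstream.finished.popleft()
        return self.current


define = Step("Define")
develop = Step("Develop", upstream=define)

define.current = "feature 1"
define.complete()    # finished, but not pushed into Develop
define.current = "feature 2"
define.complete()

develop.pull()       # Develop is free, so it fetches "feature 1"
develop.pull()       # still busy: "feature 2" stays upstream
print(develop.current, list(define.finished))
```

Nothing ever lands in Develop on its own: the only way an item moves is through a call to pull() by the step that has capacity for it.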

Limitation of parallel work

In a production system with many process steps, only introducing a pull model is not enough. Over-production and the production of unneeded goods could still happen if we do not limit the amount of work which can be performed in parallel in a process step.

Thus with Kanban, a limit on the number of tasks being worked on simultaneously can be set for each process step. For example, you can define that a development team should not work on more than 5 requirements or bugs at the same time. You can also define the same for the test team. If problems occur during the test process step and you are not able to test as many features or bugs as expected, then once 5 cards are in the test column, no more cards can be pulled from the previous step. Since you also have limits in the upstream process steps, this prevents the previous process steps from producing output which cannot be tested anyway. So it makes the problems apparent and prevents overproduction.

Once the limit is reached in a process step, the upstream step needs to stop producing, as its output couldn’t be processed anyway. Production in the upstream process step resumes when a signal is received. The signal can be something physical but can also be the fact that a slot becomes free again.

Pull systems combined with a limitation of work in progress also make it possible to quickly identify bottlenecks in the workflow and thus offer good starting points for improvement.
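How a problem in testing propagates upstream can be shown with a small simulation. It is a deliberately crude sketch: one card per step and tick, a limit of 5 for both development and test as in the example above, and an invented function name.

```python
def run(ticks, limits, test_is_blocked):
    """Simulate Develop -> Test; return (develop, test, done) counts after each tick."""
    develop, test, done = 0, 0, 0
    history = []
    for _ in range(ticks):
        # Test finishes one card per tick unless it is blocked.
        if test > 0 and not test_is_blocked:
            test -= 1
            done += 1
        # Test pulls a card from Develop only if it has a free slot.
        if develop > 0 and test < limits["test"]:
            develop -= 1
            test += 1
        # Develop starts a new card only if it has a free slot.
        if develop < limits["develop"]:
            develop += 1
        history.append((develop, test, done))
    return history


limits = {"develop": 5, "test": 5}
print("testing works:  ", run(12, limits, test_is_blocked=False)[-1])
print("testing blocked:", run(12, limits, test_is_blocked=True)[-1])
```

With testing blocked, both columns fill up to their limits and then everything stops. Without limits, development would simply keep piling up cards that nobody can test.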

Evolutionary improvement

Once you have visualized your process, use a pull model and limit parallel work, it’s time to get to the most important principle: improving things. This constant improvement of the process is called “Kaizen”, which comes from Japanese and means “change for the better”.

It is important that a continuous improvement process is established. This includes regularly scheduled coordination meetings in which team members work together on solutions for process improvement. Problems and errors made apparent by working with the Kanban board using a pull model with limited work in progress enter the continuous improvement process and are always seen as an opportunity to learn. So the improvement coming from this should not just be short-term corrections.

Established Kanban practices

Daily status meetings

Just like the agile daily meetings, these meetings are usually standup meetings. The team meets daily (usually in the morning) in front of the Kanban board where the project status is visualized. The progress since the last meeting as well as problems are discussed. However, the meeting is usually short, and longer discussions need to happen offline.

Operations Review Meeting

It is a kind of retrospective meeting which can be regularly scheduled or not. The operations review meeting is a critical feedback meeting whose goal is to analyze the data gathered so far in both a qualitative and quantitative way. The gathered data can be Cumulative Flow Diagrams, reports or just stories and anecdotes about what happened. This meeting is the feedback loop allowing continuous improvement.
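The data behind a Cumulative Flow Diagram is easy to derive if you note when each card enters each column. A minimal sketch with invented cards and dates, using the three columns of the simplest board:

```python
from datetime import date

COLUMNS = ["To Do", "In Progress", "Done"]

# Hypothetical cards: the day on which each card entered each column.
cards = [
    {"To Do": date(2024, 1, 1), "In Progress": date(2024, 1, 2), "Done": date(2024, 1, 4)},
    {"To Do": date(2024, 1, 1), "In Progress": date(2024, 1, 3)},
    {"To Do": date(2024, 1, 2)},
]


def cumulative_flow(cards, days):
    """For each day, count the cards that have reached each column so far."""
    rows = []
    for day in days:
        rows.append({
            column: sum(1 for card in cards if column in card and card[column] <= day)
            for column in COLUMNS
        })
    return rows


days = [date(2024, 1, d) for d in range(1, 5)]
flow = cumulative_flow(cards, days)
for day, row in zip(days, flow):
    print(day, row)
```

Plotted as one line per column, the vertical gap between two adjacent lines is the number of cards sitting in that column on that day, so a gap that keeps widening shows where work is piling up.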

Root Cause Analysis

In order to continuously improve, you need to not only identify bottlenecks and issues but also understand why you have these bottlenecks and issues. This is the purpose of a Root Cause Analysis. The goal is to fix the source of problems rather than only mitigating the issues.

Summary

Kanban quickly creates transparency on the progress of the project and on acute problems by visualizing the process and using a pull model with limits on the work in progress, which makes such problems apparent. But Kanban is not about introducing a new and better process. It is about making your existing process transparent and improving it step by step. This is the greatest advantage of Kanban: its introduction usually causes relatively little resistance, as the original process is preserved (though improved in an evolutionary way) and roles and responsibilities are also preserved.