HEACH: causes


Monday 8 April 2013

Barriers Trilogy - final installment

The third part of our Barrier Trilogy can now be read on SafetyCary's blog.

Please keep coming with your feedback!

Thursday 2 August 2012

Some thoughts... 'The' Cause (?)

The people who know me know that I tend to reject the notion of ‘the’ cause. Let’s explore the concept a bit. I believe there are three main variations on ‘the’ cause.

1. A simple philosophy about causal relationships. Everything that happens has just one cause.

2. The elevation of one particular cause as the most important.
3. A very strict definition of the word cause.

Re 1: Monocausality

I do believe in the existence of simple accidents with a straightforward linear causal sequence (and have experienced some), not unlike the one below where each effect is basically the cause of the next effect until we come to the final consequence. Working backwards from the final consequence we’ll have a continuous sequence of why - because relationships. E.g. I’m on my way out (context, not cause), change my mind and turn around abruptly, only to bump into the door that closes behind me. Some cases might be so simple that the causal chain may even be restricted to just one cause and one effect…

Often, however, the world is not quite that simple and causal paths develop more or less independently, only to join up at some point causing some outcome. Please see the example discussed elsewhere on this blog; I don’t see how that can be turned into one linear sequence without dismissing a number of essential factors.

Re 2: More important causes

Are some causes more important than others? In some sense, yes. Especially if one defines preventive and corrective actions to remove them one may want to focus more on one cause than on others. But that’s then ‘more important’ in the sense of prioritizing resources and actions, not ‘more important’ in a causal sense.

In a causal sense it’s a bit more difficult… Leaving aside the discussion of whether an underlying (or root) cause is more important than a direct cause (we may get back to that issue another day), I find it hard to say that cause A is more important than cause B because it caused the incident more than the other cause… If an investigation has established that both were necessary for the incident to happen I cannot maintain that one should be ranked above the other.

Take the lab example: both not wearing goggles and mixing the ingredients wrongly are needed for chemical substances hitting the eye of the victim. How can one be more important than the other? Sure one might argue that the wrong mixing is the point of loss of control, or the first barrier breached. But still, both are needed for the defined incident.

For now I remain unconvinced that some causes are more important than others. But please supply viewpoints of your own.

Re 3: Strict definitions

Some define the word cause very strictly, for example by only allowing a deliberate act, or discarding conditions as causes (see elsewhere). Or by limiting the meaning to the direct cause (which in effect boils down to option 1). The added value of this isn’t quite clear to me so far - apart from possible legal use… or as an alternative for what I often would call a direct cause (like in the lab example). But my opinion on the value may change as discussion progresses and knowledge grows…

Some thoughts... Causes (and more)...

Causes appear to be difficult things and causation appears to be a difficult subject. A brief check of the safety books on the shelf over my desk shows astonishing variations. Heinrich decides to focus on direct causes, Hendrick & Benner completely reject the notion in their 1985 book “Investigating Accidents With STEP” and Hollnagel appears to reject root causes (see his great 2004 book “Barriers And Accident Prevention”). All explain things differently and have their reasons: Hendrick and Benner choose to equate cause to blame, Hollnagel does not so much reject root causes altogether as the notion of ‘the’ root cause and its application to intractable systems (see pages 105 and 106 of his ETTO book), and for Heinrich I’d like to point to my earlier discussion of his work.

As if this isn’t enough, safety literature is littered with an incredible number of terms: direct causes, basic causes, underlying causes, root causes, proximate causes, latent conditions, unsafe acts, contributing factors, causal factors, and what not. While there seems to be some kind of general acceptance and understanding of those, one quite often comes across differing interpretations, and the flaming discussions that follow only add to the confusion.

Something that gets me rather confused are ‘proximate causes’. This may be partly due to my ignorance (and the fact that this term is not used in the Dutch language when you study safety or law). Heinrich defined these as “unsafe personal act or unsafe mechanical hazard that results directly in an accident”. Heinrich’s definition is for me clearly a definition of ‘direct causes’. This is agreed upon by that most scientific of all sources, Wikipedia: “In philosophy a proximate cause is an event which is closest to, or immediately responsible for causing, some observed result”. Equating proximate and root causes (as I saw not too long ago) is a concept that I find difficult from a contemporary safety science point of view and in my opinion not very helpful for safety work either.

Conditions and causes

Many safety professionals appear not to be aware of the difference between causes and conditions (you may call the latter context as well, if you want), which sometimes ends up in quite strange causation, all too often blending in context. As an illustration: a few years ago I was one of the people responsible for cleaning up the ‘taxonomy’ of causes in our incident registration/management system. The status at that moment was one of unguarded organic growth over 15 years. Originally the off-the-shelf version had contained Bird’s sequence as the three main categories: Management Failures, Basic Causes (with sub categories for Personal Factors and Job/Environmental Factors) and Direct Causes. Over the years elements had been added to this structure without any proper policy or philosophy. Often the new elements ended up in the ‘wrong boxes’ (e.g. ‘planning’ as a direct cause) and there were exceptionally many context items (e.g. ‘performing work in the rail tracks’) mixed in. Sure, absence of these things would have ‘prevented’ an accident from happening, but it would also have prevented accomplishing the intended outcome. So there is a lesson here, since I do not consider the people who worked with this system in the 15 years before me to be complete idiots (something I’m not even qualified to diagnose anyway).

Among professionals there has been a debate about whether and how an (unsafe) condition can be a cause. I’m in the camp of people who think some conditions can be causes, but not always and automatically. Hart and Honore posed that “causes are what made the difference, mere conditions on the other hand are just those [things] that are present alike both in the case where accidents occur and in the normal case where they do not; and it is this consideration that leads us to reject them as the cause of the accident, even though it is true that without them the accident would not have occurred…”. I find this a rather useful and clarifying definition. Let’s apply it to some examples:

If someone bumps into a lamp post out on the street I agree that we’re talking about the lamp post as a mere condition. Sure, had it not been there, no one would have bumped into it. But reasoning along the lines of “what would have prevented the accident from happening” and labeling those things causes is counterfactual reasoning and a fallacy without establishing a proper causal relationship first. The lamp post is intended to be there, was built in accordance with relevant standards and it stands there in the normal case of people not bumping into it and is just minding its business of lamp-posting.

If a workshop burns down after a discarded match sets discarded scraps of paper on fire, I do not agree with the view that only the match was the cause of the fire and the scraps of paper a mere condition. This is based on the fact that I refuse to see scraps of paper thrown on the ground as being the normal case. If dumping your trash on the ground is acceptable (regarding it as an unwelcome but normal standing condition), then discarding a match is the same (after all people throw down matches and cigarette butts all the time, most of the time not causing fires). Even more so since discarding a match on a concrete floor with no flammable material present would not have made the difference either. So in this example I see two things joining up as causes, namely the act of discarding the match and the condition-turning-cause of the scraps of paper lying there. Counting the last bit of the old-fashioned fire triangle (i.e. oxygen) as a cause, however, would be bollocks, merely based on the ‘normal case’ argument… Absence of oxygen in this situation would hardly qualify as normal, would it?
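For readers who like to see the ‘normal case’ test spelled out, here is a minimal Python sketch of how I read Hart and Honore’s distinction. It is only an illustration under my own assumptions: the function name and the factor labels are made up for this example, and deciding what belongs in the ‘normal case’ set remains a judgement call that no code can make for you.

```python
def split_causes_and_conditions(present_in_accident, present_in_normal_case):
    """Split factors using the 'normal case' test (hypothetical helper).

    Factors present in the accident but absent from the normal case
    'made the difference' and are treated as causes. Factors present
    in both cases are treated as mere conditions.
    """
    causes = present_in_accident - present_in_normal_case
    mere_conditions = present_in_accident & present_in_normal_case
    return causes, mere_conditions


# Workshop fire: the labels below are my own shorthand for the example.
workshop_fire = {"discarded match", "scraps of paper on floor", "oxygen"}
# Assumption: a normal workshop has oxygen, but no trash or matches on the floor.
normal_workshop = {"oxygen"}

causes, conditions = split_causes_and_conditions(workshop_fire, normal_workshop)
print(sorted(causes))      # the match and the scraps of paper
print(sorted(conditions))  # oxygen only
```

If you were to regard scraps of paper on the floor as normal, you would move them into `normal_workshop` and they would drop out as a mere condition, which is exactly the disagreement described above.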

Cause-effect relationships have to be explained with logic and must be based on facts (not hunches or guesses, not even on experience!). Anything written down in an investigation report must be able to stand up in a court of law, but hopefully never will. I find the “beyond a reasonable doubt” criterion a good one, and in the case that conclusions have to be based on assumptions/theories/hypotheses this must be clearly stated as such.

The example elsewhere on this blog hopes to illustrate the point about causes and conditions further. This example also shows another difficulty: it’s fully possible to write down acts as conditions and vice versa. The discarded newspaper isn’t quite as clear as the example where the condition ‘not wearing safety goggles’ essentially is the same as the act ‘does not put on safety goggles’.

By the way, a legal background may partly influence what to call a cause and what not because law can only handle people and thus will be focused on human acts. If there are any conditions identified the question usually will be who has caused them. In safety this appears to me as a not very useful notion. In my view cause in law is not quite the same as cause in safety and I find it not useful to define causes in a strict theoretical and legally sound manner. This will not really help preventive actions, but is probably great for judges and lawyers.

So, while I have learned to appreciate some of Hart and Honore’s work as very useful and clarifying (they have been added to my personal list of things-to-read-when-I-grow-up) I will look for causes in a safety context. Especially since the goals are different. Causes in safety must help to define actions while causes in law help other goals. In the newspaper example I can imagine a judge ruling that discarding a match is an act of greater carelessness than discarding a newspaper. And I would agree; that is one reason I could imagine the match being ‘more’ of a cause than the newspaper. Still, you need both and both are acts out of the ordinary that together ‘make the difference’, so in safety terms it’s useful to regard both as causes.

There’s another reason why I’m not particularly happy with thinking about causation in the legal sense: laws and even entire legal systems are quite different from one country to the other (just compare the UK to Germany or France) which may, or will, have consequences for the legal interpretation of or legal requirements for the term ‘cause’. In contrast, the language of prevention in safety should be a common one that is understood by all safety professionals, regardless of their country of origin.

Conclusions (sort of)

Three things so far…

Causes in law and causes in safety are not necessarily the same thing. Beware of mixing them up.

The distinction between causes and conditions/context is an area that many safety professionals have too little awareness of. I’m sure that more focus and greater clarity on this will strengthen incident analysis and recommendations for preventive action.

It would be a good thing if the safety world would start to agree on what we call cause and what not, if necessary with some additional cause categories (direct, root, …). Or maybe we should stop using the word cause altogether and use another more neutral term? ‘Contributing factor’ has been proposed and Alan Quilley came up with “factors to be considered to prevent recurrence of the events that led to the unintended consequence”, but I believe FTBCTPROTETLTTUC is an impossible to sell acronym.

Feel free to comment and add more thoughts!

Some thoughts... Management Failure versus Free Will

An interesting point that I read some time ago is that management failure theory would conflict with the notion of free will: if management failures are the root causes of all accidents then management failures are also the cause of human errors, and not free will. I’m not sure if this very strict reasoning is fully correct. Even if it does sound logical it doesn’t feel right to me. In my opinion at least Bird’s dominos and also Reason’s Swiss Cheese Model (proponents of multi-causality) do not exclude free will from kicking in along the causal chain. The basic causes in Bird’s sequence differentiate between job factors and personal factors, the latter explicitly including things like motivation, which is at least one personal factor that is clearly related to free will. I’m rather sure that Reason has similar mechanisms, but I’m too lazy to check.

One might point out that the leftmost ‘domino’ (management) still causes the fall of the next (which includes the personal factors) and that the model does not account for causes materializing halfway through the sequence. I sometimes get the impression that models are treated as if they were laws of nature that have to apply in each and every situation, in exactly the same way and the same order (a hair-raising example is the treatment by some people of Heinrich’s ratios, expecting to find the same everywhere). But that’s hardly realistic. Heck, that’s why they’re called models - a simplified representation of reality and thus not describing each and every possibility.

I live in the belief that nobody rises in the morning and goes to his job with the intention to screw up massively and create an accident. There are others much more qualified than me writing about human error (including violations) and its causes (and they have luckily done so), but roughly I’d say that two important reasons/causes for human error or violations are found in: 1) an overly optimistic perception of their control over the situation (remember that about 90% of all drivers believe that they’re better than average drivers) and 2) especially conflicting objectives. These are ‘causes behind the cause’ for deliberate acts and for safety work it is essential to identify those in order to determine preventive actions that keep future errors and accidents from happening.

For a legal case it may be sufficient to stop once it has been established as a cause that someone willingly chose not to follow a safety rule. Acting on that single act (e.g. by punishing the violator, or explaining the rule to him once more) is often very ineffective from a safety point of view. The decision not to follow a safety rule may be a deliberate act of free will, but there may have been incentives and other mechanisms behind this decision. In case of conflicting objectives (e.g. the company claims ‘Safety First’, but rewards people cutting corners in order to maximize profits) it’s more effective to address those causes behind the causes.

When studying safety many of us have learned not just to focus on an error by an employee, stop the investigation there and blame the person. Instead we’re taught to look further than the person at fault. But the opposite is true as well: defaulting to management failures as causes for accidents is not the way things should be done. That’s a kind of jumping to conclusions without any basis in reality or facts that is just as bad as Heinrich’s decision to stop at the direct cause and focus on that alone. A comment I heard recently was along the lines of “we’re not satisfied unless we have found at least one management cause”. I’m convinced that this is said with the very best of intentions for the improvement of safety, but even this well-meaning framing of your mind is going to bias the result in a way that should not be acceptable. Remember what Hollnagel said: WYLFIWYF (What You Look For Is What You Find)! I believe he said it in relation to ‘human error’, but it applies to anything. Some causes are simply not management induced and that’s that.

Some people go a step further and reject the existence of management failures entirely, among other reasons because these are human failures. Agreed, in the end, management/organisational failures are decisions of men and thus human failures, but they are of another kind. I find it helpful to distinguish between (direct) causes on a more personal level (call it sharp end/operational, if you want) and (underlying) causes on organisational or management level that are further upstream. Additionally, sometimes it’s hard or impossible to determine what the ‘deliberate act’ or more or less active failure in the management was.

Take for example the accident that happened at Sjursøya on 23 March 2010, something that has taken a considerable amount of my working hours since. The full report of the Norwegian Accident Board (a fairly good one, even though I do not support all of the recommendations) is online, check it out for details (available in English). Short version: operational error(s) directly caused a runaway set of goods wagons which rolled unstoppably downhill. There is a 100 meter difference in height between the point of origin and the ending point, 8 km further down in the harbour of Oslo. There the wagons, speeding at about 130 km/h, crashed into a building and killed three people, injuring several others. The outcome could have been even worse, by the way, since the wagons might have hit wagons with jet fuel, had the accident happened at another point in time.

What lies behind the operational error (I’ll do the simplified version) is that over the last two decades the use of the goods terminal had slowly been changed without anyone noticing, and many baby steps ended up with the terminal being used differently than intended. Working procedures and local risk analysis had not followed the same development and thus safety barriers had unknowingly been eroded in such a way that a rather simple error could lead to such a drama.

I tend to be very critical towards the labeling of deficiencies in risk analysis as causes - often this is a sign of lazy or unrealistic safety people since you will almost always find something related to the accident not being in the risk analysis (a complete risk analysis being a fiction and not very helpful either). But here it certainly was the case and together with other management factors this created over many years (impossible to pin down on persons, acts or dates) the situation as it was on the date of the accident without anybody being really aware of the situation sliding towards the breaking point.

While blaming the operational personnel that had ‘violated’ long forgotten rules would have been a possibility (unnecessary because they blamed themselves enough as it is) it was decided to look at the underlying factors. I don’t want to discuss the legal part, but the DA also chose not to take legal action against the operational personnel. The companies involved, however, were fined, which my company paid without any appeal (unlike the other company involved). Negative side effect: several millions of Norwegian kroner down the drain that could not be used for prevention and improvement.

Comments and discussion appreciated!

Thursday 19 January 2012

An example and various solutions

An example
Take the following example, which hopefully demonstrates why I believe that:

multi-causality is a valid and usable concept (this is especially true in implementing improvements to safety systems and behaviours),
this may include multiple direct causes,
causality does not necessarily run in strictly linear paths, but often in parallel paths that meet up at a certain point,
management is not necessarily the root of all evil, and
causation models and tests that fit tort and criminal law definitions don’t necessarily align with prevention goals.

The case
Someone in a laboratory in a university has suffered partial loss of eyesight after he had boiling acid liquid in one of his eyes. He was not wearing the mandatory safety goggles; instead he was wearing his own glasses which he needs in order to be able to see details at working distance. He was blending two chemical substances but did not do this in the described order and volume, causing the chemicals to react more ‘enthusiastically’ than intended, causing part of the acid liquid to reach boiling point and evaporate/explode sending drops of acid liquid into the immediate area - one or several of those hitting the insufficiently protected eye.

The victim was in a bit of a hurry. It was Friday afternoon and he wanted to get home as soon as possible. He just needed to get this job done and clean up his part of the lab, and it would have been the weekend for him. So instead of following the proper procedure he was a bit careless and used greater volumes and the wrong order (started with the ingredient he randomly picked up first). He didn’t wear safety goggles because he never does. He tried the standard issue as provided by the laboratory several times, but those fit very badly in combination with his ordinary prescription glasses, irritating nose and ears immensely. This has been reported to the department head on several occasions, by several employees, without resulting in a better fitting alternative. Also suggestions to provide contact lenses that would fit with the standard safety goggles have been turned down. In addition, the victim believed that his ordinary glasses would provide sufficient protection since they cover the area in front of his eyes. The supervisor was aware of the situation and had noticed that the victim consistently doesn’t wear his safety goggles, but chooses to ignore this fact.

The victim is experienced, properly instructed, etc. The department head explains the failure to provide suitable equipment as a matter of reducing his expenses to the bare minimum. He is on a tight budget (government cuts in budgets to universities and lack of corporate sponsoring due to financial crisis) and has planned to spend all the money he has on a laser-ion-mega-spectrometre which will improve his department’s ability to analyse the substances they work with. Besides, the university does provide standard safety equipment in accordance with CE standards and that should be good enough. The supervisor informs us that he is fairly new in his job and used to be ‘one of the boys’ before he was promoted. He was the best qualified among the applicants and the only internal applicant meeting all the formal requirements. He finds it a bit embarrassing to start picking on safety rules he hasn’t followed all that closely himself before he was promoted.

Some solutions
Many thanks to Jeff Harris and Alan Quilley for supplying alternative solutions listed below! Great to see some alternating versions and views - this only contributes to learning, there is no perfect way anyway.

Starting with my own approach, I’ll draw the incident as a causal tree, a method I prefer because it illustrates both causal connections and (to a lesser degree) chronology. Contrary to some opinions, I would draw this with two separate direct causes.

Alternatively (and fully justifiable) one could choose to see the point of “loss of control” as the direct cause and then picture the accident and its direct cause as follows:

There are certainly many other and different ways one could write down things, many of them correct. All of the relevant information in the case (so far) is present, after all. One drawback of this particular notation and combining relevant elements in one ‘box’ (in this case the accident and the missing goggles), however, is that it’s going to be harder to explain in a logical way how the lack of goggles played a role, which it does in my opinion. But see Jeff Harris’s solution below as well!

This is for me one reason to argue that many accidents will not have linear causal relationships. There is no way in this example that you get the wrong blending of chemicals and the missing goggles in some cause/effect or why/because relationship to each other, so they must exist in parallel and join up at a certain point to produce the final effect. And mind you, neither is sufficient to produce the accident by itself. Just leaving off goggles would result in what we could call an unsafe condition, while the act of wrong blending and thus boiling/exploding liquid with goggles on would lead to what we tend to call a near miss (as far as eyes are concerned - acid in your face is no great fun either).
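The ‘necessary but not sufficient’ point can be written out as a tiny truth table. The Python sketch below is only an illustration of the reasoning in the paragraph above; the function names and outcome labels are my own assumptions, not part of any established method.

```python
def lab_outcome(wrong_blending: bool, goggles_missing: bool) -> str:
    """Outcome of the lab case for each combination of the two direct causes."""
    if wrong_blending and goggles_missing:
        return "eye injury"
    if wrong_blending:
        return "near miss"          # liquid explodes, but the goggles protect the eyes
    if goggles_missing:
        return "unsafe condition"   # nothing happens this time
    return "normal operation"


def is_necessary(factor: str) -> bool:
    """Counterfactual check: does removing only this factor prevent the injury?"""
    actual = {"wrong_blending": True, "goggles_missing": True}
    counterfactual = dict(actual)
    counterfactual[factor] = False
    return lab_outcome(**counterfactual) != "eye injury"


def is_sufficient(factor: str) -> bool:
    """Does this factor alone, without the other one, produce the injury?"""
    alone = {"wrong_blending": False, "goggles_missing": False}
    alone[factor] = True
    return lab_outcome(**alone) == "eye injury"


for factor in ("wrong_blending", "goggles_missing"):
    print(factor, "necessary:", is_necessary(factor), "sufficient:", is_sufficient(factor))
```

Both factors come out as necessary and neither as sufficient, which is why I cannot rank one above the other in a causal sense.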

We could stop here, by the way, if we followed Hart & Honore’s rule that having met a “deliberate act” (in my notation even two parallel acts) is a “barrier” for further investigation. That, however, would leave open the questions of why the victim did not follow the correct procedure and why he did not wear his safety gear.

So, I choose to continue my analysis and start gathering more facts, hoping to find an answer. What we find is the following and I’ll continue with my first notation in building a visual presentation of the case, adding the next layer of causes (for our convenience I add a background showing the various phases or dominos). We see that there are three underlying causes (parallel and independent) to the not wearing of goggles: 1) a conscious decision of the victim not to use the ill-fitting safety equipment because he trusts he is sufficiently covered; 2) not supplying suitable equipment by the management and 3) the supervisor choosing not to enforce the safety rule.
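Since the drawings do not travel well in text, here is a rough Python sketch of the same causal tree as a plain dictionary that maps each effect to its causes. The node labels are my own shorthand for the factors named above, and the helper names are hypothetical; treat it as one possible way to write the tree down, not as the notation itself.

```python
from typing import Dict, List

# Each key is an effect, each value the list of causes that join up to produce it.
CAUSAL_TREE: Dict[str, List[str]] = {
    "acid liquid hits eye": [
        "wrong blending of chemicals",
        "not wearing safety goggles",
    ],
    "not wearing safety goggles": [
        "victim trusts his own glasses are sufficient",
        "management does not supply suitable goggles",
        "supervisor does not enforce the rule",
    ],
}


def causes_behind(effect: str, tree: Dict[str, List[str]]) -> List[str]:
    """Collect every cause behind an effect, walking the tree depth first."""
    found: List[str] = []
    for cause in tree.get(effect, []):
        found.append(cause)
        found.extend(causes_behind(cause, tree))
    return found


def is_linear(tree: Dict[str, List[str]]) -> bool:
    """A strictly linear sequence has at most one cause per effect."""
    return all(len(causes) <= 1 for causes in tree.values())


print(causes_behind("acid liquid hits eye", CAUSAL_TREE))
print("linear:", is_linear(CAUSAL_TREE))
```

The wrong blending does not show up anywhere behind the missing goggles (or the other way round), which is the ‘parallel paths’ point, and the tree as a whole is clearly not a linear chain.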

I am open to suggestions and arguments here if someone does not agree that some of these are actually causes, or thinks that some of the factors should be considered more important than others. The most crucial in my eyes would be the failure to supply suitable equipment (especially from a prevention point of view), but I have a hard time excluding the other two.

We will take this investigation even one step further. With regard to the victim’s decisions/violation/errors no relevant causal information is found. He is experienced, properly instructed, etc. So I choose not to add more underlying causes here. But the underlying managerial causes can be taken one useful step further back.

One could obviously take the analysis even more steps back, but I think that the causal connection between “budgeting process” and “recruitment of supervisors” or even “government cuts” and “financial crisis” is going to be too hazy for this case. So I choose (slightly arbitrarily) to stop here with ‘root causes’ both on personal/employee and management/organisational level.

Jeff Harris’s version

I break the incident down into two direct causes: the cause of the incident and the cause of the injury. The direct cause of the incident was improper mixing of the chemicals. The direct cause of the injury was not wearing the "mandatory" goggles. (both unsafe acts).

Then you delve into the root causes. Why did he mix the chemicals improperly - he was in a hurry and wanted to get home - a very common theme in incidents. He was properly trained - he just didn't follow the training. So why did he not wear the "mandatory" goggles? He said they hurt his face. The company would not buy him "special" goggles to make it more comfortable. Why not? Too expensive, they said. His supervisor never made him wear goggles. Why not? Apparently the supervisor did not always wear goggles and didn't feel like he could enforce that rule on other people (no lead by example). The managers over the supervisor either did not know or did not enforce the adherence to a safety requirement.

So what to do to avoid a repeat? Start by finding out how many other safety requirements are not being followed. Start enforcing mandatory safety requirements and if supervisors are not doing their job (enforcement) maybe you need supervisors who do. I personally feel it would have been a lot cheaper and easier for the company to have bought some special goggles for the employee, but that does not relieve him of the responsibility of wearing the goggles, even if uncomfortable. (If I had a dime for every time I was told a respirator was uncomfortable!) Changing the behavior of getting in a hurry is a much harder task. There you have to reach out and engage the "hearts and minds" to change the employees' attitudes about risk and what is acceptable. You won't always succeed. That is why the safety goggles are important: to minimize the injury when someone screws up.

By the way, where is the "unsafe condition" in this case? Oh yes, the dangerous chemicals. Well, if we just shut down the lab, fired everyone, and got rid of the chemicals, this incident would not have happened. :-)

Alan Quilley’s version