In the first installment of this column, we introduced the concept of pipeline risk assessment Essential Elements. This is a list of ingredients that arguably must be included in any pipeline risk assessment. In this installment, let’s examine a common complaint:
“I can’t do good RA because I don’t have enough data.”
There are at least two aspects to this issue, both of which need some myth-busting. The first is related to the myth that the only way to predict the future is to have hard counts of how often something happened in the past. The second is related to an idea that certain types of RA require more knowledge of your system than other RA types.
Classical QRA, at least as it’s practiced by many, relies heavily on historical failure frequencies. I often hear something like “we can’t do QRA because we don’t have data.” I think what they mean is that they believe databases full of incident frequencies (how often each pipeline component has failed by each failure mechanism) are needed before they can produce QRA-type risk estimates. That’s simply not correct. While such data is helpful, it is by no means essential to RA.
There is no argument that knowledge of your system is important. Without it, accuracy is compromised. Loss of accuracy leads to inefficient decision-making. So, the second misconception can be readily busted by simple logic. Which is better, weak data with a weak model or weak data with a strong model? Obviously, the latter. The strong model can’t completely overcome the limitations of ‘weak data’, but why compound the absence of data with a flawed modeling approach? In reality, any RA approach suffers when knowledge is low and is improved by better information.
In fairness, we can look at how the perception of ‘data hungry modeling’ came to be. Recall from earlier columns how the original indexing RAs were designed for high-level screening applications. They employed many shortcuts and did not seek accurate RA but only prioritization of risk issues. They did not attempt full use of data-intensive inspections (e.g., ILI, CIS) nor location-specific changes in many risk factors. They made many generalizations using only very readily available information. Nonetheless, we called these RA, even though they accomplished only a fraction of what a modern RA needs to accomplish. So, these RA approaches seemed to ‘need’ only general data, in contrast with how typical QRAs were being conducted in other industries. In the minds of many, there was then a coupling: scoring-RA = low data needs and QRA = high data needs.
This idea of bias control might at first appear to be a rather obscure, highly technical issue. However, it is actually an essential element, critical both to an understanding of the risk assessment and to the subsequent use of the risk estimates. If not already defined, one of the first questions to ask when viewing a risk assessment is: ‘what is the level of conservatism in this assessment?’
Model developers must choose the level of rigor to include in all aspects of a risk assessment. That level of rigor will often vary, even within the same overall model, to accommodate data availability and perceived importance of that portion of risk.
While additional modeling efforts can almost always be utilized, the additional effort should add a commensurate amount of benefit (i.e., accuracy and/or utility) to justify that effort. This cost/benefit balance is difficult to establish. For example, a normally exotic failure mechanism will generally warrant little attention for the vast majority of pipeline systems. However, for the rare pipeline segment that may suffer from this exotic failure mechanism, a detailed assessment, involving all possible inputs and modeling rigor, would be appropriate.
Data vs Algorithm Strategy
It is sometimes tempting to blend data considerations into algorithms, especially when one characteristic is dependent upon another. For instance, corrosion rates are closely associated with material types. So, an algorithm could say: if pipematerial = steel then use steel-corr-rate, else use 0. This practice should be avoided since it leads to unnecessarily complex algorithms and confuses the risk calculation with the data collection. The better, long-term solution is to assign the potential corrosion rate in the database as a characteristic of pipe segment X in location Y. The same if-then statement could be used for this assignment, but using it in the data integration phase rather than in the RA algorithm preserves its role as a characteristic—a piece of data. The meta-data associated with the database will then capture if any ‘rules’ were used in assigning the data.
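As a minimal sketch of this separation, the if-then rule below runs once in a data-integration step and stores its result as a segment characteristic, while the risk calculation only reads that stored value. All names here (`Segment`, `integrate_corrosion_rate`, `wall_loss_over`, the `corrosion_rate_mpy` field) and the placeholder rate are assumptions made for illustration, not values or structures from any particular RA tool.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical placeholder rate (mils per year), for illustration only.
STEEL_CORR_RATE_MPY = 5.0


@dataclass
class Segment:
    segment_id: str
    location: str
    material: str
    # Stored as a characteristic of the segment, i.e. a piece of data.
    corrosion_rate_mpy: Optional[float] = None
    # Meta-data: records any rule used to assign a value.
    metadata: Dict[str, str] = field(default_factory=dict)


def integrate_corrosion_rate(seg: Segment) -> Segment:
    """Data-integration step: the if-then rule lives here."""
    if seg.material == "steel":
        seg.corrosion_rate_mpy = STEEL_CORR_RATE_MPY
    else:
        seg.corrosion_rate_mpy = 0.0
    seg.metadata["corrosion_rate_mpy"] = "rule: steel -> default steel rate, else 0"
    return seg


def wall_loss_over(seg: Segment, years: float) -> float:
    """RA algorithm step: uses the stored rate and never inspects the material."""
    if seg.corrosion_rate_mpy is None:
        raise ValueError(f"segment {seg.segment_id} has no corrosion rate assigned")
    return seg.corrosion_rate_mpy * years


if __name__ == "__main__":
    seg = integrate_corrosion_rate(Segment("X", "Y", "steel"))
    print(wall_loss_over(seg, 2.0), seg.metadata)
```

If a measured corrosion rate later becomes available for segment X, only the stored value and its meta-data entry change; the risk algorithm itself is untouched.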
Reductionism can mean either (a) an approach to understanding the nature of complex things by reducing them to the interactions of their parts, or to simpler or more fundamental things or (b) a philosophical position that a complex system is nothing but the sum of its parts, and that an account of it can be reduced to accounts of individual constituents. This can be said of objects, phenomena, explanations, theories, and meanings.
Reductionism strongly reflects a certain perspective on causality. In a reductionist framework, phenomena that can be explained completely in terms of relations between other, more fundamental phenomena are called epiphenomena. Often there is an implication that the epiphenomenon exerts no causal agency on the fundamental phenomena that explain it.
Reductionism does not preclude the existence of what might be called emergent phenomena, but it does imply the ability to understand those phenomena completely in terms of the processes from which they are composed. This reductionist understanding is very different from that usually implied by the term ‘emergence’, which typically intends that what emerges is more than the sum of the processes from which it emerges.
Future inspections offer an intervention opportunity in a chain of events that would otherwise lead to a failure. Current PoF estimates are for short periods into the future: the next 1 to 3 years, assuming conditions remain the same. Under S1, inspection opportunities for the mostly buried pipe play a negligible role in risk estimates. Under S2, the benefits of inspection would not be realized until potential degradation mechanisms had advanced to the point where their effects could be detected and corrected. This would be beyond the period for which risk estimates are currently being produced. The effect of time passage on risk can be modeled using the same data and tools used in this assessment, but a risk-over-time analysis has not yet been conducted.
Pipeline Risk Assessment—Problems with Weightings
The use of ‘weightings’ is the first target of this critical review of current common practice. Weightings are most commonly used to steer risk assessment results towards pre-determined outcomes. They are usually based on historical incident statistics—“20% of pipeline failures from external corrosion”; “30% from third party damage”; etc. Implicit in the use of weightings is the assumption of a predictable distribution of future incidents and, most often, an accompanying assumption that the future distribution will exactly track the past distribution. This practice introduces a bias that will almost always lead to very wrong conclusions for some pipeline segments.
The first problem with the use of weightings is the underlying assumption that the past will reliably predict the future. Which past experience best represents the pipeline being assessed? Using national statistics means including many pipelines with vastly different characteristics from the system you are assessing. Using pipeline-specific statistics means (hopefully) a very limited amount of data. What about historical changes in maintenance, inspection, and operation? Shouldn’t those influence which data sets are most representative of future expectations? Answering such questions forces unnecessary subjectivity into the analysis.
As an example of the additional folly of applying weightings, let’s look at a sample application. Consider geotechnical threats such as landslides, erosion, or subsidence. An assumed distribution of failure mechanisms will almost certainly assign a very low weighting to this failure mechanism, since most pipelines are not significantly threatened by these phenomena and, hence, incidents are rare. But a geotechnical threat is a very real, although often very localized, effect on some pipeline segments. So, while the assumed distribution is valid in the aggregate, there will be locations along some pipelines where the pre-set distribution is very wrong. It would not at all be representative of the dominant failure mechanism at work there. The weightings will often completely mask the real threat at such locations.
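The masking effect can be illustrated with a small sketch. The numbers are made up: the 20% and 30% weightings echo the examples quoted earlier, while the 2% geotechnical weighting, the catch-all remainder, and the 0-to-1 threat levels for a hypothetical segment on an unstable slope are assumptions chosen only to show the mechanics.

```python
from typing import Dict

# Assumed weightings based on an aggregate historical distribution.
# Only the 0.20 and 0.30 figures echo the text; the others are hypothetical.
WEIGHTS: Dict[str, float] = {
    "external_corrosion": 0.20,
    "third_party_damage": 0.30,
    "geotechnical": 0.02,
    "all_other": 0.48,
}


def weighted_contributions(threat_levels: Dict[str, float],
                           weights: Dict[str, float] = WEIGHTS) -> Dict[str, float]:
    """Scale each location-specific threat level by its pre-set weighting."""
    return {threat: weights[threat] * level for threat, level in threat_levels.items()}


def dominant(values: Dict[str, float]) -> str:
    """Return the threat with the largest value."""
    return max(values, key=values.get)


# Hypothetical segment on a landslide-prone slope: geotechnical threat is high
# locally, and everything else is low.
slope_segment = {
    "external_corrosion": 0.1,
    "third_party_damage": 0.1,
    "geotechnical": 0.9,
    "all_other": 0.1,
}

if __name__ == "__main__":
    print("dominant threat, unweighted:", dominant(slope_segment))
    print("dominant threat, weighted:  ", dominant(weighted_contributions(slope_segment)))
```

With these assumed inputs, the unweighted view identifies the geotechnical threat as dominant on this segment, while the weighted view ranks it last of the four. The weighting, not the local evidence, decides the outcome.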
This is a classic difficulty in moving between behaviors of statistical populations and individual behaviors. The former is reliably predictable—witness insurance actuarial analyses—but the latter is not. Both ‘individual segment’ and ‘population of all segments’ estimates are useful, but for different intended uses.
So, apologies for a somewhat critical tone in this first installment. It is solely in the interest of stimulating an awakening of sorts. In modern pipeline risk assessment, the ‘perfect storm’ or ‘needle in the haystack’ analogies are appropriate. The more obvious threats are well known and are likely already receiving a lot of resources from prudent operators. It is the more obscure and the ‘improbable chain of events’ scenarios that will escape detection unless sufficient rigor is applied to the assessment. As a first step in gaining such rigor, question the use of weightings.
Published June 2013