It is evident from Figure 1 that the data is strongly right skewed and spans several orders of magnitude, from just a few barrels to several thousand. This is typical of log-normally distributed data, meaning that the log of the data is normally distributed. The same data was therefore replotted as log values.
You can see that the data takes on more of the familiar bell shape of the normal distribution, but it still isn’t a very good fit. It is logical that spill volume would be proportional to the diameter (bigger pipe = bigger spill), and more likely proportional to the square of the diameter, since pipeline flow is related to flow area.
When the spill volume is divided by the square of the diameter, the fit of the log values to the normal distribution improves even more. Because of this fit, we can now predict a likely spill volume from the diameter for any arbitrary probability. For example, if you wanted to know the typical (median) spill for a 24" diameter line, you can work backwards from the location of the distribution at P50, put 24 in for the diameter, and solve for the volume. The table below gives the likely spill volumes for P50 and P95 events at various diameters.
D (in) | P50 (bbl) | P95 (bbl) |
---|---|---|
3.500 | 2 | 360 |
4.500 | 3 | 595 |
6.625 | 7 | 1290 |
8.625 | 12 | 2187 |
10.750 | 19 | 3397 |
12.750 | 26 | 4778 |
16.000 | 42 | 7524 |
18.000 | 53 | 9523 |
20.000 | 65 | 11757 |
24.000 | 93 | 16930 |
30.000 | 146 | 26453 |
36.000 | 210 | 38092 |
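
As a rough sketch of how the table values follow from the fitted distribution: if ln(V/D²) is normal with mean μ and standard deviation σ, the volume at probability p is V = D² · exp(μ + σ · z_p), where z_p is the standard normal quantile. The fitted parameters are not given in the text, so the constants below are assumptions backed out from the table itself (P50/D² of about 0.1622 and P95/D² of about 29.39, with V in barrels and D in inches), and `spill_volume` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

# ASSUMED constants, backed out from the table above rather than taken
# from the original fit: V in barrels, D in inches.
P50_RATIO = 0.1622   # approximate P50 / D^2 across the table rows
P95_RATIO = 29.39    # approximate P95 / D^2 across the table rows

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

# Parameters of the normal distribution of ln(V / D^2).
MU = math.log(P50_RATIO)
SIGMA = (math.log(P95_RATIO) - MU) / STD_NORMAL.inv_cdf(0.95)


def spill_volume(diameter_in, prob):
    """Spill volume (bbl) not exceeded with probability `prob`."""
    z = STD_NORMAL.inv_cdf(prob)
    return diameter_in ** 2 * math.exp(MU + SIGMA * z)


for d in (8.625, 24.0, 36.0):
    p50 = spill_volume(d, 0.50)
    p95 = spill_volume(d, 0.95)
    print(f"D = {d:6.3f} in   P50 = {p50:8.0f} bbl   P95 = {p95:8.0f} bbl")
```

With these assumed constants the function approximately reproduces the table rows, to within a barrel or two of rounding. The implied σ is about 3.16 in natural-log units, which is why the P95 volume is roughly 180 times the P50 volume at every diameter.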
Because the data is log-normally distributed, small changes on the log scale amount to significant differences overall. The difference between a P50 and a P95 event ranges from a few hundred barrels for the smallest lines to tens of thousands of barrels for the largest. Figure 4 shows the spill data broken out by cause. The facet with no label indicates that the cause was not supplied. The curve in each facet is the distribution of the data as a whole, included to show whether an individual cause differs from the overall data. On a per-cause basis, it appears that excavation and natural force damage have much higher spill volumes.
Figure 5 shows the data broken out by upstream valve type. On a cursory look at the plots, there is no evidence that automatic and remotely controlled valves reduce the magnitude of a spill. In fact, the spill volumes for remotely controlled valves appear to be noticeably worse than the distribution of the data as a whole.
Figure 6 shows the data broken out by leak type. There is no apparent difference in spill volume between a crack and a pinhole. One possible reason for the lack of difference is that a pinhole is likely to leak much longer before detection and is unlikely to be picked up by mass balance leak detection. The “other” category is a mix of issues, so it’s hard to compare against anything, and the blank facet indicates that no information was supplied.