Most utilities would agree that a policy of run-to-failure is not a prudent strategy for managing distribution assets, in light of its anticipated impact on system reliability and safety. Yet, many utilities have defaulted to this policy by failing to initiate a program of preemptive infrastructure replacement. In some cases, this has resulted in criticism and even unwanted involvement from regulators.

What has kept many utilities from investing as much as they feel they should in infrastructure replacement is their inability to clearly predict the payback. Understandably, a “Let's see what this does” philosophy is hardly a compelling business case for a multimillion-dollar investment. Southern California Edison (SCE; Rosemead, California, U.S.) has developed a method for performing long-term probabilistic forecasts of the reliability of its distribution system as a function of investments in preemptive infrastructure replacement. Simply put, it's a method to calculate the payback.


Despite routine, often regulated equipment-inspection programs, the reliability of most electric distribution systems is dominated by in-service equipment failures. Furthermore, observations of the system average interruption duration index (SAIDI) and system average interruption frequency index (SAIFI) indicate that this dominance is growing for many utilities as failures of aging equipment rise each year.

This is not surprising. All components of the distribution infrastructure are getting older. The population of underground cable at SCE is aging despite the fact that the utility is adding new cable to the population each year to support customer and load growth. This aging trend is evident in all of SCE's major infrastructure components. When this is coupled with the well-known “bathtub curve,” which tells us that the probability of failure increases with age, the inevitable conclusion is that the challenge posed to system reliability by infrastructure aging is only going to get worse.


At SCE, underground cable clearly represents the most significant infrastructure-aging challenge. Therefore, while data also existed for switches and transformers, SCE chose cable as the subject of its detailed probabilistic analysis of system reliability. (Note that while the balance of this article deals with cable, the methods described here are applicable to all other significant infrastructure components.)

In its analysis, SCE sought to answer three questions:


  1. Where is cable on its approach to its long-term steady-state replacement rate? (See sidebar.)

  2. With no preemptive cable-replacement program, how would future cable failures impact system reliability?

  3. How would future system reliability be affected by various levels of preemptive cable replacement?

Like most utilities, SCE has an equipment database designed for engineering and maintenance use that contains detailed technical information on most major distribution equipment currently in service. Unfortunately, as at most utilities, cable is not included in the equipment database.

At SCE, the only significant source of data on cable is the capital property accounting record. Unfortunately, this database records changes in capital assets in relatively broad accounts. In the case of cable, inventories of all medium-voltage cable are recorded under one account, with no breakdown by conductor size or insulation rating. However, this database records how much cable was installed in any given year and tracks how much of that original installation was subsequently removed and when. While not everything an engineer might wish for is included in the capital property database, some reasonable assumptions by SCE made this data extremely useful. Since most of the primary cable installed in any given year was purchased under the same specification, the cable insulation type could be inferred. Therefore, the data enabled SCE to assess the current inventory of primary distribution cable by year of installation and by insulation type.


While SCE's engineering database includes the reasons for equipment removal (failure, overload, corrosion and idle), the capital property database does not. However, because cable is rarely removed for reasons other than failure, the assumption that all past cable removals were the result of in-service failure was considered reasonable. Nevertheless, with a data system not originally designed to facilitate engineering use, SCE opted to blend industry data with its own in-house data. Using all of that data, SCE then developed cable-reliability models for its various types of cable.

The data allowed SCE to answer the first of its three questions, listed previously, by enabling the calculation of the mean time to failure (MTTF). Assuming an MTTF of 34 years for tree-retardant cross-linked polyethylene (TR-XLPE) cable and an inventory of 46,000 miles (74,030 km), SCE would expect a long-term steady-state replacement rate of 1400 miles (2253 km) per year, a significant increase over the approximately 300 miles (483 km) currently being replaced annually.
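The arithmetic behind that steady-state figure can be checked in a few lines. The sketch below uses only the inventory and MTTF quoted above; the function name is hypothetical and is not taken from SCE's tools.

```python
def steady_state_replacement_rate(inventory_miles: float, mttf_years: float) -> float:
    """Long-term average replacement rate: population divided by mean time to failure."""
    if mttf_years <= 0:
        raise ValueError("MTTF must be positive")
    return inventory_miles / mttf_years


# Figures quoted in the article: 46,000 miles of cable, MTTF of 34 years.
rate = steady_state_replacement_rate(46_000, 34)
print(f"{rate:.0f} miles/year")  # roughly 1350, which the article rounds to 1400
```

The result is an average over the long run; it says nothing about how quickly the system approaches that plateau.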


SCE is outlining plans to develop an engineering database for cable that would allow it to better correlate cable reliability with characteristics other than age, such as voltage class, cable size, physical environment and cable loading. However, this is a long-term objective.

Considering that failures are a function of inventory, SCE's next step in its analysis was to ascertain how much cable would fail and be replaced in future years. This needed to be done one future year at a time. For instance, to determine how much cable will fail in 2009, the starting point is the inventory in 2008. For each vintage (year of installation) of cable in service in 2008, the number of failures expected in 2009 can be determined by multiplying the volume of cable of that vintage by the probability of failure of that vintage. The sum of all the failures expected from each vintage yields the total number of failures expected in 2009.
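The per-vintage sum described above might be sketched as follows. The failure-probability curve and the inventory figures are placeholders invented for illustration; SCE's actual cable-reliability models are not given in the article.

```python
def annual_failure_probability(age_years: int) -> float:
    """Hypothetical age-dependent failure probability (illustrative only)."""
    return min(1.0, 0.001 * age_years)


def expected_failures(inventory_by_vintage: dict[int, float], forecast_year: int) -> float:
    """Sum, over all vintages, of cable volume times probability of failure."""
    total = 0.0
    for vintage, miles in inventory_by_vintage.items():
        age = forecast_year - vintage
        total += miles * annual_failure_probability(age)
    return total


# Made-up 2008 inventory: miles of cable in service, keyed by year of installation.
inventory_2008 = {1968: 1000.0, 1988: 2000.0, 2008: 500.0}
failed_2009 = expected_failures(inventory_2008, 2009)
print(f"Expected miles failing in 2009: {failed_2009:.1f}")
```

Any other age-to-probability curve could be substituted for `annual_failure_probability` without changing the summation.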


However, the inventory in any future year is a function of prior failures. To calculate the number of failures in 2010, for example, the starting point is now the inventory of cable in 2009 (that is, the year before the forecast year). The inventory of cable in 2009 must be derived from the inventory of cable in 2008, adjusted to reflect that every cable segment failing in 2009 is replaced with new cable (whose age is now 0 years). The inventory of each vintage therefore decreases each year by the amount of cable that failed. However, the total system inventory of cable remains constant, due to the simplifying assumption of zero customer growth.
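One way to picture this year-by-year bookkeeping is the sketch below. It assumes a made-up failure-probability curve and inventory; only the mechanics (failed cable leaves its vintage and reappears as new cable, so total inventory stays constant) follow the description above.

```python
def annual_failure_probability(age_years: int) -> float:
    """Hypothetical age-dependent failure probability (illustrative only)."""
    return min(1.0, 0.001 * age_years)


def roll_forward(inventory: dict[int, float], year: int) -> tuple[dict[int, float], float]:
    """Return (inventory at end of `year`, miles failed in `year`).

    `inventory` is the prior year's inventory, keyed by vintage.
    """
    next_inventory: dict[int, float] = {}
    failed_total = 0.0
    for vintage, miles in inventory.items():
        failed = miles * annual_failure_probability(year - vintage)
        failed_total += failed
        next_inventory[vintage] = miles - failed
    # Failed cable is replaced with new cable of the current vintage (age 0).
    next_inventory[year] = next_inventory.get(year, 0.0) + failed_total
    return next_inventory, failed_total


# Made-up 2008 inventory (miles by year of installation).
inventory = {1968: 1000.0, 1988: 2000.0, 2008: 500.0}
for year in (2009, 2010):
    inventory, failed = roll_forward(inventory, year)
    print(year, f"failed={failed:.1f}", f"total={sum(inventory.values()):.1f}")
```

The printed total should not change from year to year, which is the zero-growth assumption at work.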

SCE had to review its distribution-system historical outage records to translate the number of in-service cable failures into system-level reliability indicators (SAIDI and SAIFI). Based on historical records, SCE estimated the average impact of a single cable failure on SAIDI to be about 0.0208 minutes. This number takes into account the amount of mainline cable versus radial cable and the different ways in which their failures affect a circuit.

The impact of cable failures on SAIDI in any future year is simply the forecast number of cable failures multiplied by the estimated reliability impact per failure. The result shows that cable failures would cause SCE's SAIDI to increase from its current 19 minutes to about 72 minutes over the next 25 years, assuming a 3% growth rate.
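As a rough illustration of that multiplication, the sketch below applies the 0.0208-minute figure to a set of failure counts. The counts are invented for the example and are not SCE forecasts.

```python
SAIDI_MINUTES_PER_FAILURE = 0.0208  # SCE's estimate quoted in the article


def saidi_contribution(forecast_failures: float) -> float:
    """SAIDI minutes attributable to cable failures in one year."""
    return forecast_failures * SAIDI_MINUTES_PER_FAILURE


# Hypothetical failure counts by year, for illustration only.
hypothetical_failures_by_year = {2009: 900, 2010: 950, 2011: 1000}
saidi_by_year = {
    year: saidi_contribution(count)
    for year, count in hypothetical_failures_by_year.items()
}
for year, minutes in saidi_by_year.items():
    print(year, f"{minutes:.2f} min")
```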


SCE's final step in its analysis was to assess the impact of preemptive cable replacements. This was done by selecting a volume of cable, say 100 miles (161 km), for evaluation that could be preemptively replaced annually as part of a long-term program. This was reflected in the cable-inventory calculations by reducing in each year the inventory of the oldest (most unreliable) cable by 100 miles and increasing the inventory of the newest (zero years) cable by 100 miles. After this was done, the amount of cable removed due to in-service failure was calculated for that year. This was repeated for each subsequent year. Obviously, with smaller inventories of the least-reliable cable, fewer in-service failures would occur. SCE performed this recalculation several times to show the impacts of various levels — 200 miles/year (322 km/year), 300 miles/year (483 km/year) and so on — of preemptive cable replacement.
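The replacement scenarios can be sketched by adding one step to the yearly inventory update: move a fixed mileage from the oldest vintages to the current year before computing failures. Everything numeric here (the failure-probability curve, the starting inventory) is an assumption made for illustration, and the function names are hypothetical.

```python
def annual_failure_probability(age_years: int) -> float:
    """Hypothetical age-dependent failure probability (illustrative only)."""
    return min(1.0, 0.001 * age_years)


def replace_oldest(inventory: dict[int, float], year: int, miles: float) -> dict[int, float]:
    """Move `miles` of cable from the oldest vintages to the current-year vintage."""
    updated = dict(inventory)
    remaining = miles
    for vintage in sorted(updated):  # oldest first
        if remaining <= 0:
            break
        taken = min(updated[vintage], remaining)
        updated[vintage] -= taken
        remaining -= taken
    updated[year] = updated.get(year, 0.0) + (miles - remaining)
    return updated


def forecast_failures(inventory: dict[int, float], start_year: int,
                      years: int, preemptive_miles: float) -> list[float]:
    """Miles failing in service each year under a given preemptive program."""
    inv = dict(inventory)
    failures = []
    for year in range(start_year, start_year + years):
        inv = replace_oldest(inv, year, preemptive_miles)
        failed_total = 0.0
        for vintage in list(inv):
            failed = inv[vintage] * annual_failure_probability(year - vintage)
            inv[vintage] -= failed
            failed_total += failed
        inv[year] += failed_total  # failed cable returns as new cable
        failures.append(failed_total)
    return failures


# Made-up 2008 inventory (miles by year of installation).
inventory_2008 = {1968: 1000.0, 1988: 2000.0, 2008: 500.0}
for level in (0.0, 100.0, 200.0):
    total = sum(forecast_failures(inventory_2008, 2009, 25, level))
    print(f"{level:.0f} miles/year preemptive: {total:.0f} miles failed over 25 years")
```

Running the forecast at several replacement levels gives the kind of side-by-side comparison the article describes, though the magnitudes here mean nothing outside the toy inputs.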

There are several conclusions SCE drew from the various calculations. The first was that the potential increase in SAIDI from future cable failures would not be trivial. Adding 50 minutes of SAIDI would constitute a significant reliability reduction.

A second observation was that even with large volumes of preemptive cable replacement (i.e., 800 miles/year [1287 km/year]), reliability due to cable failures would still decline below its current level.

Finally, SCE concluded that system reliability has tremendous inertia. It is not quickly turned. The difference in SAIDI among various levels of cable replacement would not be dramatic in the earliest years of a program. It could easily be lost in the “noise” of weather-related interruptions or the randomness of failures. Those expecting to see immediate responses in reliability following even very large investments in infrastructure replacement would probably be disappointed. It would likely be only after several years that the payback of an infrastructure replacement program would be evident.


These methods and analyses reflect SCE's initial efforts in correlating system reliability with preemptive replacements of infrastructure. Follow-up analyses to extend forecasts out to 100 years are already underway. Despite their reliance on simplifying assumptions and less-than-perfect data, the forecasts are reasonable and helpful. They provide objective, quantitative and transparent evidence not only of the reality of infrastructure aging, but also of the options available for meeting its challenge and the associated costs.

There are many unanswered questions: What is the value of reliability? How much preemptive replacement is too much? What would be the value of more inspections, testing or circuit design changes?

However, this is a significant step in the right direction. Asset and reliability management should be quantitative and risk based. Now they can be. The payback of large long-term investments in infrastructure replacement can now be demonstrated by analyses that can be reviewed, critiqued and enhanced.

Zoilo S. Roldan is a senior engineer in SCE's T&D department, where he assesses and manages distribution system reliability through risk-informed infrastructure replacement. Earlier, Roldan had been part of the Nuclear Safety Group at the San Onofre Nuclear Generating Station performing outage risk assessment. Prior to coming to SCE, he worked as a startup engineer with Bechtel Power Corp. Roldan holds a BSEE degree and a California PE license.

Shan (Sam) H. Chien is a senior engineer in SCE's T&D department, where he performs system reliability analysis and develops equipment failure probability. Prior to joining the T&D industry, Chien was a probabilistic risk assessment engineer at the San Onofre Nuclear Generating Station, having gained experience at Pickard, Lowe, and Garrick, and Bechtel Power Corp. Chien received his Ph.D. degree in nuclear engineering from UCLA and holds a PE license in California.

Roger J. Lee is the manager of asset management and system reliability in SCE's T&D department, where he oversees efforts to assess and manage distribution system reliability through risk-informed infrastructure replacements and design enhancements. Prior to coming to the T&D industry, Lee worked at SCE's San Onofre Nuclear Generating Station, where he managed probabilistic risk assessments, design reviews and licensing projects. Lee holds a BSME degree from the University of California at Irvine and a California PE license.


While obvious to many, the concept of a long-term steady-state replacement rate is so critical that it must at least be mentioned. By way of illustration, distribution infrastructure can be likened to a population of lightbulbs.

Imagine a factory being refurbished overnight with 10,000 new lightbulbs. In the following days, the maintenance department would likely find a few burned-out bulbs, as the break-in period of the “bathtub curve” would predict. But after that, and for the next several weeks, since the typical incandescent lightbulb has an average service life of about 2000 hours (or about 85 days), fewer bulbs would need to be replaced. Eventually, however, the number of bulbs needing to be replaced each day would slowly trend up. Ultimately, the replacement rate of lightbulbs would level off at an average rate of about 120 bulbs/day.

This plateau is the long-term steady-state replacement rate and is the average rate at which the factory would be replacing bulbs indefinitely. This rate is calculated by dividing the population — in this case, 10,000 bulbs — by the average service life, or the mean time to failure (MTTF) — 85 days. Replacing an average of 120 bulbs/day would be an unavoidable part of the cost of operating the factory. The only way to reduce this replacement rate would be to either reduce the total number of bulbs in the factory below 10,000 or increase the MTTF of the bulbs beyond 85 days.
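The plateau can also be seen numerically. The sketch below solves a discrete renewal equation for the expected number of bulbs replaced each day. The lifetime distribution (uniform between 65 and 105 days, mean 85 days) is purely an assumption chosen for simplicity; it ignores break-in failures, so it does not reproduce the early part of the bathtub curve.

```python
def expected_daily_replacements(population: float,
                                lifetime_pmf: dict[int, float],
                                days: int) -> list[float]:
    """Expected replacements on days 1..days.

    lifetime_pmf[k] is the probability that a bulb lasts exactly k days (k >= 1).
    """
    replacements = [0.0] * (days + 1)  # index 0 is installation day
    for t in range(1, days + 1):
        # Original bulbs burning out on day t.
        total = population * lifetime_pmf.get(t, 0.0)
        # Replacement bulbs installed on day s that last exactly t - s days.
        for life, prob in lifetime_pmf.items():
            s = t - life
            if s >= 1:
                total += replacements[s] * prob
        replacements[t] = total
    return replacements[1:]


# Assumed lifetime distribution: uniform on 65..105 days, so the mean is 85 days.
pmf = {k: 1 / 41 for k in range(65, 106)}
daily = expected_daily_replacements(10_000, pmf, 3000)
plateau = sum(daily[-85:]) / 85
print(f"Late-period average: {plateau:.1f} bulbs/day (population / MTTF = {10_000 / 85:.1f})")
```

Under this assumed distribution the daily rate arrives in waves that gradually flatten toward population divided by MTTF; a wider lifetime distribution would flatten sooner.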

Each type of electric distribution infrastructure component represents a different factory-lightbulb scenario. As shown in the graph, different types of distribution components in Southern California Edison's (SCE; Rosemead, California, U.S.) system are at different points in their approach to the long-term steady-state replacement rate plateau. The good news is that SCE expects to see its replacement rate of overhead transformers ultimately increasing by only about 10% (based on a simplifying assumption of no customer growth). The bad news is that SCE expects to see its replacement rate of cable ultimately increasing by roughly 400%.

SCE is typical of many large utilities whose significant growth began in the early 1950s and that began to install large amounts of underground cable in the early 1960s. The point to be made here is that the future challenge posed by infrastructure aging can range from a modest increase in capital requirements for some types of components to a potentially serious financial problem for others.

The first step every utility should take in dealing with its aging infrastructure is to understand where the current replacement rate for each significant infrastructure component is in its approach to its long-term steady-state replacement rate. Without this long view, no utility can appreciate the magnitude of the challenges ahead.


Developing a 25-year prediction of reliability to support investment in infrastructure replacement required a little innovation and a fair amount of skill in probabilistic analysis. Fortunately, authors Zoilo Roldan, Sam Chien and Roger Lee exercised both of these in their years at Southern California Edison's (SCE) San Onofre Nuclear Generating Station performing probabilistic risk assessment (PRA). San Onofre was one of the first nuclear plants to use PRA methods in managing risks during the vulnerable periods of refueling (“Managing Outage Risk,” Nuclear News, June 1992, Vol. 35, No. 8). San Onofre was also the first U.S. nuclear plant to develop a tool for recalculating the plant PRA in real time as equipment status changed. For this, San Onofre received the “Top Industry Practice” award in 1996 from the Nuclear Strategic Planning Conference.

PRA is a technology used to calculate the probability of nuclear accidents by systematically analyzing all of the possible equipment responses to a hypothetical initiating event. Not only does PRA provide the bottom line of nuclear plant risk, but it also provides insights into how important each constituent system, component and procedure really is to the bottom line, or isn't. With PRA's comprehensive view, systematic process and quantitative nature, its insights are sometimes surprising and counterintuitive.

The history of PRA in the U.S. nuclear power industry contains many examples of significant but low-cost enhancements in systems and procedures, and also of cost savings from reductions in equipment, maintenance and regulations where PRA had demonstrated low benefit/cost. “PRA gives us a quantitative best-estimate forecast of what we will get for our investment,” says Roger Lee, manager of asset management and system reliability in SCE's T&D department and former manager of the PRA group at San Onofre. “And it presents the analyses in a way that is easily understood by an independent reviewer.”

If these capabilities of PRA sound too good to be limited to use in a nuclear power plant, Roldan, Chien and Lee would agree. They hope to make sure that SCE's T&D organization is able to take full advantage of the power of PRA.