
Using prediction theory for maintenance

K.C. Lam, of the Department of Building Services Engineering, The Hong Kong Polytechnic University, puts forward further considerations for a maintenance management philosophy designed to avoid failure and improve the availability of engineering services. In the September 2006 edition of Health Estate Journal, he examined the assessment and management of risk; now he focuses on prediction theory.

When building services maintenance solely relies on the experience of the individual engineer, results are often not wholly satisfactory. There are high risks in building services availability. Poor maintenance is a risk.

Risk must be assessed and managed before a design is fully implemented. Prediction theories (e.g. reliability evaluation of engineering systems) have not yet been widely applied in the building services industry. As a result, operation and maintenance are still not as cost-effective or reliable as they could be. As high environmental performance and users’ expectations have led to increasingly complex systems, the application of risk assessment, prediction theories and Business-Centred Maintenance (BCM) should result in improved quality of building services operation and maintenance. The discussions in this article should be of great help to both designers and maintenance engineers.

Maintainability is an inherent characteristic of system or product design. It pertains to the ease, accuracy, safety, and economy in the performance of maintenance actions. Hence, a building services system (or product component or sub-system) should be designed such that it can be maintained without large investments of time. Maintenance should be at the least cost, with the minimum impact on the environment, and with the minimum expenditure of resources (e.g. personnel, materials, facilities, and test equipment).

Maintainability, as a characteristic of design, requires the consideration of many different aspects involving system design, performance characteristics, reliability, human factors, safety, logistics, quality, reconfigurability, flexibility, testability, producibility, disposability, environmental considerations and economic factors. An easily maintainable system may be more costly in its initial procurement, but it will reduce the follow-on sustaining maintenance and support costs.

Nonetheless, if an equipment item is packaged with complex components, inadequate accessibility provisions, and sophisticated built-in test and condition monitoring devices, then the complexity is increased. The failure potential is greater and, therefore, the reliability of the system is lower. During the use phase, a highly maintainable system can be repaired rapidly with a minimum of expenditure and supporting resources, without causing detrimental effects on the environment, and without inducing additional faults in the process. (Sometimes it is better to exchange items in lieu of repair, if the system is packaged with interchangeable components.)

In any event, incorporating reliability and maintainability characteristics into a design leads to a reduction in overall lifecycle cost.

Review of terms/factors
For completeness, a brief introduction is given to some important terms which are needed for system design and planning of maintenance.

A system design with maintenance analysis should provide useful information such as the identification of problem areas; the causes of failure; the most critical components from an operational/safety viewpoint; the vulnerability of a system; the adequacy of the design; and the right planned maintenance and replacement strategies. Most importantly, from the analysis it can be determined which components/sub-systems should have a particularly high reliability, be backed up by redundant components, be derated, be provided with special designs with other components/sub-systems or be subject to special maintenance procedures. All in all, the use of availability and reliability analyses allows suitable design and maintainability to be studied in detail. In addition, the designer can ensure that the different services are integrated in a logical manner as a total system, thus providing the desired uptime for the total system.

Maintainability requires the consideration of many different factors, involving all aspects of a system (e.g. conceptual design, system design, production, installation, T&C, system operations as well as system retirement and life-cycle cost), and the measures of maintainability often include a combination of the following:

Maintainability (M)
The word “maintainability” refers to the concept of being able easily to maintain/restore an item in/to a state in which it can perform its required functions. Maintainability is another measure of quality, and is normally expressed as mean time to repair (MTTR).

Mean Time To Repair (MTTR)
This is a measure of the duration of repair time.

Availability (A) under steady state
The word “availability” refers to the concept of whether an item will be ready when it is needed. Availability is determined by reliability, maintainability, logistics, and administrative policy. The concept can be expressed as the ratio of uptime to the total time of intended operation. Specifically, uptime is the amount of time that the system is usable, and the total time is the amount of time that the system is needed. Downtime is the time during which an element cannot perform. Total time is the sum of uptime and downtime, as shown in the following equation.

$$A = \frac{\text{uptime}}{\text{uptime} + \text{downtime}}$$
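As a minimal sketch of the ratio above, the following Python function computes steady-state availability from uptime and downtime. The function name and the sample hours are illustrative assumptions and are not figures taken from the article.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Steady-state availability: uptime / (uptime + downtime)."""
    total_hours = uptime_hours + downtime_hours
    if total_hours <= 0:
        raise ValueError("total time must be positive")
    return uptime_hours / total_hours


# Hypothetical figures: 4,300 usable hours out of 4,380 hours of intended operation.
example_availability = availability(4300.0, 80.0)
print(f"A = {example_availability:.4f}")
```

Any consistent time unit can be used, since the ratio is dimensionless.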

N.B. In designing for reliability the operational availability concept should be used (see use of reliability calculations).

Reliability (R)
The word “reliability” refers to the concept of being able to depend on something. Hence, it is defined as the probability of an item performing its required function for a stated interval under stated conditions. Reliability is normally expressed as a probability. For example, a device that fails randomly in time but once a year on average will have a probability of failing (PF) in any one particular week of 1/52, i.e. PF = 0.019. Conversely, the probability of success (PS), i.e. not failing, is 51/52 = 0.981, which is the same as 1 – PF. Mathematically these expressions can be shown as $P_S = e^{-t/T}$ and $P_F = 1 - e^{-t/T}$, where t = the time interval during which success is required and T = mean time between failure of the device. (N.B. There are many other distributions; for example, the Weibull and lognormal distributions can also be used to express time to failure.) System reliability is improved if each element is as reliable as possible and if simple layouts using few different parts are planned.
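The weekly example can be checked numerically. The sketch below assumes the exponential model quoted above, with t and T both expressed in weeks; the function names are illustrative only.

```python
import math


def prob_success(t: float, mtbf: float) -> float:
    """Probability of no failure over interval t, assuming an exponential model."""
    return math.exp(-t / mtbf)


def prob_failure(t: float, mtbf: float) -> float:
    """Probability of at least one failure over interval t."""
    return 1.0 - prob_success(t, mtbf)


# One failure per year on average: T = 52 weeks, interval of interest t = 1 week.
ps = prob_success(1.0, 52.0)
pf = prob_failure(1.0, 52.0)
print(f"PS = {ps:.3f}, PF = {pf:.3f}")  # PS = 0.981, PF = 0.019
```

For intervals that are short compared with T, the exponential form agrees closely with the simple 1/52 estimate used in the text.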

Mean Time Between Failure (MTBF)
This is a measure of the frequency of failure for a repairable system, defined as running time divided by the number of failures. The reliability of a large installation can be quoted as a single figure, but the MTBF or maximum failure rate should be quoted.

Mean Time to Failure (MTTF)
MTTF is used for non-repairable systems. This is the mean value of the probability density function. Therefore

$$\text{MTTF} = \int_0^{\infty} R(t)\,dt$$

If the mean failure rate $\lambda$ is constant, then $\text{MTTF} = 1/\lambda$.
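The constant-failure-rate result can be written out as a short derivation. It assumes the exponential reliability function $R(t) = e^{-\lambda t}$, where $\lambda$ denotes the constant failure rate:

```latex
\begin{aligned}
\text{MTTF} &= \int_0^{\infty} R(t)\,dt
             = \int_0^{\infty} e^{-\lambda t}\,dt \\
            &= \left[-\frac{1}{\lambda}\,e^{-\lambda t}\right]_0^{\infty}
             = 0 - \left(-\frac{1}{\lambda}\right)
             = \frac{1}{\lambda}
\end{aligned}
```

Under this assumption, the device that fails once a year on average in the reliability example has $\lambda = 1/52$ per week and an MTTF of 52 weeks.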

Use of reliability calculations

A product’s life can be approximated by the exponential distribution (see table on next page).

Influence of reliability

Reliability at least cost is the basis of the development of an asset strategy. The aim has been to reduce downtime, and, as a result, there will be dramatic improvement in availability.

Equipment reliability can be considered as the characteristic of design which results in durability of the equipment item or system. This item will then operate successfully for a particular duration of time. Like most evaluation techniques, there is a limitation to the prediction. The approach is based on historical data of general components’ failure rates. It can only be valid for the conditions under which the data was obtained. The application of the reliability methodology necessitates the availability of reliability data on HVAC equipment, but the information is not readily available and the data will inevitably be coupled with limitations. So the details of the life of plant and equipment cannot give very accurate predictions. Nonetheless, the information provided will (with or without merging several failure data to get a more reliable figure) still be useful for the preliminary prediction. Based on this information, an engineer is able to develop a maintenance plan or schedule. Once the maintenance engineer has obtained sufficient statistical data of the services system for one or two years, he can modify some of the predicted schedules for the actual systems and the maintenance strategy.

In practice, the general conditions for setting a maintenance schedule can be devised if data on the failure rates of the equipment components are known (see Example 1).

From the above calculation, in order to improve the performance of the pump, it is necessary to reduce downtime (i.e. to improve A) and to reduce the number of failures of the pump set assembly (i.e. to increase R). From this information, it may be judged whether the system should contain a standby pump.

The reliability calculation shown has implications for the way availability and maintainability should be designed as a series of items of equipment.

Frequency of function tests


The importance, or criticality, of a function will be reflected in the required level of availability. 100% A is not impossible, but many engineers do not strive for it. This highest level can only be met by the provision of separate item(s) serving the same function (i.e. redundancy). The result of increasing the availability of an item capable of a hidden failure is to ensure that downtimes are short. This is also achieved by function testing to reveal failures which may have occurred. The frequency of any function test can be estimated by:

$$\text{Time between tests} = -\,\text{MTBF} \times \ln(2A_v - 1)$$

where $A_v$ is the required availability.

The availability of an equipment function is directly related to the equipment reliability and the test interval. For instance, if a building services part requires 99% A and has an MTBF of 24 months, the time between tests should be:

$$-24 \times \ln(2 \times 0.99 - 1) = 0.48 \text{ months (about 2 weeks)}$$

Put another way, the time between tests is 2% of the MTBF for 99% A. With other A, the figures would be as shown in Table 1.
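The test-interval relation lends itself to a small helper. This sketch assumes the natural-logarithm form with a leading minus sign, which is what the worked 99% figure implies; the function name is illustrative.

```python
import math


def time_between_tests(mtbf: float, availability: float) -> float:
    """Function-test interval, in the same time unit as mtbf."""
    # The logarithm needs 2A - 1 > 0, so the availability must exceed 0.5.
    if not 0.5 < availability < 1.0:
        raise ValueError("availability must lie between 0.5 and 1 (exclusive)")
    return -mtbf * math.log(2.0 * availability - 1.0)


# The article's case: MTBF of 24 months, required availability of 99%.
interval_months = time_between_tests(24.0, 0.99)
print(f"Time between tests: {interval_months:.2f} months")
print(f"Fraction of MTBF:   {interval_months / 24.0:.1%}")
```

For the 99% case this evaluates to roughly 0.48 months, about 2% of the MTBF. Calling the function with other availabilities gives the corresponding intervals for a given MTBF.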

The function tests carried out at the various intervals confirm that the required availability is achievable. If the failure rate is known, it is also possible to provide a range of availability figures for given test intervals, allowing a choice of availabilities and easier management of maintenance tasks. As the time interval between the function tests extends, the level of confidence reduces. The item of equipment can then be duplicated or triplicated in order to increase the probability of ensuring function availability.

In practice, the periodicity of function tests is often determined from experience. Clearly, a function test interval based on calculation is a valuable technique, as it helps to eliminate ineffective and unsuitable maintenance.

Calculations for reliability


Risk in relation to design and maintenance can be assessed by using a simple scoring technique. However, a quantitative approach is often adopted by many designers and is developed to meet clients’ availability requirements by using reliability and maintainability data and modelling techniques (such as failure modes and effects analysis) to assess likely availability.

In practice, building services systems are combined in series and parallel. The system reliability can be calculated by the use of simple calculations: for N identical components, each of reliability $R_1$, a series system has $R = R_1^N$ and a parallel system has $R = 1 - (1 - R_1)^N$. Throughout the system design process, reliability models can therefore be used to assist in the accomplishment of reliability allocation, the evaluation of alternative configurations, and the accomplishment of reliability prediction. With this technique, risk assessment can be made much easier.
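The series and parallel relations can be sketched in a few lines of Python. The sketch assumes N identical, independent components; the component reliability of 0.9 is a hypothetical value chosen for illustration, not a figure from the article's tables.

```python
def series_reliability(r: float, n: int) -> float:
    """All n identical components must work: R = r ** n."""
    return r ** n


def parallel_reliability(r: float, n: int) -> float:
    """At least one of n identical components must work: R = 1 - (1 - r) ** n."""
    return 1.0 - (1.0 - r) ** n


# Hypothetical component reliability of 0.9, two components in each arrangement.
r_component = 0.9
print(f"Series:   {series_reliability(r_component, 2):.2f}")
print(f"Parallel: {parallel_reliability(r_component, 2):.2f}")
```

Adding components in series lowers system reliability, while adding them in parallel raises it, which is the trade-off behind the redundancy arrangements discussed next.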

The following example (based on the CIBSE Guide Section 9) shows the usefulness of the reliability calculations. For quick reference, the system reliability for various arrangements of plant based on parallel redundant systems is calculated (see Table 2). The equations listed above represent only a small part of the reliability methodology that can be used in systems design. However, the calculations should be of great use to designers in assessing system reliability. The benefit of 100% redundancy (Arrangement b) gives 0.99 reliability. Arrangement c still gives 0.97 reliability. Arrangements d and e have sharply lower system reliabilities at full load conditions (under partial conditions, the system reliabilities increase substantially as redundant capacity emerges).

Obviously, standby redundancy gives a higher value of MTBF and, therefore, higher availability. The point being made here is that as more plant is provided to enhance the reliability of the system, so the maintenance commitment is increased. Alternatively, it may be considered more appropriate to provide a more comprehensive maintenance scheme for the services, which eliminates both breakdowns and the need for additional standby support facilities.

Availability as a design tool


In designing and installing building services systems, the aim must be to ensure the systems provide the functions required by the users and the best value for money for the owner. It is fundamental in providing a solution to an engineering function that the designer can, quantitatively, determine whether a design solution will perform as required and can compare the relative merits of various alternative design solutions.

An example of a simple hot water supply system having two 100% parallel pumps should show the application of analysis and the way to increase the reliability.

Data
4,380 hours/year operation.
Failure rate of a pump is $30.259 \times 10^{-6}$ per hour.

Calculations

Reliability of a pump ($R_{P1}$) per annum = $e^{-(30.259 \times 10^{-6})(4380)}$ = 0.877

Reliability of the pumping circuit (parallel circuit with two pumps only and one of the pumps must be successful)

= $1 - (1 - R_{P1})(1 - R_{P2}) = 1 - (1 - 0.877) \times (1 - 0.877) = 0.9848 \approx 0.985$

Probability of failure for the pumping circuit

= 1 – 0.985 = 0.015

Downtime for the pumping circuit

= 0.015 x 4380 = 65.7 hours

To improve the reliability, a third pump can be used (in parallel circuit)

$R = 1 - (1 - R_{P1})(1 - R_{P2})(1 - R_{P3}) = 1 - (0.123 \times 0.123 \times 0.123) = 0.998139$

The probability of failure

= 1 – 0.998139 = 0.001861

The revised downtime will be 0.001861 x 4380 = 8.1512 hours.

It can be seen from this calculation that the reliability of the hypothetical hot water supply system can be improved by using one more pump. Furthermore, the downtime can be reduced from 65.7 to 8.15 hours. However, the drawbacks of adding a component include:

Increase in capital cost.

Increase in maintenance needs.

Increase in space, weight and power needs.

Hence, this system reliability must be evaluated against the cost of interruption to the hot water system. With this prediction method, a designer can make better analysis with regard to improved design and maintainability. The calculations provided will also enable the client to assess the suitability of the proposed system.
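The hot water example can be reproduced with a short script. This is a sketch under the article's own assumptions (exponential pump reliability, independent pumps in parallel, and downtime estimated as the probability of failure multiplied by annual operating hours); the function and constant names are illustrative. Because the script does not round intermediate values, its figures differ slightly from those in the text, which works from a pump reliability of 0.877.

```python
import math

FAILURE_RATE_PER_HOUR = 30.259e-6  # pump failure rate from the example data
OPERATING_HOURS = 4380.0           # annual operating hours from the example data


def pump_reliability(rate_per_hour: float, hours: float) -> float:
    """Reliability of a single pump over the period, assuming an exponential model."""
    return math.exp(-rate_per_hour * hours)


def parallel_reliability(r: float, n: int) -> float:
    """Reliability of n identical pumps in parallel when one is sufficient."""
    return 1.0 - (1.0 - r) ** n


def estimated_downtime(system_reliability: float, hours: float) -> float:
    """Downtime estimate used in the example: probability of failure x hours."""
    return (1.0 - system_reliability) * hours


r_pump = pump_reliability(FAILURE_RATE_PER_HOUR, OPERATING_HOURS)
downtime_by_pumps = {}
for pumps in (2, 3):
    r_system = parallel_reliability(r_pump, pumps)
    downtime_by_pumps[pumps] = estimated_downtime(r_system, OPERATING_HOURS)
    print(f"{pumps} pumps: R = {r_system:.4f}, "
          f"downtime = {downtime_by_pumps[pumps]:.1f} hours")
```

Extending the loop to further pump counts, or attaching a cost to each pump and to each hour of interruption, would support the cost comparison described above.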

All the examples given have demonstrated the use of prediction theories. Reliability calculations can be used to quantitatively compare different systems. By evaluating the reliability and costs of different solutions to a particular design, it is possible to iteratively develop a system which optimises the desired reliability, availability and cost.

Business-Centred Maintenance (BCM) is a systematic process used to determine what has to be accomplished to ensure that any physical facility is able to continuously meet its designed functions. BCM is an engineered process used to determine the maintenance requirements of any physical asset in its operating context by identifying the function of the asset, the causes of failures, and the effects of failures. BCM advocates condition-based maintenance and reassessing the system design. Most importantly, the maintenance process is based on a detailed study of risk, availability, reliability, maintainability and engineering economy.

BCM leads to a maintenance programme that focuses preventive maintenance on specific failure modes likely to occur. Any organisation can benefit from BCM.

Conclusions

For years maintenance was a craft learned through experience and rarely examined analytically. The trend towards hospital projects of ever-increasing size and complexity, together with greater building services performance requirements for the proper functioning of a hospital, has led to increasingly complex system design and maintenance. It is necessary to adopt more engineering-based decision support for building services design and maintenance.

The current approach to managing building services maintenance is mainly based on engineers’ experiences and the results are not the best possible – there is high risk in building services availability. It is fundamental in providing a solution to an engineering function that the designer can, quantitatively and qualitatively, determine whether a design solution will perform as required and can compare the relative merits of various alternative design solutions.

Maintenance risk assessment must be carried out in conjunction with the more objective and engineered prediction techniques given in this article. It is possible to provide better design and maintenance. This improved way of working should enhance the quality of building services.

Whereas reliability needs to be tackled at the design stage, availability of the building service is also dependent on the subsequent maintenance of installed plant. Hence, good design is the first line of defence against poor maintainability, and the management of maintenance after the completion of the physical installation is the final line of defence against inadequate functioning of a building.

This two-part article has provided an introduction to an integrated approach using maintenance risk assessment and prediction theories. This approach to the design of building services maintenance is not based on assumptions, experience, rules of thumb or fashionable trends. The techniques discussed should facilitate better building services design and maintenance.

Bibliography

BSRIA Guide (BG3), Business-Focused Maintenance, UK, 2004.

BSRIA Guide (BG9), Choosing building services, UK, 2004.

BSRIA Guide (AG1), Maintenance Programme Set-up, UK, 1998.

CIBSE Guide to ownership, operation and maintenance of building services, UK, 2002.

CIBSE Guide, Section A9 (Estimation of plant capacity), 1979.

Campbell J.D. and Jardine K.C., Maintenance Excellence, Marcel Dekker, USA, 2001.

Lam K.C., Enhanced Building Services Maintenance Planning with Prediction Theory, Joint Symposium on New Challenges in Building Services, Hong Kong, 2005.

Moubray J.M., Reliability-centred Maintenance – RCM II, Butterworth-Heinemann, Oxford, UK, 1997.

Robbins P., Managing risk in device engineering, Health Estate Journal, 27-32, June 2005.
