Risk Analysis and Reliability Improvement in Spark Ignition Engines Associated to Sporadic Failures
J. C. Salamani Polytechnic School, University of São Paulo, Brazil
G. F. M. de Souza Polytechnic School, University of São Paulo, Brazil
ABSTRACT: Internal combustion engines are sometimes subject to sporadic failures, which are very difficult to detect and to diagnose. Increasing system robustness against such sporadic failures, which affect exhaust gas emissions, engine performance and reliability, is fundamental to attain engine and vehicle reliability. A methodology based on risk and reliability analysis concepts is presented to identify the root cause of such sporadic failures. The sporadic failures to be analyzed are selected from a warranty report database considering the criticality of failure consequences. The method is applied to evaluate sporadic failures of a sub-compact vehicle platform commercialized in Brazil. The method allows identifying manufacturing and assembly processes of electro-electronic devices that are the main causes of some sporadic failures.
1 INTRODUCTION
Probabilistic risk analysis is a procedure to investigate complex systems considering the consequences of an undesirable phenomenon. Modeling allows understanding how different variables interrelate, besides identifying the contribution of each component to the risk and integrity of the whole set regarding an undesirable behavior. The following questions are part of the evaluation and are highly relevant (Aven 2008): i. What may go wrong and lead to a failure or undesired effect? ii. What is the probability of this event occurring? iii. If it occurs, what is the expected consequence? iv. Is it possible to evaluate the effects on health, safety, legislation, environment, cost and corporate image?
Special attention is given here to failure frequency versus failure consequence. The main conclusion is not the system failure probability, but rather which subsystem should receive the most attention in terms of optimization and operational deficiency, aiming at reducing failure consequences. The risk-based analysis of failures in electromechanical systems is useful to define the most critical components in view of the failure consequences for the end user. This analysis allows identifying failures that, even with a low probability of occurrence, have a great impact on equipment performance, deeply affecting the end user's opinion about the equipment.
A methodology based on risk and reliability analysis concepts is presented to identify the root cause of sporadic failures that may occur during the use of internal combustion engines in passenger cars. The methodology allows evaluating the current product design and manufacturing process situation, as well as failures that will potentially affect combustion efficiency, emissions and vehicle robustness. The objective is to increase the ignition system robustness by detecting sporadic failures and correcting them to create a more reliable power train system. The methodology supports continuous process improvement and minimizes risk in products currently in production. The method is applied to evaluate sporadic failures of a sub-compact vehicle platform commercialized in Brazil.
2 METHODOLOGY
Risk analysis usually begins by reviewing the goals, that is, the target one intends to and can reach when concluding the analysis. The conclusions can be unfolded into actions such as design improvement, definition of the acceptable risk level, compliance with legislation and regulatory standards, and support for lifecycle management. The method is based on concepts proposed by Guimea & Coffelt (2008), Bertsche et al. (2008), Kumamoto & Henley (1996) and Modarres (2003).
2.1 Surveying information
General and deep knowledge of the system as a whole should be part of the analysis. Factors such as designs, processes, controls, maintenance, test procedures and subsystem adequacy will certainly influence the final decision of the risk study. The aim is to prevent failures, minimizing damages and exposure to risk. In the information survey and collection, data should be analyzed regarding:
Emergency systems, safety and factors related to human failures.
Integration and interface among the subsystems and components should be identified and known.
Failures, events observed in the past, field history should be available.
Solid bases and foundation should be included at this stage of the study.
Still at that stage, the simultaneous support of designers, users and analysts should guide and determine the focus and object of the analysis. Conditions and failure modes to be included or discarded should be determined. The risk analysis is valid for the design at that specific moment and for the focus placed on a given risk.
2.2 Identifying initiating events
This part of the analysis involves identifying potential abnormal events that lead to exposure to risk or pose a threat to the integrity of the system. One or more operating modes leading to this undesired final condition should be identified in the system. During the normal working period, something leads the system to undergo a transition towards an undesired event. The loss of a function or system, or of several of them simultaneously, is considered an initiating event. One way of determining the initiating event is to analyze the functional diagram of the systems involved. Each function should be decomposed into its subsystems, identifying the events that may result in failures. These events may be grouped by functional similarity and treated as an initiating group. The procedure below may be adopted to support the risk analysis:
Using a list of initiating events.
Identifying a group of initiating events.
Grouping the initiating events presenting the same effect.
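The grouping step above can be illustrated with a minimal sketch. The event names and effects below are hypothetical placeholders chosen for illustration; they are not taken from the warranty database used in this study.

```python
from collections import defaultdict

# Hypothetical (event, effect) pairs, for illustration only.
initiating_events = [
    ("ignition coil open circuit", "combustion failure"),
    ("spark plug fouling", "combustion failure"),
    ("battery undervoltage", "no start"),
    ("fuel pump stall", "fuel starvation"),
    ("starter motor fault", "no start"),
]


def group_by_effect(events):
    """Group initiating events that present the same effect."""
    groups = defaultdict(list)
    for event, effect in events:
        groups[effect].append(event)
    return dict(groups)


groups = group_by_effect(initiating_events)
for effect, members in groups.items():
    print(f"{effect}: {members}")
```

Each resulting group can then be treated as a single initiating group in the scenario development that follows.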
2.3 Developing the scenario
This step aims to develop a full scenario leading to exposure to risk. The failure tree is a technique that can be used to better understand the contributing factors. In it, subsystems are represented, including hardware and software failures. Internal and external events should be considered in the analysis process, defining the following points:
Identifying the functions of each initiating system or group of systems.
Associating the human aspects and the hardware involved.
Developing a tree of events or groups of events.
2.4 Survey into failure data and performance evaluation
The analysis of a failure database usually helps greatly to identify improvements and to anticipate future problems or risks. Hardware, software and human aspects, among others, are input data to evaluate possible barriers to failure propagation. Field history data or past data may be grouped as generic failure data. The failure tree should basically analyze three types of occurrence:
Initiating event.
Components and systems involved.
Human factors.
This analysis is based on time to failure occurrence, number of repairs made and frequencies of occurrence. The uncertainties associated with all these data, which reflect the limitations of the available information, must also be known. To attain a low risk level, systems and equipment should have a high performance level regarding reliability. The aim is to obtain adequately designed systems with high reliability and low uncertainties. Reliability is defined as the ability of an item to perform a required function under given conditions for a given time interval. The following aspects can be considered at this stage:
Determining values of durability and material strength, loads, aggressive agents, number of failures, occurrences and failure rates.
Gathering all the data, repair data and barriers deriving from existing experience.
Access to the frequency of initiating events and to other failure probabilities deriving from the experience with the system.
Determining common causes for similar components.
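As an illustration of how a frequency of occurrence and its statistical uncertainty might be derived from field data, the sketch below computes a failure frequency per vehicle with a Wilson score interval. The choice of the Wilson interval, the assumption that each vehicle fails at most once, and the counts used are assumptions made for this example; the paper does not prescribe a particular estimator.

```python
import math


def failure_frequency(failures, vehicles, z=1.96):
    """Return (point estimate, lower bound, upper bound) of failures per vehicle.

    Uses the Wilson score interval (z = 1.96 gives roughly 95% confidence),
    assuming independent vehicles that fail at most once each.
    """
    if vehicles <= 0 or not 0 <= failures <= vehicles:
        raise ValueError("need 0 <= failures <= vehicles and vehicles > 0")
    p = failures / vehicles
    denom = 1 + z**2 / vehicles
    center = (p + z**2 / (2 * vehicles)) / denom
    half = z * math.sqrt(p * (1 - p) / vehicles + z**2 / (4 * vehicles**2)) / denom
    return p, max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts, for illustration only.
p, low, high = failure_frequency(failures=12, vehicles=40_000)
print(f"frequency = {p:.2e} per vehicle, interval = [{low:.2e}, {high:.2e}]")
```

A wide interval relative to the point estimate signals that the available field data are too limited to support a firm conclusion about that failure mode.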
2.5 Quantification and integration
The failure tree and the failure events are integrated, and the events are quantified to determine the frequencies involved and their associated uncertainties. Quantification usually involves a Boolean representation process. Each event has an associated probability and is a term of the general Boolean expression.
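The Boolean quantification described above can be sketched for a small tree. This is a minimal example assuming statistically independent basic events; the tree structure and all probabilities are hypothetical and were not derived from the case study.

```python
from math import prod


def or_gate(probabilities):
    """Probability that at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


def and_gate(probabilities):
    """Probability that all independent input events occur together."""
    return prod(probabilities)


# Hypothetical basic-event probabilities, for illustration only.
basic_events = {"ignition coil": 2e-4, "spark plug": 1e-4, "ignition cable": 3e-4}
p_not_detected = 0.1  # assumed chance that the diagnosis misses the failure

# Top event: a combustion failure occurs AND it is not detected.
p_combustion_failure = or_gate(basic_events.values())
p_top = and_gate([p_combustion_failure, p_not_detected])
print(f"combustion failure: {p_combustion_failure:.3e}, top event: {p_top:.3e}")
```

If the basic events share a common cause, the independence assumption no longer holds and the gate formulas above would underestimate or overestimate the top event probability.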
2.6 Analysis of uncertainties
Uncertainties are an integral part of estimation. It is sometimes common to neglect the uncertainties associated to the theoretical model used to estimate the probability of occurrence of undesirable events (epistemic uncertainty) and the limitations of the available information used to execute statistical analysis (statistical uncertainty). In decision-making, analyzing the uncertainties is essential. The failure of a system may be represented by an association of probabilities. The failure and the event tree used to estimate the risk may contain a source of uncertainty. Another cause for uncertainty is that the model may be incomplete or the detail level in the subsystems decomposition may be insufficient or even present a lack of important considerations.
2.7 Sensitivity Analysis
At this stage, it is necessary to verify the importance of the change in the choice of the model or of parameters and which analysis elements are the most sensitive; to modify the contribution of each value of the analysis inputs; to verify the impact of changes on the main elements.
2.8 Classification of risks and of the importance of the analyses
Classifying the elements and systems regarding their risk or safety is one of the most important results of PRA (Probabilistic Risk Analysis). This means simply classifying the systems as regards their final contribution to total risk. The selection process should be very carefully conducted, minding the following:
Determining the aim of classifying the systems.
Classifying the risk.
Identifying the most critical and important elements regarding total risk.
2.9 Results interpretation
When concluding the risk values, the results are interpreted to determine whether revisions or reevaluations are necessary to refine conclusions. It is necessary to characterize the importance of each system element in the final result. Therefore, the following steps for analyzing the result are to be taken:
Determining the logical model, scenario structures, hypotheses and goal of PRA.
Identifying system elements for which better information would be necessary to reduce uncertainties.
Revising the analysis and reinterpreting the results.
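The classification of elements by their contribution to total risk (section 2.8) can be sketched as follows. The sketch assumes risk is expressed as frequency multiplied by a consequence score; the component names, frequencies and consequence scores are hypothetical.

```python
def rank_by_risk(elements):
    """Rank elements by risk = frequency x consequence, highest first.

    `elements` maps a name to (frequency, consequence score).
    Returns a list of (name, risk, share of total risk).
    """
    risks = {name: freq * cons for name, (freq, cons) in elements.items()}
    total = sum(risks.values())
    ranked = sorted(risks.items(), key=lambda item: item[1], reverse=True)
    return [(name, risk, risk / total if total else 0.0) for name, risk in ranked]


# Hypothetical inputs: (failures per vehicle, consequence score from 1 to 10).
elements = {
    "ignition coil": (4e-4, 9),
    "battery": (2e-3, 1),
    "fuel pump": (1e-4, 5),
}

ranking = rank_by_risk(elements)
for name, risk, share in ranking:
    print(f"{name}: risk = {risk:.2e}, share = {share:.0%}")
```

With these illustrative numbers, the element with a low frequency but a severe consequence ranks first, which is the kind of situation the methodology is intended to expose.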
3 CASE STUDY – INTERNAL COMBUSTION ENGINE
3.1 Failures and associated damages
The main focus of the present paper is to analyze the combustion failure that turns on the EPC (Electronic Power Control) alert light without the vehicle entering the safety mode. In the safety mode (there are several safety modes), the electronic control limits or restricts the engine rotation and speed. The vehicle is restricted to, for example, a speed of 80 km/h or to a pre-determined rpm value. In a stricter safety mode, the restriction increases and the engine does not exceed a very low rpm value, enough to maneuver the vehicle to a safe place. In case of a combustion failure without entering the safety mode, an emission limit may be exceeded. Moreover, the protection systems may automatically shut down a cylinder of the engine to protect the catalyzer or to ensure emission levels, thus causing a loss of power. These situations are extremely undesirable, as they make the occupant stop the vehicle at a place or at a moment which might not always be adequate. That is, the least damage is entering the safety mode due to a combustion failure and, to a lesser extent, repetitive repairs before the failure is finally found. Besides, from the manufacturer's point of view, and abiding by the Brazilian consumer code, a failure has to be solved in 30 days, be it a continuous or a sporadic failure, considering the total time the vehicle cannot be used. If some critical injection components fail, they lead to a compulsory and immediate interruption of the production line. The external scenario is thus the legislation ruling the emissions of series-manufactured vehicles. A drop in quality of a component affecting emissions is a reason for halting production. A field failure of something affecting emissions causes the interruption of production.
3.2 The risk evaluation method
According to sections 2.2 and 2.3 of the methodology, this section presents the combustion engine analysis aiming at identifying the components whose failures may lead to the critical event, the combustion failure. The failure tree or matrix describes the events leading to sporadic, difficult-to-detect failures that may expose the occupant to a safety mode situation in its different severity levels. The engineers' judgement is fundamental to classify and to determine the critical systems. The judgements made at this stage are the greatest source of uncertainty.
The internal combustion engine of an automotive vehicle may present sporadic, difficult-to-diagnose failures. The failures are initially divided into two groups: the first comprises the mechanical failures and the second the electrical failures, including those of the engine electronic management. Both groups affect the vehicle performance and are distributed in about 65% for the electrical group and 35% for the mechanical one, based on 10-year field studies. Particularly for the combustion failure, the failure tree shows the sequence of undesirable events which trigger the final undesirable event. The mechanical-type failures happen to be easily identifiable. The dimensional analysis of components, specific tests, visual analysis, metallographic analysis and others allow an easy detection of the failed component. Figure 1 shows the technical schema of the systems involved in the generation of the undesirable event analyzed in the present study.
The failure modes were divided into six groups of initiating events, including hardware, software and human failure, as follows:
Complete engine – cylinder head, injectors, throttle body, catalyzer, engine block, intake manifold, cooling system, mechanical timing, ignition coil, lambda sensor, spark plugs, cables and phonic wheel, as shown in Table 1.
Fuel and Lubrication – Quality of the fuel, lubricant oil, and cold start reservoir, as shown in Table 2.
Interface with the vehicle – Battery, immobilizer, generator and starting motor, as shown in Table 3.
Cold start system – Cold start pump, canister, as shown in Table 4.
Fuel pumping – pump and tank, as shown in Table 5.
Electronic control unit – Hardware, software, as shown in Table 6.
Figure 1. Combustion Engine with all of its peripherals.
To continue the risk analysis aiming at defining the initiating events of the sporadic failure under analysis, according to section 2.4, a 10-year warranty database was analyzed. That database provides information about the initiating events associated with failures that occurred during the warranty period, and allows evaluating their frequency of occurrence. Within each group, the possible failure modes are classified according to their criticality (using two levels: critical and important), detectability (using two levels: detectable and difficult to detect) and frequency of occurrence. The component failures classified as sporadic have a frequency of occurrence lower than 0.5 × 10⁻³ per vehicle. The 10-year warranty and field data analysis shows that failures of mechanical items are easy to detect or detectable by immediate methods of failure analysis. The technical assistance centers would not have difficulty in detecting them. Nevertheless, some electrical failures are difficult to detect and are critical for engine performance. The analyses proposed in sections 2.5 and 2.7 of the methodology were not applied in the present study, since the warranty database allows the direct definition of the failure root causes, eliminating the need for theoretical models to evaluate the frequency of failure. The aspects cited in sections 2.5 and 2.7 are mainly related to theoretical analyses, where models based on reliability concepts must be developed to define the frequency of occurrence of a failure mode.
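The classification criteria described above (criticality, detectability and the sporadic threshold of 0.5 × 10⁻³ per vehicle) can be sketched as a simple rule. The record layout, field names and counts are hypothetical; only the threshold comes from the text.

```python
SPORADIC_LIMIT = 0.5e-3  # failures per vehicle, threshold stated in the text


def classify(record):
    """Classify one failure mode from hypothetical warranty counts."""
    frequency = record["failures"] / record["vehicles"]
    sporadic = frequency < SPORADIC_LIMIT
    # Assumed prioritization rule: sporadic, critical and difficult to detect.
    needs_focus = sporadic and record["critical"] and not record["detectable"]
    return {
        "component": record["component"],
        "frequency": frequency,
        "sporadic": sporadic,
        "needs_focus": needs_focus,
    }


# Hypothetical warranty records, for illustration only.
warranty_records = [
    {"component": "ignition coil", "failures": 30, "vehicles": 100_000,
     "critical": True, "detectable": False},
    {"component": "battery", "failures": 400, "vehicles": 100_000,
     "critical": False, "detectable": True},
]

results = [classify(record) for record in warranty_records]
for result in results:
    print(result)
```

The `needs_focus` flag is one possible way to single out the failure modes that the paper treats as the most problematic; other combinations of the three criteria could be used.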
Table 1. Engine components failures separated per criticality and probability of easy detection.
Table 2. Fuel and Lubricant separated per criticality and probability of easy detection. Rows (Fuel and Lub Oil): Fuel at the Tank, Cold Start Reservoir, Lub Oil Quality. Columns: Critical, Important, Detectable, Difficult to Detect, Sporadic.
Table 3. Interface with the vehicle separated per criticality and probability of easy detection. Rows (Vehicle Interface): Battery, Immobilizer, Generator, Starter Motor. Columns: Critical, Important, Detectable, Difficult to Detect, Sporadic.
Table 4. Cold start and refrigeration separated per criticality and probability of easy detection. Rows (Cold Start): Fuel Pump, Canister, Temperature Sensor, Cold Start Reservoir. Columns: Critical, Important, Detectable, Difficult to Detect, Sporadic.
Table 5. Feeding and Fuel Pumping separated per criticality and probability of easy detection. Row group: Fuel Feeding.
Table 6. Electronic control unit (software and hardware) separated per criticality and probability of easy detection. Rows (Electronic Control Unit): Hardware, Software.
3.3 Results
This analysis corresponds to section 2.8 of the methodology. Table 7 shows a summary of the risk analysis. The field data show that 60% of the engine failures derive from electrical problems, out of which about 3.5 % are repetitive and less than 1% are difficult to diagnose and of sporadic characteristic, or lead to the different protection modes. This indicates that the probability of failure is very small, but the consequences are extremely important, such as: repetitive repairs, entering the safety mode or diagnostic complexity. The analysis also indicates the only way to eliminate the problem is to make the systems more robust, leading them to a zero failure condition, for as small as the probability may be, since the consequence is serious and the risk is always critical.
Table 7. Final summary of the analysis separated per criticality and probability of easy detection.
3.4 Uncertainties Analysis
According to section 2.6 of the methodology, the following uncertainties can be considered in the previous analysis:
Is the general model of the components' functional links the most adequate?
What is the accuracy of the failure history database?
The functional links of the combustion engine components considered in the analysis are based on technical reports developed during engine design. So the uncertainties associated with the modeling of engine functional links and with failure modes and effects can be disregarded. The accuracy of the database depends on the information collected by the authorized dealers. Usually that information is also checked by the carmaker quality engineering department to verify possible incorrect interpretation of the registered failure mode and cause. So the database accuracy can be considered high. The risk analysis is sufficiently accurate to support improvements in product design or manufacturing process to reduce the occurrence of sporadic failures, according to section 2.9 of the methodology. Those improvements are presented in section 4.
4 IMPROVEMENTS DERIVING FROM THE PRODUCT RISK ANALYSIS
The results of the product risk analysis led to several improvements and developments both in the product and in the manufacturing processes. The product design was improved by the use of a new concept for the ignition system set. This new concept eliminates the ignition cable, a component which involved a complex manufacturing and assembly process, as shown in Figure 2. Each spark plug now has its own ignition coil, directly coupled to it, which also eliminates the ignition wire. Thus, if a failure occurs, it is restricted to one cylinder of the engine only, since each cylinder has its individual coil. Besides, the proximity of the coil and the plug eliminates possible losses of efficiency in power conduction. The elimination of the ignition cable also eliminates several processes which may affect the reliability of the set, especially interface failures among the cable, spark plug and ignition coil. The new set is considerably more robust than the former one.
Figure 2. Modification in the product resulting from the product risk analysis.
Added to that, several manufacturing process reliability improvements were performed. The electrical ends connecting the coils were made more robust, improving the contact between the coil and the cable. New additional electric tests were implemented at the end of the assembly line to check the coil operation. The welding processes were revised to avoid the so-called cold welding. For that process, the intervals for replacing the electrodes were reduced, and parameters and processes were revised, reducing the readjustment intervals in the welding machines. Regarding the coil resin, process improvements were made to improve resin deposition, followed by a special maintenance plan. Improvements were also made in the injection processes of the plastic parts to prevent burrs, which may hinder the coiling processes. Procedures were created to prevent wire fragments from falling into the manufacturing processes and generating failures in coils at work.
5 PROPOSAL FOR ACCELERATED TEST BENCH DESIGN
Once the critical components, their failure modes and the main characteristics involved in their manufacturing processes are known, a set of test benches is proposed to simulate and to investigate possible failures that occur sporadically and are difficult to diagnose, as shown in Figures 3 and 4. The tests have to be accelerated to provide a continuous reliability evolution program, facilitating the management of its evolution through improvement actions (Pantelis & Adamantios 2008). For each of the bench tests, the main sporadic failures to be analyzed must be identified, and the results can support future changes in component design. The periodical execution of those tests will support component reliability growth, reducing the frequency of failure during the automobile operational period.
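Time-to-failure data collected on accelerated test benches could be summarized by fitting a life distribution. The sketch below is a minimal example assuming a two-parameter Weibull model fitted by median rank regression with Bernard's approximation; the paper does not state which distribution or fitting method is intended, and the failure times are hypothetical.

```python
import math


def fit_weibull(times):
    """Estimate Weibull shape (beta) and scale (eta) by median rank regression."""
    ordered = sorted(times)
    n = len(ordered)
    if n < 2:
        raise ValueError("at least two failure times are needed")
    xs, ys = [], []
    for i, t in enumerate(ordered, start=1):
        rank = (i - 0.3) / (n + 0.4)  # Bernard's approximation of the median rank
        xs.append(math.log(t))
        ys.append(math.log(-math.log(1.0 - rank)))
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx  # slope of the linearized Weibull plot
    intercept = y_mean - beta * x_mean  # equals -beta * ln(eta)
    eta = math.exp(-intercept / beta)
    return beta, eta


def weibull_pdf(t, beta, eta):
    """Weibull probability density function of the time to failure."""
    return (beta / eta) * (t / eta) ** (beta - 1) * math.exp(-((t / eta) ** beta))


# Hypothetical bench failure times in hours, for illustration only.
hours_to_failure = [120.0, 190.0, 260.0, 310.0, 420.0, 510.0]
beta, eta = fit_weibull(hours_to_failure)
print(f"beta = {beta:.2f}, eta = {eta:.0f} h, pdf at 300 h = {weibull_pdf(300.0, beta, eta):.2e}")
```

Relating bench hours to field usage would require an acceleration model, which is outside the scope of this sketch.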
Figure 3. Accelerated test bench for ignition coils.
Figure 4. Bench for testing spark plugs and plug wires.
6 RESULTS OF TESTS
For the scope of this work, the ignition system comprises the ignition coil, spark plug and connection cables. The proposed tests, as well as the equipment built, are able to detect sporadic, difficult-to-diagnose failures within the electric set of the engine. The next sections present examples of failures studied and identified in recent years, which were detected using benches similar to those designed and presented in Figures 3 and 4. Those results supported the design improvements presented in section 4. The proposed test benches will allow the reproduction of those tests and will support future studies of physics of failure.
6.1 Failures in the ignition coil
The main sporadic and difficult-to-diagnose failures deriving from the ignition coil are described here. The ignition coil is very sensitive to manufacturing process failures (Anderson 2007). Aspects concerning crimping, welding and plastic injection determine the robustness of operation. Correcting these failures demands reliability improvements in operation processes and design. These parts have been the object of years of field research and of increases in product and process robustness. Robustness actions are taken to reduce failure occurrence. Internal failure in the primary coil is a type of failure not easily identified at normal temperature; it is better detected when the part is exposed to heat. The part undergoes the functional test at ambient temperature, in which it presents functional values according to specifications. When the functional test and the electrical test are performed on a part stored in an oven for 1 hour at high temperature (100°C), the part presents ignition failure. The analysis of the primary coiling reveals a break in the primary coil wire close to its coupling terminals. The contact terminals appear to be in good condition; however, an internal broken cable, which causes the ignition failure, can be observed, as shown in Figure 5.
Figure 5. Broken cable; failure occurs at high temperatures.
6.2 Failure in the welding process
Welding processes are some of the most important to guarantee the continuity of the electric systems. Some welding failures cannot be detected when cold by non-destructive inspection; they may appear only at certain temperatures or voltages and are the most difficult to diagnose, as shown in Figure 6. The failure may be detected only at high temperature or only at high voltage, being of sporadic character and considerably difficult to diagnose.
Figure 6. Failure found in the transistor welding.
7 SUGGESTION OF RELIABILITY MANAGEMENT PROGRAM
Product design and development, however robust, cannot account for all the existing variables influencing reliability. This requires a wide and specific program for increasing reliability that supports the design and development areas, and especially the manufacturing processes. Future supply contracts should include reliability clauses and admissible failure rates, and should define product performance requirements in the field. Reliability control should occur at each individual stage of a product life cycle.
7.1 Process Reliability Management Program
Throughout production, the reliability of the processes must be ensured and audited periodically. Failures occurring during production are of great interest and are the foundation for process reliability evaluation and improvement. A database regarding the effects of process failures on product performance must be developed. Once a process failure is detected, some product samples must be collected and submitted to tests in the benches presented in section 5. Those tests will allow defining product performance in view of the occurrence of the manufacturing process failure.
7.2 Product Reliability Management Program
Product reliability improvement must be based on test bench results but also on failures detected in the field. Information regarding equipment usage (expressed in km) at the time of failure can be used to evaluate reliability and to check the correlation between test bench and field data reliability predictions. Those results can be used to improve product reliability.
8 CONCLUSION
A methodology based on risk analysis concepts is presented to evaluate failures in spark ignition engines. The methodology is applied to evaluate sporadic failures of a sub-compact vehicle platform commercialized in Brazil. The major conclusion is that the probability of occurrence of sporadic and difficult-to-diagnose ignition failure is very small, yet its consequences are extremely important. In practice, the consequence of this type of failure tends to grow due to the influence of different factors, such as market strength, legislation and/or image. The analysis also indicates that the only way to eliminate the problem is to make the systems involved more robust, leading them to a zero failure condition. After all, as small as the probability may be, once the consequence is harmful, the risk is always critical.
That is, the throttle body, coil, lambda sensor, plug wires, ignition spark plugs and magnetic sensors have to be classified as components requiring the maximum level of control and reliability. In practice, processes involving welding, insulated wire coiling, crimping, end connections, rubber injection or insulation are sensitive processes which need strict process parameters, since they lead to failures of the sporadic type.
In order to improve product reliability, test benches are proposed to execute accelerated life tests aiming at defining the time to failure of those critical components. The test benches are able to execute tests that will allow the definition of the time to failure probability density function of the typical failures analyzed in this study. The results of those tests will support the development of a reliability management program for those critical components.
REFERENCES
Anderson, A. F. 2007. Reliability in Electromagnetic Systems: The Role of Electrical Contact Resistance in Maintaining Automobile Speed Control System Integrity.
Aven, T. 2008. Risk Management. Handbook of Performability Engineering. University of Stavanger, Norway.
Bertsche, B.; Schauz, A.; Pickard, K. 2008. Reliability in Automotive and Mechanical Engineering - Determination of Component and System Reliability.
Guimea, S. D.; Coffelt, J. P. 2008. Modeling Count Data in Risk Analysis and Reliability Engineering. Texas A&M University, USA.
Kumamoto, H.; Henley, E. 1996. Probabilistic Risk Evaluation and Management for Engineers and Scientists.
Modarres, M. 1993. What Every Engineer Should Know About Reliability and Risk Analysis. New York: Marcel Dekker.
Pantelis, V.; Adamantios, M. 2008. Quantitative Accelerated Life-testing and Data Analysis.