Evaluation of Methods used for Separation of Vibrations Produced by Gear Transmissions

A. Dočekal, M. Kreidl, R. Šmíd

This paper evaluates methods used for separating vibrations produced by a gear transmission from the vibration signal acquired on the gearbox. The paper presents a novel method for evaluating the algorithms used for this separation. The evaluation method takes into account the statistical reliability of the results achieved on multiple sets of signals acquired on the same machine under the same conditions. The signal separation was applied in order to process data obtained during an experiment carried out with the aim of analyzing the influence of a torque load affecting a gearbox on the vibrations produced by the gear transmission. It is supposed that the vibration characteristics of the gear transmission are strongly affected by the value of the torque load acting on the gearbox shafts. This influence is analyzed using the vibration signal acquired on the gearbox housing. The vibration signal contains significant disturbances, and its interpretation is unclear.
The vibration signal generated by the gear transmission can be separated using methods that make it possible to select the valid features included in the signal. Methods for feature selection which implement a systematic search in the state space and methods based on the genetic algorithm were applied. The genetic algorithm provides a robust stochastic global search in the state space that is well suited to deal with nonlinear problems and also shortens the necessary computing time. The results achieved during the separation process using different methods have to be evaluated and compared. In the case of signal separation, it is important to evaluate differences between the results achieved during particular executions of the separation process performed by the same method on different datasets acquired during the same experiment under the same conditions. Methods with results that vary, or that are different from the results given by other methods, are assumed not to be statistically reliable. It is also necessary to penalize methods leading to results that can vary greatly in some executions depending on the scatter of the data. Conversely, methods that give results varying around the right set of features seem more acceptable. A novel method for rating the statistical reliability of the results has been proposed. This method is essential for methods using a stochastic search in the state space.

Introduction
Vibrodiagnostics is a well-established technique for condition monitoring and for detecting faults in modern rotary machines, e.g. automotive gearboxes. The development of a new design for a rotary machine component is a typical field where acoustic and vibration signals produced by the machine need to be analyzed in order to ensure a longer lifetime and quieter running of the machine component. The properties of the newly designed component are evaluated during experiments that test its reliability and its ability to deal with wide-ranging conditions.
The gear transmission is an important machine component which is still under intensive development. Paper [1] describes the design and a way of testing gears with a non-standard profile developed at the Czech Technical University in Prague. The suitability of the gear transmission design was investigated in a stress test, in which various levels of torque load were applied to the two shafts coupled by the tested gears. Besides an estimation of the lifetime, the main aim of the experiment was to discover the influence of the load applied to the gear transmission on the vibrations of the gears. Paper [2] describes the signal processing method applied in order to analyze the vibrations acquired during this test. This paper evaluates the efficiency that can be achieved by several suitable methods.
Theoretical models of these designs are very complicated and often, as in this case, none is available. Some characteristics of gear transmission vibration are known, but they are too general and not accurate enough to satisfy our objective. The main general feature of gear vibration is that the energy of the vibrations produced by the gears is concentrated mainly around the harmonics of the Tooth Frequency (TF). The tooth frequency $f_t$ can be estimated using equation (1):

$$f_t = f_1 n_1 = f_2 n_2 \tag{1}$$

where $f_1$ is the revolution rate of the first gear, $n_1$ is the number of teeth of the first gear, and $f_2$ and $n_2$ denote the same properties of the second gear. TF (shown in Fig. 1) is characterized by the presence of many (sub)harmonics in the spectra and by their amplitude modulation by other frequencies, such as the rotation frequencies of the shafts in engagement.
Other characteristic frequencies and their modulations can also occur, e.g. the Hunting Tooth Frequency described in [3]. It is complicated to select the few frequencies that are most important for vibration analysis of the inspected gear transmission, because this requires mutual comparison of many vibration spectra. This comparison can be performed via a cascade diagram (shown in Fig. 1). The significant features included in the acquired vibrations related to the new gear design are given by the differences between the Power Spectral Densities (PSDs) of gear vibration signals acquired under different levels of gearbox load, and between the PSDs of vibration signals acquired under the same load value. The need to compare hundreds of PSDs to discover this influence is a big disadvantage. Therefore, it is convenient to simplify and automate this process. Because it is supposed that the torque load mainly affects the vibrations produced by the tested gear transmission, discovering the significant features included in the PSDs enables us to separate the undesirable vibration signals produced by the gear transmission.
The tested gear transmission was placed in a gearbox fitted with flow cooling. Because of many technical issues, the accelerometers were placed on the gearbox housing. The amount of background noise and vibration contained in the vibration signal acquired on the housing increases the difficulty of further analysis, and can make its results unclear. In this case, it is essential to separate the vibration signal produced by the gear transmission from the background vibration produced by other sources.
Under these circumstances, the gear transmission vibration signal can be separated by methods implementing feature separation based on the dependence of the vibration on a known independent parameter. One large group of the methods applied in our study comprises feature selection methods that use a systematic search in the feature state space. Branch and Bound Feature Selection, Sequential Backward Feature Selection, Sequential Forward Feature Selection, Pudil's Floating Feature Selection (forward), and Plus-L-takeaway-R Feature Selection were applied. The other applied methods were based on the genetic algorithm, which implements a stochastic search in the state space. Methods based on the genetic algorithm are well suited to deal with nonlinear problems and they also support parallel implementation, which shortens the necessary computing time. The Multilayered Iterative Algorithm from the Group Method of Data Handling and the Group of Adaptive Models Evolution were used.
The results achieved by the various methods during the separation process have to be evaluated and compared. The methods need to be evaluated with regard to the ability of the separated part of the vibration signal to retroactively recognize the value of the applied torque load. This aspect was verified using Inter/Intra Class Distance.
Another important consideration is the statistical reliability of the results achieved by the separation process. For this reason, it is crucial to evaluate the differences between the results achieved during particular executions of the separation process performed by the same method on different datasets acquired during the same experiment under the same conditions. The proposed method, called "Selection Error Rate on Multiple Datasets", rates the statistical reliability of the results. This is particularly important for methods using a stochastic search in the state space, which cannot guarantee that the same results will be achieved even when the same input data are applied.

Signal processing
As mentioned in section 1, the actual condition of the gear transmission can be described by the vibration level at the frequency given by equation (1) and at its higher harmonic and subharmonic frequencies. These frequencies are characteristic for a certain gearing and revolution rate. An evaluation of the power of the vibration signal at these frequencies, and of its changes, reveals changes in the operational conditions of the gearing, in particular wear or a fault that has arisen on the gearing. The aim of the signal processing described here is to recognize the bands in the PSD that are most important and contain most information about the condition of the gear transmission.
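The characteristic frequencies mentioned above can be listed with a few lines of code. The following is a minimal sketch, assuming the usual gear mesh relation (tooth frequency equals the revolution rate multiplied by the number of teeth); the tooth counts, shaft rates, and the function names `tooth_frequency` and `characteristic_frequencies` are hypothetical and are not taken from the experiment.

```python
def tooth_frequency(shaft_rate_hz, teeth):
    """Tooth (mesh) frequency in Hz: shaft revolution rate times tooth count."""
    return shaft_rate_hz * teeth


def characteristic_frequencies(f_t, n_harmonics=3, sideband_hz=None, n_sidebands=1):
    """Return the harmonics of f_t, optionally with modulation sidebands."""
    freqs = []
    for h in range(1, n_harmonics + 1):
        centre = h * f_t
        freqs.append(centre)
        if sideband_hz is not None:
            for s in range(1, n_sidebands + 1):
                freqs.extend([centre - s * sideband_hz, centre + s * sideband_hz])
    return sorted(freqs)


# Hypothetical gear pair: 20 teeth at 25 rev/s meshing with 40 teeth at 12.5 rev/s.
f_t = tooth_frequency(25.0, 20)
assert f_t == tooth_frequency(12.5, 40)  # both gears must give the same f_t

# Two harmonics, each with one pair of sidebands at the first shaft rate.
print(characteristic_frequencies(f_t, n_harmonics=2, sideband_hz=25.0))
# prints [475.0, 500.0, 525.0, 975.0, 1000.0, 1025.0]
```

The sidebands illustrate amplitude modulation by a shaft rotation frequency; subharmonics could be added in the same way by dividing `f_t` instead of multiplying it.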
The experiment and the subsequent signal processing, focused on separating the vibration produced by the gear transmission from the mixture of vibrations acquired on the gearbox housing, are depicted in Fig. 2. During the experiment, the gearwheels are operated under a defined stress or, more precisely, under a pre-selected torque load. The Power Spectral Density (PSD) of the vibration signal was estimated and then split into bands. In order to reduce the number of features, the PSD of the signal was represented by the power of the signal inside each band. This set of features was utilized to separate the vibration signal of the gear transmission using methods of feature selection. The set of selected features corresponds to the frequency response of the filter which separates the vibration signal produced by the gear transmission from the rest of the acquired signal.

Methods used for feature selection
This section briefly describes the methods used for separating the vibration signal produced by the gearing.

Branch and Bound Feature Selection
The Branch and Bound Feature Selection (BBFS) method is a feature selection method designed to select, from the set of given features, the features that are valid for solving the task. The method selects the features that carry the most information in the sense of the selected criterion function. BBFS performs a systematic search in the feature state space by creating a decision tree using the "Depth First Search with a Backtrack Mechanism" [4]. This is a recursive algorithm which is initialized with the complete set of given features and the corresponding value of the criterion function. In the first step, the feature whose removal causes a minimal decrease (or even an increase) in the criterion function is removed. Then all valid branches of the decision tree are searched, depending on the value of the criterion function. The algorithm continues recursively until the optimal set of features is found. A detailed description of the Branch and Bound method is given in [4].
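The depth-first search with backtracking described above can be sketched as follows. This is a simplified illustration, not the implementation used in the study: it assumes a monotonic criterion (removing a feature never increases its value), and the function name, the toy weights, and the additive toy criterion are hypothetical.

```python
def branch_and_bound(features, criterion, target_size):
    """Return (best_subset, best_score) among subsets of the given size.

    Assumes a monotonic criterion: removing a feature never increases it.
    """
    best = {"score": float("-inf"), "subset": None}

    def search(subset, start):
        score = criterion(subset)
        # Bound: by monotonicity no descendant can beat the incumbent.
        if score <= best["score"]:
            return
        if len(subset) == target_size:
            best["score"], best["subset"] = score, subset
            return
        # Branch: remove one more feature; 'start' prevents revisiting subsets.
        children = []
        for i in range(start, len(subset)):
            child = subset[:i] + subset[i + 1:]
            children.append((criterion(child), i, child))
        # Visit the removal that costs the least criterion value first.
        for _, i, child in sorted(children, key=lambda c: -c[0]):
            search(child, i)

    search(tuple(features), 0)
    return best["subset"], best["score"]


# Toy monotonic criterion: sum of hypothetical per-feature weights.
WEIGHTS = {"a": 5, "b": 1, "c": 3, "d": 2}
subset, score = branch_and_bound("abcd", lambda s: sum(WEIGHTS[f] for f in s), 2)
print(subset, score)  # prints ('a', 'c') 8
```

Visiting the least costly removal first tends to raise the incumbent score early, which lets the bound prune more of the remaining tree.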

Sequential Forward or Backward Feature Selection
Although the BBFS method can find the optimal solution very effectively, its computation can be quite demanding. For this reason, feature selection methods are also used that find only a suboptimal solution but are less demanding computationally. Sequential Forward Feature Selection (SFFS) and Sequential Backward Feature Selection (SBFS) are two of these methods. Both methods also systematically search the feature state space while creating the decision tree. Unlike BBFS, these methods only search the branches of the tree that provide the least decrease of the criterion function [4]. Unlike BBFS and SBFS, SFFS starts with the empty set and proceeds by adding the features one after another.
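The two greedy searches can be sketched in a few lines. The sketch below is illustrative only; the function names and the toy criterion (hypothetical weights plus a bonus when features "b" and "c" occur together) are assumptions chosen to show why a greedy search may end in a suboptimal subset.

```python
def sequential_forward_selection(features, criterion, target_size):
    """Start from the empty set and greedily add the best feature each step."""
    selected, remaining = [], list(features)
    while len(selected) < target_size and remaining:
        best = max(remaining, key=lambda f: criterion(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


def sequential_backward_selection(features, criterion, target_size):
    """Start from the full set and greedily drop the least useful feature."""
    selected = list(features)
    while len(selected) > target_size:
        # Drop the feature whose removal leaves the highest criterion value.
        drop = max(selected, key=lambda f: criterion([g for g in selected if g != f]))
        selected.remove(drop)
    return selected


# Toy criterion: hypothetical weights plus a bonus when b and c appear together.
WEIGHTS = {"a": 5, "b": 1, "c": 3, "d": 2}

def toy_criterion(subset):
    score = sum(WEIGHTS[f] for f in subset)
    if "b" in subset and "c" in subset:
        score += 10
    return score


print(sequential_forward_selection("abcd", toy_criterion, 2))   # prints ['a', 'c']
print(sequential_backward_selection("abcd", toy_criterion, 2))  # prints ['b', 'c']
```

On this toy criterion the forward search commits to "a" in its first step and never finds the better pair, which is the kind of limitation that backtracking is meant to address.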

Plus-L-takeaway-R Feature Selection
SFFS and SBFS both work without a backtracking mechanism. Once a feature has been added (or removed), this action cannot be undone. Plus-L-takeaway-R Feature Selection (LRFS), also known as (l, r) search [4], is derived from the sequential forward selection algorithm and gets around this limitation by adding l features at a time and then excluding r features from the obtained set in accordance with the criterion function; these two steps are repeated. This algorithm results in better performance than sequential selection.
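A minimal sketch of the (l, r) search in its forward variant (l > r) is given below. The function name, the argument checks, and the toy criterion are assumptions made for illustration; the criterion uses hypothetical weights with a bonus when "b" and "c" occur together.

```python
def plus_l_take_away_r(features, criterion, target_size, l=2, r=1):
    """Forward (l, r) search: add l features, then remove r, and repeat."""
    if not (l > r >= 0) or not (0 < target_size <= len(features) - r):
        raise ValueError("need l > r >= 0 and 0 < target_size <= len(features) - r")
    selected, remaining = [], list(features)
    while len(selected) < target_size:
        # Plus-l: greedy forward steps.
        for _ in range(min(l, len(remaining))):
            best = max(remaining, key=lambda f: criterion(selected + [f]))
            selected.append(best)
            remaining.remove(best)
        # Take-away-r: greedy backward steps; dropped features may return later.
        for _ in range(r):
            drop = max(selected, key=lambda f: criterion([g for g in selected if g != f]))
            selected.remove(drop)
            remaining.append(drop)
    # If the last cycle overshot the target, trim with backward steps.
    while len(selected) > target_size:
        drop = max(selected, key=lambda f: criterion([g for g in selected if g != f]))
        selected.remove(drop)
    return selected


WEIGHTS = {"a": 5, "b": 1, "c": 3, "d": 2}

def toy_criterion(subset):
    score = sum(WEIGHTS[f] for f in subset)
    if "b" in subset and "c" in subset:
        score += 10
    return score


print(sorted(plus_l_take_away_r("abcd", toy_criterion, 2)))  # prints ['b', 'c']
```

On this toy criterion a purely forward greedy search would keep "a", whereas the take-away step allows "a" to be dropped once "b" and "c" are both present.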

Pudil's Floating Feature Selection
Pudil's Floating Feature Selection (PFFS) is also derived from SFFS. It implements the "floating search algorithm" described in [5]. PFFS supports backtracking for as long as it increases the criterion function, which need not be monotonic. PFFS is believed to give results similar to those given by Branch and Bound, but it needs far less computational effort [6].
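The floating principle can be sketched as a forward step followed by conditional backward steps that are kept only while they improve the best subset known for the smaller size. This is a simplified illustration, not the algorithm of [5] in full detail; the function name and the toy criterion are hypothetical, and `target_size` is assumed not to exceed the number of features.

```python
def floating_forward_selection(features, criterion, target_size):
    """Return the best subset found for each size up to target_size."""
    selected = []
    best = {}  # subset size -> (score, subset) of the best subset seen so far

    def record(subset):
        """Store the subset if it beats the best one of the same size."""
        score = criterion(subset)
        if len(subset) not in best or score > best[len(subset)][0]:
            best[len(subset)] = (score, list(subset))
            return True
        return False

    while len(selected) < target_size:
        # Forward step: add the single most useful feature.
        remaining = [f for f in features if f not in selected]
        add = max(remaining, key=lambda f: criterion(selected + [f]))
        selected = selected + [add]
        record(selected)
        # Floating backward steps: keep removing while a record is improved.
        while len(selected) > 2:
            drop = max(selected, key=lambda f: criterion([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            if not record(reduced):
                break
            selected = reduced
    return {size: subset for size, (_, subset) in best.items()}


WEIGHTS = {"a": 5, "b": 1, "c": 3, "d": 2}

def toy_criterion(subset):
    score = sum(WEIGHTS[f] for f in subset)
    if "b" in subset and "c" in subset:
        score += 10
    return score


result = floating_forward_selection("abcd", toy_criterion, 3)
print(sorted(result[2]), sorted(result[3]))  # prints ['b', 'c'] ['a', 'b', 'c']
```

On the toy criterion the size-2 subset is first {a, c} and is later revised to {b, c} by a backward step, which a search without backtracking cannot do.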

Group Method of Data Handling
The Group Method of Data Handling (GMDH) is a set of several methods for constructing inductive models [7]. This approach is based on gradually sorting out complicated models and selecting the best solution on the basis of the minimum of an external criterion. This leads to the selection of valid features that are able to describe the analyzed influence in the data.
The vibration signals acquired on the gearbox were processed by the "Multilayered Iterative Algorithm" (MIA or MIA GMDH). This approach uses a data set to construct a model of a complex system. The model is represented by a neural network which has been trained using the genetic algorithm. The genetic algorithm not only adjusts the network, but also has an influence on the network topology.
The MIA algorithm works as follows. First, the initial population of units with a given polynomial transfer function is generated. The units have two inputs, and therefore all pair-wise combinations of input variables are employed. Then the coefficients of the unit transfer functions are estimated using stepwise regression or some other optimization method. The units are sorted by their error in modeling the output variable. A few of the best-performing units are selected as inputs for the next layer according to the rules of the genetic algorithm. Further layers are generated in the same way for as long as the modeling error decreases. The units that are connected to the features carrying most of the information provide the best results, and so they should survive in the network with higher probability. The fitted MIA GMDH neural network describes the solved problem via polynomial equations [7].

Group of Adaptive Models Evolution
The Group of Adaptive Models Evolution (GAME) [8] is derived from GMDH theory. It improves the Multilayered Iterative Algorithm (MIA). The GAME method uses the niching genetic algorithm [8] to build networks with neurons and connections suited to the data set. The connections can be more complex than those provided by MIA, and several types of neurons are possible.
GAME also contains a technique for verifying the models. A selected number of models are created during the training phase. The inner structures of all models are compared, and possible correlations in the inner structure are penalized in order to avoid creating similar models. Subsequently, the models obtained during the training phase are compared with each other and with the known correct values. This results in the selection of a few best models, and at the same time the credibility of the models is determined for each value of a known parameter.
It has been proven that GAME networks are able to solve a certain type of complex problems that cannot be solved using MIA GMDH [9]. The main disadvantage of using GAME is its higher computational cost.

Evaluation of methods used for feature selection
The efficiency of the methods described in section 3 was evaluated using Inter/Intra Class Distance and Selection Error Rate on Multiple Datasets.

Inter/Intra Class Distance
Because it is assumed that the torque load mainly affects the vibrations produced by the tested gear transmission, the significant features are those whose values differ for different loads. Conversely, the differences should be small in the case of the same load. Hence the load defines classes in the feature space. The values of the features for a given load value form a cluster.
Inter/Intra Class Distance (IICD) evaluates the ability of selected features to form clusters, and the ability of the clusters to be easily discriminated. The IICD criterion was calculated from the between-class scatter matrix $S_b$ and the within-class scatter matrix $S_w$:

$$S_b = \frac{1}{N_S} \sum_{k=1}^{K} N_k \, (s_k - s)(s_k - s)^T$$

$$S_w = \frac{1}{N_S} \sum_{k=1}^{K} \sum_{n=1}^{N_k} (z_{k,n} - s_k)(z_{k,n} - s_k)^T$$

where $N_S$ is the number of samples, $K$ is the number of classes, and $N_k$ is the number of samples included in class $k$. $s_k$ denotes the centre of class $k$, and $s$ is the centre of all samples $z_n$. $z_n$ denotes sample $n$, and $z_{k,n}$ denotes sample $n$ belonging to class $k$. Each sample is represented by its set of features.
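The class centres and scatter terms defined above can be turned into a small numerical example. The sketch below assumes one common form of the criterion, the ratio of the traces of the between-class and within-class scatter; the exact combination used in the study may differ (for example a form based on the inverse of the within-class scatter matrix), and the function names are hypothetical.

```python
def mean_vector(vectors):
    """Component-wise mean of a list of equally long vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inter_intra_distance(samples, labels):
    """Ratio trace(S_b) / trace(S_w); an assumed form of the IICD criterion."""
    n_s = len(samples)
    centre = mean_vector(samples)            # s, centre of all samples
    by_class = {}
    for z, k in zip(samples, labels):
        by_class.setdefault(k, []).append(z)
    trace_sb = 0.0
    trace_sw = 0.0
    for members in by_class.values():
        s_k = mean_vector(members)           # centre of class k
        trace_sb += len(members) * squared_distance(s_k, centre) / n_s
        trace_sw += sum(squared_distance(z, s_k) for z in members) / n_s
    return trace_sb / trace_sw


samples = [[0, 0], [0, 2], [10, 0], [10, 2]]
print(inter_intra_distance(samples, [0, 0, 1, 1]))  # classes split along x: 25.0
print(inter_intra_distance(samples, [0, 1, 0, 1]))  # classes split along y: 0.04
```

Well separated, compact clusters give a large value; the second labelling groups distant points together and therefore scores much lower.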

Selection Error Rate on Multiple Datasets
Selection Error Rate on Multiple Datasets (SERMD) is a novel method designed mainly for evaluating separation that uses a stochastic search. When applied to methods using a systematic search, SERMD evaluates the sensitivity of the search to scatter in the data. SERMD is based on analyzing the differences between the results (sets of selected features) achieved during single executions of the separation process performed by the same method on different datasets acquired during the same experiment with the same settings.
The presumption behind SERMD is that the results (selected features) given by the search should be the same or similar when it is applied to data acquired during the same experiment under the same conditions (the same experimental settings). Methods producing results that vary, or even results that are stochastic, are assumed not to be statistically reliable. It is also necessary to penalize methods that provide results that can be totally different for some executions depending on the data. On the other hand, methods that give results varying around the right set of features seem to be more acceptable.
SERMD analyzes the sets of $Z$ features and their ratings given by method $m$ when it is applied to $N$ datasets. The set of ratings of the features selected on dataset $n$ is denoted $z(n, m)$, which is a vector of length $Z$. Its elements are $z(n, m, f_i)$, where $f_i$ denotes the index of the selected feature (it matches the centre frequency $f_i$, or the index $i$, of the selected band). $z(n, m, f_i)$ corresponds to the rating of feature $f_i$ by method $m$ when it is applied to dataset $n$. When the method does not provide a feature rating, the rating is assigned as follows. If the feature has been selected, the rating is given by a uniform probability distribution over all the selected features. If it has not been selected, the rating is zero.
First, the Most Rated Features (MRFs) given by all methods are estimated. The MRFs are given by the histogram of the ratings of all features over all methods and all datasets available (equation (5)):

$$P(f_i) = \sum_{m} \sum_{n} z(n, m, f_i) \tag{5}$$

The Most Rated Features for all methods $f_E$ are given as follows:

$$f_E = \underset{f_i}{\operatorname{argmax}^{E}} \; P(f_i) \tag{6}$$

where the transformation "argmax" gives a vector of the $E$ features assigned to the $E$ maximum values included in $P(f_i)$, taken over all features $f_i$. The vector of MRFs given by all methods, $f_E$, forms a standard to which all methods are compared.
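The rating vectors and the standard set of Most Rated Features can be illustrated with a short sketch. It assumes that the histogram is a plain sum of the ratings over methods and datasets and that ties are broken in favour of the lower feature index; the function names and the toy selections are hypothetical. The later steps of SERMD are not covered here.

```python
def uniform_ratings(selected, n_features):
    """Rating vector for a method that returns only a set of selected features:
    equal probability for each selected feature, zero for the others."""
    ratings = [0.0] * n_features
    for i in selected:
        ratings[i] = 1.0 / len(selected)
    return ratings


def top_features(ratings, e):
    """Indices of the e highest-rated features (ties: lower index first)."""
    order = sorted(range(len(ratings)), key=lambda i: (-ratings[i], i))
    return sorted(order[:e])


def standard_mrfs(all_ratings, e):
    """all_ratings[m][n] is the rating vector of method m on dataset n."""
    n_features = len(all_ratings[0][0])
    histogram = [0.0] * n_features
    for per_method in all_ratings:
        for z in per_method:
            for i, value in enumerate(z):
                histogram[i] += value
    return top_features(histogram, e)


# Two hypothetical methods, two datasets, six features, E = 2.
method_a = [uniform_ratings([1, 4], 6), uniform_ratings([1, 4], 6)]
method_b = [uniform_ratings([1, 4], 6), uniform_ratings([0, 5], 6)]
print(standard_mrfs([method_a, method_b], e=2))  # prints [1, 4]
```

The per-method, per-dataset MRFs can be obtained in the same way by calling `top_features` on a single rating vector.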
The evaluation of method $m$ starts with estimating the MRFs given only by this method on dataset $n$. Each vector of these MRFs is estimated by equation (7):

$$f(n, m) = \underset{f_i}{\operatorname{argmax}^{E}} \; z(n, m, f_i) \tag{7}$$

Then the matrix $D_m$ is formed from the differences between the MRFs $f(n, m)$ given by method $m$ when applied to dataset $n$ and the standard MRFs $f_E$ given by all the methods and datasets. Matrix $D_m$ is created by stacking the differences $d(n, m)$, which form the columns of the matrix (equation (8)):

$$D_m = [\, d(1, m), \; d(2, m), \; \dots, \; d(N, m) \,] \tag{8}$$

The efficiency of the method is estimated from the $E$ minimum values included in $D_m$ according to equation (9):

$$d_E(m) = \operatorname{argmin}^{E} \; D_m \tag{9}$$

where the transformation "argmin" gives a vector of the $E$ minimum values included in $D_m$.
The value of the SERMD criterion is given by the mean absolute error of the selected features according to equation (10):

$$\mathrm{SERMD}(m) = \frac{1}{E} \sum_{e=1}^{E} \left| d_E(m, e) \right| \tag{10}$$

where $d_E(m, e)$ is a value included in $d_E(m)$.
The multiple datasets are created by splitting and shuffling the acquired dataset. Each class (torque load value) should be represented evenly, i.e. by the same or a similar number of samples in each dataset.
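One way to create such datasets is a stratified split, sketched below. The function name, the fixed seed, and the round-robin assignment are assumptions; a uniform random shuffle from the standard library is used, which may differ from the random process applied in the study.

```python
import random


def split_into_datasets(labels, n_datasets, seed=0):
    """Split record indices into n_datasets parts with each class spread evenly."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    datasets = [[] for _ in range(n_datasets)]
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Deal the shuffled records of this class out in round-robin fashion.
        for position, index in enumerate(members):
            datasets[position % n_datasets].append(index)
    for dataset in datasets:
        rng.shuffle(dataset)  # mix the classes inside each dataset
    return datasets


# Class sizes as in the described experiment: 40 records at 0 Nm and
# 38 records at each of 500, 1000, 1500 and 2000 Nm (192 records in total).
labels = [0] * 40 + [500] * 38 + [1000] * 38 + [1500] * 38 + [2000] * 38
parts = split_into_datasets(labels, n_datasets=2)
print([len(p) for p in parts])  # prints [96, 96]
```

With two datasets, each part receives 20 records of the 0 Nm class and 19 records of every other class.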

Testing stand
A universal testing stand was used to test the gearwheels. The stand is designed to shorten the lifetime of the gearwheels. The design of Niemann's closed loop circuit [10] was used for this purpose because of its lower energy consumption. The testing stand, shown in Fig. 3 and Fig. 4, consists of one measured gearbox and one auxiliary gearbox, a drive engine, tension equipment, and sensors for torque load, shaft revolution rate, temperature and vibration of the tested gear transmission.
Unlike the tested gearbox, the auxiliary gearbox is oversized with respect to the applied torque loads. The torque sensor works up to 2000 Nm. The circuit is dimensioned for a maximal virtual power of 785 kW and for a revolution rate of 1450 rpm. The tooth frequency $f_t$ was 398 Hz.

Data acquisition and sensor placing
The accelerometer was placed at a point located on the bearing housing above the shafts (shown in Fig. 2 as 2H). The vibrations were acquired using a Brüel & Kjær PULSE 7537 analyzer fitted with calibrated 4507 B accelerometers.
The attachment of the accelerometers enables operation up to an upper limit frequency of 3 kHz with a sensitivity of 10 mV/(m/s²).

Digital signal processing
The measured vibration signal was filtered by a low-pass filter with the stop frequency at 3 kHz. The power spectral density was estimated using Welch's method. A Hanning window was applied in the time domain. Welch's segment length was selected as one hundredth of the whole signal length. The overlap value was 25 %. The PSD of the acquired vibration was divided into uniformly spread bands with a constant bandwidth of 20 Hz. The division was performed in such a way that the TF lay in the centre of a band. The vibration in each band was represented by its power value. For feature selection, each measurement of the data set was represented by a set of 150 features.
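The conversion of a PSD into band-power features can be sketched as follows. The sketch assumes that the PSD has already been estimated on a uniform frequency grid (the Welch estimate itself is not shown), that band power is approximated by a rectangle-rule integral, and that the bands are aligned so that a reference frequency lies in the centre of a band; the function names and the toy spectrum are hypothetical.

```python
def band_edges(f_ref, bandwidth, f_max):
    """Edges of uniform bands aligned so that f_ref is the centre of a band."""
    offset = (f_ref - bandwidth / 2.0) % bandwidth  # lowest non-negative edge
    edges = []
    k = 0
    while offset + k * bandwidth <= f_max:
        edges.append(offset + k * bandwidth)
        k += 1
    return edges


def band_powers(freqs, psd, edges):
    """Integrate a PSD over each band [edges[k], edges[k+1]) (rectangle rule)."""
    df = freqs[1] - freqs[0]          # uniform frequency resolution assumed
    width = edges[1] - edges[0]
    powers = [0.0] * (len(edges) - 1)
    for f, p in zip(freqs, psd):
        k = int((f - edges[0]) // width)
        if 0 <= k < len(powers):
            powers[k] += p * df
    return powers


# Toy spectrum: flat PSD with 1 Hz resolution and a single peak at 40 Hz.
freqs = list(range(100))
psd = [1.0] * 100
psd[40] = 101.0

edges = band_edges(f_ref=40.0, bandwidth=20.0, f_max=100.0)
features = band_powers(freqs, psd, edges)
print(edges)     # prints [10.0, 30.0, 50.0, 70.0, 90.0]
print(features)  # prints [20.0, 120.0, 20.0, 20.0]
```

Each element of `features` plays the role of one feature passed to the feature selection methods; the band containing the peak carries most of the power.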
The vibration signals acquired during the stress test contained 5 classes formed by different values of the torque acting on the gearing. The dataset contained 192 records: 40 records for torque at 0 Nm, 38 records for 500 Nm, 38 records for 1000 Nm, 38 records for 1500 Nm, and 38 records for 2000 Nm. The record order was shuffled using a random process with normal distribution. The 10 most rated features were selected for each dataset and method.

Experimental results
A comparison of the results achieved by the methods is shown in Table 1.
The values of SERMD stated in Table 1 are given for an evaluation on 2 datasets. Each dataset contained 96 measurement samples. The 10 most rated features were selected for each dataset and method (E = 10). The results given by SERMD are strongly dependent on the size of the dataset used for feature selection. This dependency is shown for each method in Fig. 5. The dependency shown in Fig. 5 is an approximation of the dependency which can be estimated by repeated estimation on datasets created by random shuffling.
Given the differing computational cost of the methods, the required computing time is also of interest. Fig. 6 shows the maximum and minimum computing time that was needed to select the features. The values stated in Fig. 6 were estimated on a PC fitted with an Intel Pentium M processor operated at 1.73 GHz, 533 MHz FSB, and 2 GB RAM. Because of the way in which the times were estimated, the values stated in Fig. 6 are only indicative, but they provide an overview of the computational cost of the methods.
An overview of the Power Spectral Density of the vibration and its evaluation by the Group of Adaptive Models Evolution is shown in Fig. 7.

Conclusion
This paper has focused on evaluating applicable feature selection methods. The novel method presented here, Selection Error Rate on Multiple Datasets, takes into account the statistical reliability of the results achieved when the methods were applied repeatedly to multiple sets of signals acquired on the same machine under the same conditions. This is particularly important in the case of methods based on a stochastic search in the state space. When this criterion was applied to methods based on a systematic search, it evaluated the sensitivity of the search to scatter in the data. The methods for signal separation were applied in order to separate the vibration signals produced by a new design of

Fig. 1: Cascade diagram showing the dependency of the vibration power spectral density on the torque load acting on the gear transmission

Fig. 2: Experiment used for separating the vibration produced by the gear transmission from a mixture of vibrations measured on the gearbox housing (2H denotes the sensor placing)

Fig. 5: Dependency of SERMD on the size of the dataset used for feature selection

Table 1: Comparison of methods used for feature selection