A novel information-theoretic stepwise feature selector (ITSFS) is designed to reduce the dimension of diesel engine data. These data consist of 43 sensor measurements acquired from diesel engines that are either in a healthy state or in one of seven fault states. Using ITSFS, the minimum number of sensors is selected from the pool of 43 so that the eight engine states can be classified with reasonable accuracy. Various classifiers are trained and tested for fault classification accuracy on the field data before and after dimension reduction by ITSFS. The dimension reduction and classification process is repeated with other existing dimension reduction techniques, such as simulated annealing and regression subset selection, and the resulting classification accuracies are compared with those obtained on data reduced by the proposed feature selector.
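The abstract does not give the ITSFS algorithm itself, but a stepwise (greedy forward) mutual-information feature selector of the general kind it describes can be sketched as follows. This is an illustrative sketch only, not the authors' exact method; the scoring rule shown is the relevance-minus-redundancy heuristic of Peng et al. [8], and all function names are hypothetical.

```python
# Illustrative sketch of a mutual-information-based stepwise feature
# selector (mRMR-style, after Peng et al. [8]); NOT the paper's ITSFS.
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        mi += p_joint * math.log2(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi

def stepwise_select(features, labels, k):
    """Greedily add up to k features (sensors), each step picking the
    candidate with the best relevance-to-class minus mean redundancy
    against the features already selected."""
    selected = []
    remaining = list(range(len(features)))
    while remaining and len(selected) < k:
        def score(j):
            relevance = mutual_information(features[j], labels)
            redundancy = (sum(mutual_information(features[j], features[s])
                              for s in selected) / len(selected)
                          if selected else 0.0)
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

In practice each `features[j]` would be one discretized sensor channel and `labels` the engine-state label (healthy or one of the seven faults); continuous sensor readings would first need binning or a density estimator (e.g., the Parzen-window approach of Kwak and Choi [21]).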

1. Bell, C. B., 1962, “Mutual Information and Maximal Correlation as Measures of Dependence,” Ann. Math. Stat. 0003-4851, 33, pp. 587–595.
2. Cover, T., and Thomas, J., 1993, Elements of Information Theory, Wiley, New York.
3. Hyvärinen, A., and Oja, E., 2000, “Independent Component Analysis: Algorithms and Applications,” Neural Networks 0893-6080, 13(4–5), pp. 411–430.
4. Huber, P., 1985, “Projection Pursuit,” Ann. Stat. 0090-5364, 13(2), pp. 435–475.
5. Schwarz, G., 1978, “Estimating the Dimension of a Model,” Ann. Stat. 0090-5364, 6(2), pp. 461–464.
6. Mallows, C., 1973, “Some Comments on Cp,” Technometrics 0040-1706, 15(4), pp. 661–675.
7. Akaike, H., 1974, “A New Look at the Statistical Model Identification,” IEEE Trans. Autom. Control 0018-9286, 19(6), pp. 716–723.
8. Peng, H., Long, F., and Ding, C., 2005, “Feature Selection Based on Mutual Information: Criteria of Max-Dependency, Max-Relevance, and Min-Redundancy,” IEEE Trans. Pattern Anal. Mach. Intell. 0162-8828, 27(8), pp. 1226–1238.
9. Warner, B., and Misra, M., 1996, “Understanding Neural Networks as Statistical Tools,” Am. Stat. 0003-1305, 50(4), pp. 284–293.
10. Hastie, T., Tibshirani, R., and Friedman, J., 2001, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, New York.
11. Vapnik, V., 1999, “An Overview of Statistical Learning Theory,” IEEE Trans. Neural Netw. 1045-9227, 10(5), pp. 988–1000.
12. Jain, A., Duin, P., and Mao, J., 2000, “Statistical Pattern Recognition: A Review,” IEEE Trans. Pattern Anal. Mach. Intell. 0162-8828, 22(1), pp. 4–37.
13. Stoica, P., and Selen, Y., 2004, “Model Selection: A Review of Information Criterion Rules,” IEEE Signal Process. Mag. 1053-5888, 21(4), pp. 36–47.
14. Liu, H., and Motoda, H., 1998, Feature Extraction, Construction and Selection: A Data Mining Perspective, 2nd ed., Kluwer Academic, Boston.
15. Hand, D., 1997, Construction and Assessment of Classification Rules, 2nd ed., Wiley, Chichester.
16. Gordon, A., 1999, Classification, Chapman and Hall/CRC, Boca Raton.
17. Deignan, P. B., Meckl, P. H., Franchek, M. A., Zhu, G. G., and Jaliwala, S. A., 2000, “The Mutual Information-Radial Basis Function Network,” Intelligent Engineering Systems Through Artificial Neural Networks, ASME Press, New York, Vol. 10, pp. 83–90.
18. Vasconcelos, N., 2003, “A Family of Information-Theoretic Algorithms for Low-Complexity Discriminant Feature Selection in Image Retrieval,” Proceedings of the International Conference on Image Processing, Vol. 3, pp. 741–744.
19. Al-Ani, A., and Deriche, M., 2001, “An Optimal Feature Selection Technique Using the Concept of Mutual Information,” International Symposium on Signal Processing and Its Applications, Vol. 2, pp. 477–480.
20. Scott, D. W., 1992, Multivariate Density Estimation: Theory, Practice, and Visualization, Wiley, London.
21. Kwak, N., and Choi, C.-H., 2002, “Input Feature Selection by Mutual Information Based on Parzen Window,” IEEE Trans. Pattern Anal. Mach. Intell. 0162-8828, 24(12), pp. 1667–1671.
22. Battiti, R., 1994, “Using Mutual Information for Selecting Features in Supervised Neural Net Learning,” IEEE Trans. Neural Netw. 1045-9227, 5(4), pp. 537–550.
23. Cerdeira, J., Silva, P., Cadima, J., and Minhoto, M., 2005, ANNEAL, Subselect Package, The R Project for Statistical Computing.
24. Lumley, T., and Miller, A., 2004, Regsubsets, Leaps Package, The R Project for Statistical Computing.
25. Neter, J., Kutner, M. H., Wasserman, W., and Nachtsheim, C. J., 1996, Applied Linear Statistical Models, McGraw-Hill, New York.