A theoretical and experimental approach to the use of information theory for input space selection in modeling and diagnostic applications is examined. The assumptions and test cases used throughout the paper are tailored specifically to diesel engine diagnostic and modeling applications. This work seeks to quantify the amount of information about an output that is contained within an input space. The information-theoretic quantity conditional entropy is shown to be an accurate predictor of model and diagnostic algorithm performance, and is therefore a good choice as an input vector selection metric. Methods of estimating conditional entropy from collected data, including the amount of data required, are also discussed.