Abstract
The purpose of this study is to investigate the generalization power of a modified backpropagation training algorithm referred to as "weight decay". In particular, we focus on the effect of the weight decay method on data sets with missing values. Three data sets with real missing values and three data sets with missing values created by randomly deleting attribute values are adopted as the test bed in this study. We first reconstruct missing values using four different methods, namely, standard backpropagation, iterative multiple regression, replacement with the attribute average, and replacement with zero. Then standard backpropagation and weight decay backpropagation are used to train networks for classification predictions. Experimental results show that weight decay backpropagation achieves performance at least equivalent to that of standard backpropagation. In addition, there is evidence that standard backpropagation is a viable tool for reconstructing missing values. Experimental results also show that within the same data set, the higher the percentage of missing values, the greater the differential effect of the reconstruction methods.
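To make the core idea concrete, the following is a minimal sketch of weight-decay backpropagation on a toy two-class problem. It is not the paper's implementation: the network size, learning rate, decay coefficient `lam`, and the synthetic data are all illustrative assumptions. Weight decay simply adds an L2 penalty term `lam * w` to each weight's gradient, which shrinks weights toward zero during training and is the mechanism credited with improved generalization.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: two Gaussian clusters in 2-D, labeled 0 and 1.
X = np.vstack([rng.normal(-1, 0.5, (50, 2)), rng.normal(1, 0.5, (50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)]).reshape(-1, 1)

# One hidden layer of 4 sigmoid units (sizes are illustrative).
W1 = rng.normal(0, 0.5, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0, 0.5, (4, 1)); b2 = np.zeros(1)

eta, lam = 0.5, 1e-3  # learning rate and weight-decay coefficient (assumed)

for epoch in range(500):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass for squared-error loss plus the L2 penalty (lam/2)*sum(w^2).
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Weight decay: the penalty contributes lam * w to each weight gradient;
    # biases are conventionally left undecayed.
    W2 -= eta * (h.T @ d_out / len(X) + lam * W2)
    b2 -= eta * d_out.mean(axis=0)
    W1 -= eta * (X.T @ d_h / len(X) + lam * W1)
    b1 -= eta * d_h.mean(axis=0)

pred = (sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) > 0.5).astype(float)
accuracy = (pred == y).mean()
```

Removing the `lam * W` terms recovers standard backpropagation, which is the baseline the study compares against.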
Gupta, A., Lam, M. The weight decay backpropagation for generalizations with missing values. Annals of Operations Research 78, 165–187 (1998). https://doi.org/10.1023/A:1018945915940