Abstract
Component failures in hybrid electric vehicles (HEVs) can cause high warranty costs for car manufacturers. Hence, in order to (1) predict whether a component of the hybrid power-train of an HEV is faulty and (2) identify loads related to component failures, we train several random forest variants on so-called load spectrum data, i.e., the state-of-the-art data used for calculating the fatigue life of components in fatigue analysis. We propose a parameter tuning framework that enables the studied random forest models, built from univariate and multivariate decision trees, respectively, to handle the class imbalance of our dataset and to select only a small number of relevant variables, both to improve classification performance and to identify failure-related variables. For failures of the hybrid car battery (approx. 200 faulty and 7000 non-faulty vehicles), our models achieve an average balanced accuracy of 85.2 % while reducing the number of variables used from 590 to 22, demonstrating that balanced random forests based on univariate decision trees in particular achieve promising classification results on load spectrum data. Moreover, the selected variables can be related to component failures of the hybrid power-train.
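The combination described above — a class-balanced random forest, importance-based variable selection, and evaluation by balanced accuracy — can be sketched as follows. This is an illustrative sketch on synthetic data, not the authors' exact pipeline or tuning framework; the dataset shape, the number of retained variables, and all parameter values are assumptions for demonstration.

```python
# Sketch: balanced random forest with crude importance-based variable
# selection, scored by balanced accuracy (the paper's evaluation metric).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for load spectrum data: many variables, few of them
# informative, and a heavy class imbalance (faulty vs. non-faulty vehicles).
X, y = make_classification(n_samples=2000, n_features=100, n_informative=8,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Class-balanced variant: sample weights are rebalanced per bootstrap sample
# so both classes contribute equally to each tree ("balanced_subsample").
rf = RandomForestClassifier(n_estimators=500,
                            class_weight="balanced_subsample",
                            random_state=0).fit(X_tr, y_tr)

# Keep only the highest-importance variables and refit on the reduced set.
top = np.argsort(rf.feature_importances_)[-20:]
rf_sel = RandomForestClassifier(n_estimators=500,
                                class_weight="balanced_subsample",
                                random_state=0).fit(X_tr[:, top], y_tr)

# Balanced accuracy = mean of per-class recalls, robust to class imbalance.
bacc = balanced_accuracy_score(y_te, rf_sel.predict(X_te[:, top]))
print(f"balanced accuracy with {len(top)} of {X.shape[1]} variables: {bacc:.3f}")
```

With strongly imbalanced classes, plain accuracy would reward always predicting "non-faulty"; balanced accuracy averages recall over both classes, which is why it is the natural metric here.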
Acknowledgments
P. Bergmeir participates in the doctoral program “Promotionskolleg HYBRID”, funded by the Ministry for Science, Research and Arts Baden-Württemberg, Germany. For computational resources, the authors acknowledge the bwGRiD (http://www.bw-grid.de), member of the German D-Grid initiative, funded by the Ministry for Education and Research and the Ministry for Science, Research and Arts Baden-Württemberg, Germany.
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Bergmeir, P., Nitsche, C., Nonnast, J. et al. Classifying component failures of a hybrid electric vehicle fleet based on load spectrum data. Neural Comput & Applic 27, 2289–2304 (2016). https://doi.org/10.1007/s00521-015-2065-y