Abstract
Naive Bayes (NB) is one of the top ten machine learning algorithms, yet its attribute independence assumption rarely holds in practice. A feasible and efficient way to improve NB is to relax this assumption by adding augmented edges to NB's restricted topology. In this paper we prove theoretically that such a generalized topology may be a suboptimal model of the multivariate probability distribution if its fitness to the data cannot be measured. We therefore adopt the log-likelihood function as the scoring function and introduce an efficient heuristic search strategy to explore high-dependence relationships; at each iteration the learned topology is refined to fit the data better. The proposed algorithm, called the log-likelihood Bayesian classifier (LLBC), learns two submodels, one from the labeled training set and one from each individual unlabeled testing instance, and makes them work jointly for classification in an ensemble learning framework. Extensive experimental evaluations on 36 benchmark datasets from the University of California at Irvine (UCI) machine learning repository reveal that LLBC delivers excellent classification performance and provides a competitive approach to learning from labeled and unlabeled data.
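To make the abstract's core idea concrete, the following is a minimal Python sketch of log-likelihood-scored structure search: starting from the naive Bayes topology, greedily add the augmenting edge that most improves the (Laplace-smoothed) log-likelihood until no edge helps. All names (`log_likelihood`, `greedy_augment`, `creates_cycle`), the smoothing constant, and the one-augmenting-parent restriction are illustrative assumptions, not the authors' LLBC implementation; see the repository linked in the Notes for the actual code.

```python
import numpy as np

def log_likelihood(X, y, parents, alpha=1.0):
    """Smoothed log-likelihood of discrete data under a class-augmented
    topology. parents[j] is attribute j's augmenting parent (or None);
    the class variable is implicitly a parent of every attribute."""
    n, d = X.shape
    classes = np.unique(y)
    ll = 0.0
    # Class prior term: sum_c N_c * log P(c)
    for c in classes:
        nc = np.sum(y == c)
        ll += nc * np.log((nc + alpha) / (n + alpha * len(classes)))
    # Attribute terms: P(x_j | class) or P(x_j | class, x_parent)
    for j in range(d):
        vals_j = np.unique(X[:, j])
        parent_vals = [None] if parents[j] is None else np.unique(X[:, parents[j]])
        for c in classes:
            for pv in parent_vals:
                mask = (y == c) if pv is None else (y == c) & (X[:, parents[j]] == pv)
                tot = np.sum(mask)
                for v in vals_j:
                    cnt = np.sum(X[mask, j] == v)
                    ll += cnt * np.log((cnt + alpha) / (tot + alpha * len(vals_j)))
    return ll

def creates_cycle(parents, child, parent):
    """With at most one augmenting parent per attribute, adding the edge
    parent -> child creates a cycle iff child is an ancestor of parent."""
    node = parent
    while node is not None:
        if node == child:
            return True
        node = parents[node]
    return False

def greedy_augment(X, y):
    """Start from naive Bayes (no augmenting edges) and greedily add the
    single parent edge that most improves the score, until none helps."""
    d = X.shape[1]
    parents = [None] * d
    best = log_likelihood(X, y, parents)
    improved = True
    while improved:
        improved = False
        for j in range(d):
            if parents[j] is not None:
                continue  # keep at most one augmenting parent per attribute
            for p in range(d):
                if p == j or creates_cycle(parents, j, p):
                    continue
                trial = parents.copy()
                trial[j] = p
                score = log_likelihood(X, y, trial)
                if score > best:
                    best, parents = score, trial
                    improved = True
    return parents, best

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.integers(0, 3, size=(200, 4))  # 4 discrete attributes
    y = rng.integers(0, 2, size=200)       # binary class label
    parents, score = greedy_augment(X, y)
    print("augmenting parents:", parents, "log-likelihood:", score)
```

This sketch covers only the training-set side; per the abstract, LLBC additionally learns a second submodel from each unlabeled testing instance and combines the two predictions in an ensemble.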
Data Availability and Access
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Notes
The source code of LLBC can be found at https://github.com/Wangjj1129/LLBC.
The datasets used in this study can be found at https://archive.ics.uci.edu/ml/datasets.html.
Acknowledgements
This work is supported by the National Key Research and Development Program of China (No. 2019YFC1804804), Open Research Project of the Hubei Key Laboratory of Intelligent Geo-Information Processing (No. KLIGIP-2021A04), and the Scientific and Technological Developing Scheme of Jilin Province (No. 20200201281JC).
Author information
Authors and Affiliations
Contributions
Limin Wang: Methodology, Supervision, Writing-review & editing, Funding acquisition. Junjie Wang: Conceptualization, Validation, Visualization, Writing-original draft. Lu Guo: Formal analysis, Project administration. Qilong Li: Software, Investigation.
Corresponding author
Ethics declarations
Competing interests
The authors declare that they have no conflict of interest.
Ethical and informed consent for data used
This study does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, L., Wang, J., Guo, L. et al. Efficient heuristics for learning scalable Bayesian network classifier from labeled and unlabeled data. Appl Intell 54, 1957–1979 (2024). https://doi.org/10.1007/s10489-023-05242-8