Abstract
Data has become an asset in the digitization revolution, and the healthcare sector is a leading source of this big data. Healthcare data analysis is a powerful source of new insights that can improve awareness of well-being. Data in the healthcare sector covers symptoms, treatments, disease information, patient information, and the tests used to detect diseases. Combined with machine learning (ML) algorithms, healthcare information supports the examination of big data to discover hidden patterns in a condition, which can then be used to predict disease. This paper proposes a decision-support framework for disease prediction in the healthcare sector. It proposes an Improvised ID3 algorithm (Modified ID3), based on the simple decision tree algorithm ID3, which reduces time complexity and computational cost by applying arithmetic operations when computing entropy and information gain. The Modified ID3 algorithm is implemented in Python on a reduced feature set of the Hepatitis C virus dataset (Hoffmann et al. in J Lab Precis Med 3:58, 2018), alongside standard ML algorithms such as ID3, support vector machine, and random forest, and other recent state-of-the-art methods. The performance of the proposed algorithm and the other ML algorithms is evaluated with a confusion matrix across several assessment metrics.
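For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of the entropy and information-gain step of standard ID3 (Quinlan 1986), not the Modified ID3 proposed in the paper, whose arithmetic shortcuts are not described in the abstract. The function names and the four-row toy dataset are hypothetical and are not drawn from the Hepatitis C virus dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # Sum p * log2(1/p) over the observed classes.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def information_gain(rows, labels, feature_index):
    """Entropy reduction obtained by splitting on one categorical feature."""
    total = len(labels)
    # Group the labels by the value the chosen feature takes in each row.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[feature_index], []).append(label)
    # Weighted average entropy of the partitions after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def best_feature(rows, labels):
    """Index of the feature with the highest information gain (ID3's split rule)."""
    return max(range(len(rows[0])), key=lambda i: information_gain(rows, labels, i))


# Hypothetical toy data: two categorical features per patient.
rows = [("high", "yes"), ("high", "no"), ("low", "yes"), ("low", "no")]
labels = ["sick", "sick", "healthy", "healthy"]

print(entropy(labels))
print(information_gain(rows, labels, 0), information_gain(rows, labels, 1))
print(best_feature(rows, labels))
```

In this toy data the first feature separates the two classes completely while the second carries no information about the label, so ID3 would split on the first feature. Each candidate split needs a logarithm per class per partition, which is the entropy cost the abstract says the Modified ID3 aims to reduce.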
References
Akay A, Dragomir A, Erlandsson BE (2014) Network-based modeling and intelligent data mining of social media for improving care. IEEE J Biomed Health Inform 19(1):210–218
Arif F, Suryana N, Hussin B (2013) Cascade quality prediction method using multiple PCA+ ID3 for multi-stage manufacturing system. IERI Procedia 4:201–207
Baitharu TR, Pani SK (2016) Analysis of data mining techniques for healthcare decision support system using liver disorder dataset. Procedia Comput Sci 85:862–870
Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
Capizzi G, Coco S, Sciuto GL, Napoli C, Hołubowski W (2019) An entropy evaluation algorithm to improve transmission efficiency of compressed data in pervasive healthcare mobile sensor networks. IEEE Access 8:4668–4678
Castaldo R, Melillo P, Izzo R, De Luca N, Pecchia L (2016) Fall prediction in hypertensive patients via short-term HRV analysis. IEEE J Biomed Health Inform 21(2):399–406
Chen M, Hao Y, Hwang K, Wang L, Wang L (2017) Disease prediction by machine learning over big data from healthcare communities. IEEE Access 5:8869–8879
Cheng YT, Lin YF, Chiang KH, Tseng VS (2017) Mining sequential risk patterns from large-scale clinical databases for early assessment of chronic diseases: a case study on chronic obstructive pulmonary disease. IEEE J Biomed Health Inform 21(2):303–311
Coleman JN, Chester EI, Softley CI, Kadlec J (2000) Arithmetic on the European logarithmic microprocessor. IEEE Trans Comput 49(7):702–715
De Mántaras RL (1991) A distance-based attribute selection measure for decision tree induction. Mach Learn 6(1):81–92
Elhadjamor EA, Ghannouchi SA (2019) Analyze in depth health care business process and key performance indicators using process mining. Procedia Comput Sci 164:610–617
Forkan ARM, Khalil I, Ibaida A, Tari Z (2015) BDCaM: big data for context-aware monitoring—a personalized knowledge discovery framework for assisted healthcare. IEEE Trans Cloud Comput 5(4):628–641
Forsberg D, Rosipko B, Sunshine JL (2016) Analyzing PACS usage patterns by means of process mining: steps toward a more detailed workflow analysis in radiology. J Digit Imaging 29(1):47–58
Hajihashemi Z, Popescu M (2015) A multidimensional time-series similarity measure with applications to eldercare monitoring. IEEE J Biomed Health Inform 20(3):953–962
Haq AU, Li JP, Khan J, Memon MH, Nazir S, Ahmad S, Ali A (2020) Intelligent machine learning approach for effective recognition of diabetes in E-healthcare using clinical data. Sensors 20(9):2649
Hoffmann G, Bietenbeck A, Lichtinghagen R, Klawonn F (2018) Using machine learning techniques to generate laboratory diagnostic pathways—a case study. J Lab Precis Med 3:58
Ismail WN, Hassan MM, Alsalamah HA (2018) Mining of productive periodic-frequent patterns for IoT data analytics. Futur Gener Comput Syst 88:512–523
Jain K, Kumar A (2020) An energy-efficient prediction model for data aggregation in sensor network. J Ambient Intell Humaniz Comput 11(11):5205–5216
Jain K, Kumar A (2021) ST-DAM: exploiting spatial and temporal correlation for energy-efficient data aggregation method in heterogeneous WSN. Int J Wirel Mob Comput 21(3):285–294
Jain K, Singh A (2021) An empirical cluster head selection and data aggregation scheme for a fault-tolerant sensor network. Int J Distrib Syst Technol (IJDST) 12(3):27–47
Jain K, Gupta M, Abraham A (2021) A review on privacy and security assessment of cloud computing. J Inf Assur Secur 16(5):161–168
Jain K, Singh A, Singh P, Yadav S (2022) An improved supervised classification algorithm in healthcare diagnostics for predicting opioid habit disorder. Int J Reliab Qual E-Healthc (IJRQEH) 11(1):1–16
Jin J, Sun W, Al-Turjman F, Khan MB, Yang X (2020) Activity pattern mining for healthcare. IEEE Access 8:56730–56738
Kibria MG, Nguyen K, Villardi GP, Zhao O, Ishizu K, Kojima F (2018) Big data analytics, machine learning, and artificial intelligence in next-generation wireless networks. IEEE Access 6:32328–32338
Kumar A, Srivastav AL, Dutt I, Bajaj K (2021) Classification of existing health model of India at the end of the twelfth plan using enhanced decision tree algorithm. Pertanika J Sci Technol. https://doi.org/10.47836/pjst.29.4.06
Leung CS, Wong KW, Sum PF, Chan LW (2001) A pruning method for the recursive least squared algorithm. Neural Netw 14(2):147–174
Li Y, Bai C, Reddy CK (2016) A distributed ensemble approach for mining healthcare data under privacy constraints. Inf Sci 330:245–259
Puppala M, He T, Chen S, Ogunti R, Yu X, Li F, Wong ST (2015) METEOR: an enterprise health informatics environment to support evidence-based medicine. IEEE Trans Biomed Eng 62(12):2776–2786
Quinlan JR (1986) Induction of decision trees. Mach Learn 1(1):81–106
Raghuvanshi KK, Agarwal A, Jain K, Singh VB (2021) A generalized prediction model for improving software reliability using time-series modelling. Int J Syst Assur Eng Manag 13:1309
Raj RJS, Shobana SJ, Pustokhina IV, Pustokhin DA, Gupta D, Shankar K (2020) Optimal feature selection-based medical image classification using deep learning model in internet of medical things. IEEE Access 8:58006–58017
Rejab FB, Nouira K, Trabelsi A (2014) Health monitoring systems using machine learning techniques. Intelligent systems for science and information. Springer, Cham, pp 423–440
Rojas E, Munoz-Gama J, Sepúlveda M, Capurro D (2016) Process mining in healthcare: a literature review. J Biomed Inform 61:224–236
Shi Z, Zuo W, Liang S, Zuo X, Yue L, Li X (2020) IDDSAM: an integrated disease diagnosis and severity assessment model for intensive care units. IEEE Access 8:15423–15435
Suresh A, Udendhran R, Balamurgan M, Varatharajan R (2019) A novel internet of things framework integrated with real time monitoring for intelligent healthcare environment. J Med Syst 43(6):1–10
Xiong Y, Lu Y (2020) Deep feature extraction from the vocal vectors using sparse autoencoders for Parkinson’s classification. IEEE Access 8:27821–27830
Zhang Y (2012) Support vector machine classification algorithm and its application. In: International conference on information computing and applications. Springer, Berlin, Heidelberg, pp 179–186
Funding
This research received no specific grant from any funding agency.
Ethics declarations
Conflict of interest
The authors have no conflict of interest.
Human and animal participants
This work does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Agarwal, A., Jain, K. & Yadav, R.K. A mathematical model based on modified ID3 algorithm for healthcare diagnostics model. Int J Syst Assur Eng Manag 14, 2376–2386 (2023). https://doi.org/10.1007/s13198-023-02086-w
DOI: https://doi.org/10.1007/s13198-023-02086-w