Abstract
As automation and computing technology advances, traffic data can be easily collected from multiple sources, such as sensors and surveillance cameras. Extracting value from the huge volumes of available data requires the capability to process large datasets and extract patterns from them. In this paper, a machine learning method embedded within a big data analytics platform is constructed using the random forests method and Apache Hadoop to predict highway travel time based on data collected from the highway electronic toll collection system in Taiwan. Various prediction models are then developed for highway travel time based on historical and real-time data to provide drivers with estimated and adjusted travel time information.
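To make the core idea concrete, the following is a minimal sketch of random forest regression (bootstrap-sampled regression trees whose predictions are averaged) applied to travel time. It is not the authors' Hadoop-based implementation: the data are synthetic, and the feature names (`hour`, `speed`), the 30 km segment length, and all function names and parameter values are assumptions chosen only for illustration.

```python
import random
from statistics import mean


def sse(rows):
    """Sum of squared errors of the targets around their mean."""
    m = mean(y for _, y in rows)
    return sum((y - m) ** 2 for _, y in rows)


def grow_tree(rows, rng, depth=0, max_depth=6, min_rows=4, n_try=1):
    """Grow one regression tree; each node considers n_try random features."""
    targets = [y for _, y in rows]
    if depth >= max_depth or len(rows) < min_rows:
        return mean(targets)  # leaf: average travel time of the rows here
    best = None
    for f in rng.sample(range(len(rows[0][0])), n_try):
        # Skipping the smallest value keeps both sides of the split non-empty.
        for t in sorted({x[f] for x, _ in rows})[1:]:
            left = [r for r in rows if r[0][f] < t]
            right = [r for r in rows if r[0][f] >= t]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # the chosen feature had a single value: stop splitting
        return mean(targets)
    _, f, t, left, right = best
    return (f, t,
            grow_tree(left, rng, depth + 1, max_depth, min_rows, n_try),
            grow_tree(right, rng, depth + 1, max_depth, min_rows, n_try))


def predict_tree(node, x):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] < t else right
    return node


def grow_forest(rows, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    return [grow_tree(rng.choices(rows, k=len(rows)), rng)
            for _ in range(n_trees)]


def predict_forest(forest, x):
    """Average the predictions of all trees."""
    return mean(predict_tree(tree, x) for tree in forest)


def make_toy_data(seed=1):
    """Synthetic records: ([hour, speed in km/h], travel time in minutes)."""
    rng = random.Random(seed)
    rows = []
    for hour in range(0, 24, 3):
        for speed in range(30, 111, 5):
            minutes = 30 / speed * 60 + rng.gauss(0, 1)  # hypothetical 30 km segment
            rows.append(([hour, speed], minutes))
    return rows


rows = make_toy_data()
forest = grow_forest(rows)
slow = predict_forest(forest, [9, 40])   # congested conditions
fast = predict_forest(forest, [9, 100])  # free-flow conditions
print(f"predicted minutes at 40 km/h: {slow:.1f}, at 100 km/h: {fast:.1f}")
```

In this toy setup the forest should predict a longer travel time for the slower observed speed. A production system of the kind the abstract describes would train on real toll collection records and distribute the work across a cluster rather than run in a single process.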
Acknowledgements
This study was partially funded by the Ministry of Science and Technology (Taiwan), Grant MOST 105-2221-E-027-052-MY3.
Ethics declarations
Conflict of interest
All the authors of this paper declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by Y. Ni.
About this article
Cite this article
Fan, SK.S., Su, CJ., Nien, HT. et al. Using machine learning and big data approaches to predict travel time based on historical and real-time data from Taiwan electronic toll collection. Soft Comput 22, 5707–5718 (2018). https://doi.org/10.1007/s00500-017-2610-y
DOI: https://doi.org/10.1007/s00500-017-2610-y