Factors That Influence the Type of Road Traffic Accidents: A Case Study in a District of Portugal
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Area
2.2. Data
- Crash: an RTA in which the driver loses control of the vehicle and may deviate from or leave the traffic lane or carriageway in which it is travelling and/or collide with other users of the public road or with obstacles outside the carriageway (includes pavement, vehicle stops, poles, vertical and light signs and other road equipment or trees, rocks, etc.).
- Collision: an RTA resulting from a conflict between a moving vehicle and one or more other vehicles (moving, stopped or parked) or obstacles on the carriageway (includes eyelets, dividers, fences, safety guards, center plates and other road equipment or potholes, rocks, etc.).
- Pedestrian running over: an RTA resulting from a conflict between a moving vehicle and a pedestrian or animal. It excludes situations in which the pedestrian or animal contributed to the accident but was not hit by the vehicle (that is, there was no collision).
- Accident and geographical variables: municipality, location, type, causes, and whether the RTA occurred in a parking area.
- Road variables: type of road, type of roadside, road layout, type of lane, road conservation state, the existence of works on the road, the existence of light signals and the existence of pavement marks.
- Vehicle variables: age and type of the vehicle(s) involved in the RTA.
- Driver variables: gender, age, and whether the driver ran away from the RTA scene.
- Victim variables: type of victim and injury severity of the victim within 30 days.
- Meteorological variables: precipitation, temperature, wind speed, and whether the weather was considered good, that is, there was no fog, rain, strong wind, snow, smoke cloud or hail.
- Time variables: date and hour.
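
The good-weather indicator listed among the meteorological variables is a derived flag: the weather counts as good only when none of the six adverse conditions was recorded. A minimal sketch of that rule follows. The record type `WeatherObservation` and its field names are hypothetical and chosen for illustration; they are not the column names of the study's dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeatherObservation:
    """Hypothetical record; field names are illustrative, not from the dataset."""
    fog: bool = False
    rain: bool = False
    strong_wind: bool = False
    snow: bool = False
    smoke_cloud: bool = False
    hail: bool = False


def is_good_weather(obs: WeatherObservation) -> bool:
    """Return True only when none of the six adverse conditions is present."""
    adverse = (obs.fog, obs.rain, obs.strong_wind,
               obs.snow, obs.smoke_cloud, obs.hail)
    return not any(adverse)


# Usage: a dry, clear record is good weather; a rainy one is not.
clear = WeatherObservation()
rainy = WeatherObservation(rain=True)
print(is_good_weather(clear), is_good_weather(rainy))
```

Encoding the flag as a single boolean mirrors the two-level "Good weather" variable (Yes/No) reported in the descriptive table.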
2.3. Methodology
2.3.1. Multinomial Logistic Model
2.3.2. Machine Learning
2.3.3. Oversampling Approach
2.3.4. Performance Metrics
3. Results
- Geographic factors: municipality, RTA in a parking area, and RTA located inside/outside an urban area;
- Time factors: month, day of the week and hour of the RTA;
- Weather factors: temperature, and weather conditions (good or rain/other conditions);
- Road characteristics factors: road layout, and type of road;
- Driver’s characteristics factors: % of male drivers, and age of the oldest driver;
- Vehicle’s characteristics factors: type of vehicle, and median vehicle’s age.
3.1. Statistical Multinomial Logit Model
3.2. Machine Learning Algorithms
3.3. Comparison of Models
4. Final Remarks
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
AE | Highway |
ANSR | Autoridade Nacional de Segurança Rodoviária (National Road Safety Authority) |
BEAV | Statistical Bulletin of Road Traffic Accidents |
EN | National Road |
EU | European Union |
GDP | Gross Domestic Product |
GNR | Guarda Nacional Republicana (National Republican Guard) |
IC | Complementary Itinerary |
IP | Principal Itinerary |
IPMA | Instituto Português do Mar e da Atmosfera (Portuguese Institute for Sea and Atmosphere) |
KNN | K-Nearest Neighbor |
MCC | Matthews Correlation Coefficient |
ML | Machine Learning |
MOPREVIS | Modeling and Prediction of Road Traffic Accidents in the District of Setúbal |
RF | Random Forest |
ROSE | Random Over-Sampling Examples |
RTA | Road Traffic Accident |
SMOTE | Synthetic Minority Oversampling Technique |
SVM | Support Vector Machine |
TC-GNR | Territorial Command of the GNR |
| Variable | Categories (or Mean ± SD) | n |
|---|---|---|
| Accident and geographical variables | | |
| Municipality | Setúbal/Alcochete/Seixal/Montijo | 7865 |
| | Alcácer do Sal | 1184 |
| | Sesimbra/Almada | 7760 |
| | Moita/Sines | 2876 |
| | Palmela/Barreiro | 5565 |
| | Grândola | 1195 |
| | Santiago do Cacém | 1557 |
| Accident location | Inside urban area | 19,321 |
| | Outside urban area | 8681 |
| Causes | Distraction | 3452 |
| | Irregular manoeuvre | 868 |
| | Lack of dexterity | 807 |
| | Disregard of vertical signs | 803 |
| | Disregard of safety distances | 588 |
| | Excessive speed | 555 |
| | Alcohol influence | 343 |
| | Other reasons | 882 |
| Occurring in a parking area | Yes | 846 |
| | No | 27,156 |
| Road variables | | |
| Type of road | Highway/Bridge | 2411 |
| | EN | 5308 |
| | IC/IP | 1776 |
| | Other type | 18,507 |
| Type of roadside | Paved | 3833 |
| | Unpaved or non-existent | 4289 |
| Road layout | Curve | 4302 |
| | Straight | 23,650 |
| Type of lane | With central separation | 4400 |
| | Without central separation | 23,601 |
| Road conservation state | Good | 4613 |
| | Regular/bad | 3562 |
| Existence of works on the road | Yes | 190 |
| | No | 7389 |
| Existence of light signals | Yes | 456 |
| | No | 7070 |
| Existence of pavement marks | Yes | 4854 |
| | No | 2912 |
| Vehicle variables | | |
| Median vehicle age | 11.72 ± 6.04 | 27,624 |
| Type of the vehicle(s) involved | Light passenger | 23,272 |
| | Motorcycles but not heavy vehicles | 2655 |
| | Heavy | 1964 |
| Driver variables | | |
| Age of the oldest driver (years) | ≤20 | 616 |
| | (20, 25] | 1311 |
| | (25, 30] | 1501 |
| | (30, 40] | 4415 |
| | (40, 55] | 8854 |
| | (55, 82] | 9465 |
| | >82 | 126 |
| ≥50% male drivers | Yes | 21,753 |
| | No | 4867 |
| Driver ran away from the RTA scene | Yes | 3987 |
| | No | 24,015 |
| Victim variables | | |
| Severity | Fatal | 163 |
| | Serious injuries | 404 |
| | Minor injuries | 5404 |
| | Property damage | 22,033 |
| Drivers injured or dead | No | 1291 |
| | Yes | 4678 |
| Number of passengers | 0 | 4288 |
| | ≥1 | 1651 |
| Number of pedestrians | 0 | 5345 |
| | ≥1 | 624 |
| Meteorological variables | | |
| Temperature (°C) | <17 or | 21,403 |
| | | 4965 |
| | | 1634 |
| Precipitation (mm/min) | 0.06 ± 0.73 | 28,002 |
| Wind velocity (m/s) | 1.16 ± 0.98 | 28,002 |
| Good weather | Yes | 24,327 |
| | No | 3627 |
| Time variables | | |
| Month | Feb./Mar./May/Jul./Sep. | 11,546 |
| | Oct. to Jan. | 9264 |
| | Apr./Jun./Aug. | 7192 |
| Day of the week | Thursday/Friday | 8454 |
| | Weekend | 7446 |
| | Monday to Wednesday | 12,102 |
| Hour of the day | 6–10 p.m. | 6961 |
| | 0–1 a.m./ 2 a.m. | 12,436 |
| | 2 a.m. | 3395 |
| | 4 a.m./ 5–7 a.m./ 11 a.m.–1 p.m./ 4–5 p.m./ 11 p.m. | 4781 |
| | 2–3 p.m. | 254 |
| | 8–10 a.m. | 175 |

| | Collision | | | Crash | | |
|---|---|---|---|---|---|---|
| Variable | Coef. | St. Err. | OR | Coef. | St. Err. | OR |
| Intercept | 1.28 *** | 0.33 | 3.36 | 2.22 *** | 0.34 | 8.13 |
| Municipality: ref. Setúbal/Alcochete/Seixal/Montijo | | | | | | |
| Alcácer do Sal | −0.68 *** | 0.21 | 0.50 | 0.80 *** | 0.21 | 2.19 |
| Sesimbra/Almada | −0.25 * | 0.11 | 0.78 | −0.51 *** | 0.12 | 0.60 |
| Palmela/Barreiro | −0.39 *** | 0.11 | 0.62 | −0.10 | 0.12 | 0.84 |
| Grândola | −0.93 *** | 0.18 | 0.36 | −0.03 | 0.19 | 0.94 |
| Santiago do Cacém | −1.39 *** | 0.13 | 0.25 | −0.89 *** | 0.14 | 0.42 |
| Accident location: ref. Inside urban area | | | | | | |
| Outside urban area | 0.10 | 0.12 | 1.18 | 0.98 *** | 0.12 | 2.80 |
| Occurred in a parking area: ref. No | | | | | | |
| Yes | 0.89 ** | 0.30 | 2.51 | −0.35 | 0.34 | 0.69 |
| Temperature °C: ref. or | | | | | | |
| | −0.28 *** | 0.09 | 0.70 | −0.23 * | 0.09 | 0.73 |
| | 1.00 *** | 0.27 | 2.81 | 0.88 ** | 0.28 | 2.34 |
| Good weather: ref. Yes | | | | | | |
| No | −0.01 | 0.11 | 0.93 | 0.92 *** | 0.11 | 2.32 |
| Day of the week: ref. Thursday/Friday | | | | | | |
| Weekend | 0.31 ** | 0.10 | 1.40 | 0.62 *** | 0.10 | 1.97 |
| Monday to Wednesday | 0.19 * | 0.08 | 1.20 | 0.23 ** | 0.09 | 1.29 |
| Month: ref. Feb./Mar./May/Jul./Sep. | | | | | | |
| Oct. to Jan. | −0.30 *** | 0.08 | 0.76 | −0.36 *** | 0.09 | 0.72 |
| Apr./Jun./Aug. | 0.23 * | 0.10 | 1.26 | 0.28 ** | 0.10 | 1.34 |
| Hour of the day: ref. 6–10 p.m. | | | | | | |
| 0–1 a.m./ 2 a.m. | 0.21 * | 0.08 | 1.14 | 0.58 *** | 0.09 | 1.81 |
| 2 a.m. | −1.09 *** | 0.26 | 0.33 | 0.22 | 0.27 | 0.99 |
| 4 a.m./ 5–7 a.m./ 11 a.m.–1 p.m./ 4–5 p.m./ 11 p.m. | −0.77 | 0.41 | 0.83 | 1.08 ** | 0.41 | 5.50 |
| 2–3 p.m. | 0.82 *** | 0.15 | 2.53 | 1.04 *** | 0.16 | 3.42 |
| 8–10 a.m. | 0.22 * | 0.10 | 1.17 | 0.24 * | 0.12 | 1.29 |
| Road layout: ref. Curve | | | | | | |
| Straight | −0.37 ** | 0.13 | 0.75 | −1.46 *** | 0.13 | 0.25 |
| Type of road: ref. Highway/bridge | | | | | | |
| EN | 0.57 *** | 0.15 | 1.75 | 0.003 | 0.15 | 0.98 |
| IC/IP | 0.57 ** | 0.18 | 1.82 | −0.12 | 0.19 | 0.92 |
| Other type | 0.71 *** | 0.16 | 2.14 | 0.17 | 0.16 | 1.24 |
| ≥50% male drivers: ref. No | | | | | | |
| Yes | 0.61 *** | 0.08 | 1.89 | −0.07 | 0.09 | 0.97 |
| Oldest driver: ref. years old | | | | | | |
| | 0.19 | 0.26 | 1.28 | −0.46 | 0.26 | 0.64 |
| | 0.57 * | 0.26 | 1.96 | −0.73 ** | 0.26 | 0.5 |
| | 0.69 ** | 0.24 | 2.11 | −1.25 *** | 0.24 | 0.30 |
| | 1.08 *** | 0.23 | 3.01 | −1.37 *** | 0.23 | 0.24 |
| | 1.49 *** | 0.23 | 4.91 | −1.34 *** | 0.23 | 0.27 |
| | 0.58 | 0.48 | 1.82 | −1.30 * | 0.51 | 0.26 |
| Type of vehicle: ref. Light passenger vehicle | | | | | | |
| Motorcycles but not heavy vehicles | 0.64 *** | 0.17 | 1.78 | 1.53 *** | 0.18 | 4.49 |
| Heavy vehicles | 1.02 *** | 0.21 | 2.73 | 0.52 * | 0.22 | 1.87 |
| Median vehicle age | 0.01 | 0.01 | 1.00 | 0.06 *** | 0.01 | 1.06 |
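
To help read the Coef., St. Err. and OR columns above: in a multinomial logit model the odds ratio of a covariate level is the exponential of its coefficient, and the usual significance check is a Wald z-test (coefficient divided by its standard error). The sketch below illustrates both. The star thresholds (0.05, 0.01, 0.001) are an assumed convention, since the table legend is not reproduced here, and odds ratios recomputed from the rounded coefficients need not match the published OR column exactly.

```python
import math


def odds_ratio(coef: float) -> float:
    """Odds ratio implied by a logit coefficient: OR = exp(coef)."""
    return math.exp(coef)


def wald_p_value(coef: float, std_err: float) -> float:
    """Two-sided p-value of the Wald z-test against a standard normal."""
    z = coef / std_err
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def significance_stars(p: float) -> str:
    """Assumed convention: *** p < 0.001, ** p < 0.01, * p < 0.05."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Usage with the collision column for Santiago do Cacem (Coef. -1.39, St. Err. 0.13).
coef, se = -1.39, 0.13
print(round(odds_ratio(coef), 2), significance_stars(wald_p_value(coef, se)))
```

An odds ratio below 1 means the outcome in that column is less likely, relative to the model's reference accident type, than for the reference category of the covariate; an odds ratio above 1 means it is more likely.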
ML Algorithms | ||||||
---|---|---|---|---|---|---|
Measure | Mult. | RF | SVM | Naive-Bayes | C5.0 | KNN |
Accuracy | 0.818 | 0.797 | 0.788 | 0.744 | 0.811 | 0.802 |
Sensitivity (run. over) | 0.000 | 0.006 | 0.000 | 0.118 | 0.006 | 0.011 |
Sensitivity (collision) | 0.964 | 0.931 | 0.958 | 0.842 | 0.964 | 0.978 |
Sensitivity (crash) | 0.369 | 0.382 | 0.219 | 0.450 | 0.318 | 0.208 |
Specificity (run. over) | 1.000 | 0.994 | 1.000 | 0.947 | 0.999 | 0.999 |
Specificity (collision) | 0.324 | 0.353 | 0.204 | 0.480 | 0.292 | 0.192 |
Specificity (crash) | 0.961 | 0.932 | 0.955 | 0.889 | 0.959 | 0.975 |
Balanced Acc. Weigh. | 0.818 | 0.797 | 0.788 | 0.744 | 0.811 | 0.801 |
Macro F1 | - | 0.457 | - | 0.469 | 0.519 | 0.511 |
MCC | 0.391 | 0.336 | 0.240 | 0.298 | 0.351 | 0.286 |
Cohen Kappa | 0.354 | 0.321 | 0.203 | 0.398 | 0.312 | 0.223 |
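
The per-class rows above follow from a three-class confusion matrix. The sketch below shows one common way to derive sensitivity, specificity and macro F1, and suggests a plausible reason for the "-" entries in the Macro F1 row: when a class is never predicted, its precision is 0/0, so its F1 and the macro average are undefined. The counts are hypothetical and are not the study's data; the function names are illustrative.

```python
from typing import List, Optional, Tuple

Matrix = List[List[int]]  # cm[i][j]: cases of true class i predicted as class j


def class_metrics(cm: Matrix, k: int) -> Tuple[Optional[float], Optional[float], Optional[float]]:
    """Return (sensitivity, specificity, F1) for class k; None marks a 0/0 ratio."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                    # true class k, predicted otherwise
    fp = sum(row[k] for row in cm) - tp     # predicted k, true class otherwise
    tn = total - tp - fn - fp
    sensitivity = tp / (tp + fn) if tp + fn else None
    specificity = tn / (tn + fp) if tn + fp else None
    precision = tp / (tp + fp) if tp + fp else None
    if precision is None or sensitivity is None or precision + sensitivity == 0:
        f1 = None
    else:
        f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, f1


def macro_f1(cm: Matrix) -> Optional[float]:
    """Unweighted mean of per-class F1; None if any class F1 is undefined."""
    scores = [class_metrics(cm, k)[2] for k in range(len(cm))]
    if any(s is None for s in scores):
        return None
    return sum(scores) / len(scores)


# Hypothetical counts, class order: running over, collision, crash.
# The first class is never predicted (its column is all zeros).
cm = [[0, 40, 10],
      [0, 900, 50],
      [0, 120, 80]]
print(class_metrics(cm, 0))  # (0.0, 1.0, None)
print(macro_f1(cm))          # None
```

In this toy matrix the never-predicted class shows sensitivity 0 and specificity 1, the same pattern as the columns whose Macro F1 is reported as "-".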
ML Algorithms | ||||||
---|---|---|---|---|---|---|
Measure | Mult. | RF | SVM | Naive-Bayes | C5.0 | KNN |
Accuracy | 0.578 | 0.881 | 0.570 | 0.523 | 0.858 | 0.666 |
Sensitivity (run. over) | 0.140 | 0.982 | 0.120 | 0.420 | 0.970 | 0.863 |
Sensitivity (collision) | 0.715 | 0.806 | 0.727 | 0.681 | 0.787 | 0.548 |
Sensitivity (crash) | 0.651 | 0.907 | 0.630 | 0.413 | 0.876 | 0.688 |
Specificity (run. over) | 0.960 | 0.978 | 0.960 | 0.806 | 0.973 | 0.840 |
Specificity (collision) | 0.586 | 0.940 | 0.558 | 0.591 | 0.919 | 0.850 |
Specificity (crash) | 0.759 | 0.891 | 0.776 | 0.871 | 0.880 | 0.807 |
Balanced Acc. Weigh. | 0.577 | 0.881 | 0.570 | 0.523 | 0.858 | 0.666 |
Macro F1 | 0.524 | 0.893 | 0.512 | 0.512 | 0.871 | 0.680 |
MCC | 0.316 | 0.816 | 0.305 | 0.270 | 0.780 | 0.501 |
Cohen Kappa | 0.306 | 0.814 | 0.293 | 0.262 | 0.779 | 0.493 |
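
The MCC and Cohen Kappa rows summarize a whole multi-class confusion matrix in a single number. A minimal sketch of the standard K-category formulas follows, assuming rows hold the true class and columns the predicted class. It is an illustration of the definitions, not the code used to produce the tables, and the example matrices are hypothetical.

```python
import math
from typing import List, Tuple


def mcc_and_kappa(cm: List[List[int]]) -> Tuple[float, float]:
    """Return (multi-class MCC, Cohen's kappa) for confusion matrix cm."""
    n_classes = len(cm)
    s = sum(sum(row) for row in cm)                       # total cases
    c = sum(cm[k][k] for k in range(n_classes))           # correctly classified
    t = [sum(row) for row in cm]                          # true counts per class
    p = [sum(row[k] for row in cm) for k in range(n_classes)]  # predicted counts
    sum_pt = sum(pk * tk for pk, tk in zip(p, t))
    covariance = c * s - sum_pt
    mcc_denominator = math.sqrt(
        (s * s - sum(pk * pk for pk in p)) * (s * s - sum(tk * tk for tk in t))
    )
    kappa_denominator = s * s - sum_pt
    # Both measures are set to 0 when their denominator vanishes,
    # for example when every case is assigned to a single class.
    mcc = covariance / mcc_denominator if mcc_denominator else 0.0
    kappa = covariance / kappa_denominator if kappa_denominator else 0.0
    return mcc, kappa


# Usage with hypothetical matrices: perfect agreement, then a classifier
# that assigns every case to the second class.
print(mcc_and_kappa([[5, 0, 0], [0, 7, 0], [0, 0, 3]]))
print(mcc_and_kappa([[0, 5, 0], [0, 7, 0], [0, 3, 0]]))
```

Both measures share the same numerator and differ only in how they normalize it, which is why they tend to rank classifiers similarly while taking different values on imbalanced data.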
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Infante, P.; Jacinto, G.; Afonso, A.; Rego, L.; Nogueira, P.; Silva, M.; Nogueira, V.; Saias, J.; Quaresma, P.; Santos, D.; et al. Factors That Influence the Type of Road Traffic Accidents: A Case Study in a District of Portugal. Sustainability 2023, 15, 2352. https://doi.org/10.3390/su15032352