Abstract
Emerging public transit big data, such as smart card data, generally lack personal and economic attributes, and this gap needs to be addressed. Passengers in the public transit network come from different socioeconomic classes, and their trip attributes usually depend on their personal and economic attributes. For instance, age as a demographic attribute plays an important role in trip attributes: adolescent passengers travel to school, young professionals travel to work, and older passengers travel to medical facilities more often. Relations between the socioeconomic and trip attributes of passengers can be examined by developing a Bayesian network, which represents the relations between attributes as a directed acyclic graph, and by calculating the joint and conditional probability values in the graph. This study infers the socioeconomic attributes of public transit passengers from their trip attributes by developing a Bayesian network. The socioeconomic attributes considered are age, gender, and income; the trip attributes considered are start time and duration of the trip, stay duration, and available origin and destination land use types. First, potential structures of the Bayesian network are examined by comparing network scores and arc strength tests. After the network's parameters are learned, reasoning is carried out through both prediction and diagnosis in the network. The most likely combinations of the socioeconomic and trip attributes are also discovered. The case study for developing the Bayesian network is a Household Travel Survey dataset from Queensland, Australia, that contains both socioeconomic and trip attributes. The results show how the socioeconomic attributes can be inferred from the trip attributes. The discovered probability distributions can be used to enrich smart card datasets with socioeconomic attributes.
Moreover, a Bayesian classifier is applied to the dataset to validate the capability of the model in predicting the socioeconomic attributes. Finally, the developed network is applied to a set of smart card records to discuss potential applications.
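The kind of reasoning described above can be illustrated with a minimal Python sketch. It is not the network developed in the study: the three-node structure (age pointing to start time and to destination land use), the category names, and every probability value are hypothetical and chosen only for illustration. The sketch shows the joint probability given by the graph factorisation, diagnosis (inferring age from observed trip attributes), prediction (trip attributes given age), and the most likely attribute combination.

```python
from itertools import product

# Hypothetical toy network: age -> start_time, age -> land_use.
# All probability values are illustrative assumptions, not results of the study.
P_AGE = {"adolescent": 0.2, "adult": 0.6, "senior": 0.2}

P_START_GIVEN_AGE = {
    "adolescent": {"am_peak": 0.7, "off_peak": 0.3},
    "adult": {"am_peak": 0.6, "off_peak": 0.4},
    "senior": {"am_peak": 0.2, "off_peak": 0.8},
}

P_LANDUSE_GIVEN_AGE = {
    "adolescent": {"school": 0.7, "office": 0.1, "medical": 0.2},
    "adult": {"school": 0.1, "office": 0.7, "medical": 0.2},
    "senior": {"school": 0.05, "office": 0.15, "medical": 0.8},
}

START_STATES = ["am_peak", "off_peak"]
LANDUSE_STATES = ["school", "office", "medical"]


def joint(age, start, land_use):
    """Joint probability from the DAG factorisation P(a) * P(s|a) * P(l|a)."""
    return (
        P_AGE[age]
        * P_START_GIVEN_AGE[age][start]
        * P_LANDUSE_GIVEN_AGE[age][land_use]
    )


def posterior_age(start=None, land_use=None):
    """Diagnosis: P(age | observed trip attributes).

    Unobserved trip attributes (passed as None) are marginalised out.
    """
    starts = [start] if start is not None else START_STATES
    uses = [land_use] if land_use is not None else LANDUSE_STATES
    unnormalised = {
        age: sum(joint(age, s, u) for s, u in product(starts, uses))
        for age in P_AGE
    }
    total = sum(unnormalised.values())
    return {age: p / total for age, p in unnormalised.items()}


def predict_land_use(age):
    """Prediction: P(land_use | age), read directly from the conditional table."""
    return dict(P_LANDUSE_GIVEN_AGE[age])


def most_likely_combination():
    """Return the (age, start, land_use) triple with the highest joint probability."""
    return max(
        product(P_AGE, START_STATES, LANDUSE_STATES),
        key=lambda combo: joint(*combo),
    )


if __name__ == "__main__":
    print(posterior_age(start="off_peak", land_use="medical"))
    print(predict_land_use("adolescent"))
    print(most_likely_combination())
```

In a realistic setting the conditional tables would be estimated from survey records rather than typed in by hand, and the graph structure would be selected by comparing candidate networks, as the abstract describes.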
Funding
This study is not funded by any organization.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Faroqi, H., Mesbah, M. & Kim, J. Modelling socioeconomic attributes of public transit passengers. J Geogr Syst 22, 519–543 (2020). https://doi.org/10.1007/s10109-020-00328-0
DOI: https://doi.org/10.1007/s10109-020-00328-0