Abstract
Software developers typically insert logging statements in their source code to record runtime information. However, providing proper logging statements remains a challenging task. Prior approaches automatically enhance logging statements, as a post-implementation process. Such automatic approaches do not take into account developers’ domain knowledge; nevertheless, developers usually need to carefully design the logging statements since logs are a rich source about the field operation of a software system. The goals of this paper include: i) understanding the reasons for log changes; and ii) proposing an approach that can provide developers with log change suggestions as soon as they commit a code change, which we refer to as “just-in-time” suggestions for log changes. In particular, we derive a set of measures based on manually examining the reasons for log changes and our experiences. We use these measures as explanatory variables in random forest classifiers to model whether a code commit requires log changes. These classifiers can provide just-in-time suggestions for log changes. We perform a case study on four open source projects: Hadoop, Directory Server, Commons HttpClient, and Qpid. We find that: (i) The reasons for log changes can be grouped along four categories: block change, log improvement, dependence-driven change, and logging issue; (ii) our random forest classifiers can effectively suggest whether a log change is needed: the classifiers that are trained from within-project data achieve a balanced accuracy of 0.76 to 0.82, and the classifiers that are trained from cross-project data achieve a balanced accuracy of 0.76 to 0.80; (iii) the characteristics of code changes in a particular commit and the current snapshot of the source code are the most influential factors for determining the likelihood of a log change in a commit.
Similar content being viewed by others
Notes
Splunk. http://www.splunk.com/
XpoLog. http://www.xpolog.com/
Logstash. http://logstash.net/
References
Apache-Commons (2016) Apache commons logging user guide - best practices. http://commons.apache.org/proper/commons-logging/guide.html
Bitincka L, Ganapathi A, Sorkin S, Zhang S (2010) Optimizing data analysis with a semi-structured time series database. In: Proceedings of the 2010 Workshop on Managing Systems via Log Analysis and Machine Learning Techniques, SLAML’10, pp 7–7
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Breiman L (2002) Manual on setting up, using, and understanding random forests v3.1. http://oz.berkeley.edu/users/breiman/Using_random_forests_V3
Cohen I, Goldszmidt M, Kelly T, Symons J, Chase JS (2004) Correlating instrumentation data to system states: A building block for automated diagnosis and control. In: Proceedings of the 6th Conference on Symposium on Opearting Systems Design & Implementation - Volume 6, OSDI’ 04, pp 16–16
D’Ambros M, Lanza M, Robbes R (2012) Evaluating defect prediction approaches: a benchmark and an extensive comparison. Empirical Software Engineering 17(4-5):531–577
Efron B (1979) Bootstrap methods: another look at the jackknife. Ann Stat 7 (1):1–26
Fu Q, Lou JG, Wang Y, Li J (2009) Execution anomaly detection in distributed systems through unstructured log analysis. In: Proceedings of the 2009 Ninth IEEE International Conference on Data Mining, ICDM ’09, pp 149–158
Fu Q, Lou JG, Lin Q, Ding R, Zhang D, Xie T (2013) Contextual analysis of program logs for understanding system behaviors. In: Proceedings of the 10th Working Conference on Mining Software Repositories, MSR ’13, pp 397–400
Fu Q, Zhu J, Hu W, Lou JG, Ding R, Lin Q, Zhang D, Xie T (2014) Where do developers log? An empirical study on logging practices in industry. In: Companion Proceedings of the 36th International Conference on Software Engineering, ICSE Companion ’14, pp 24–33
Fukushima T, Kamei Y, McIntosh S, Yamashita K, Ubayashi N (2014) An empirical study of just-in-time defect prediction using cross-project models. In: Proceedings of the 11th Working Conference on Mining Software Repositories, MSR, vol 2014, pp 172–181
Ghotra B, McIntosh S, Hassan AE (2015) Revisiting the impact of classification techniques on the performance of defect prediction models. In: Proceedings of the 37th International Conference on Software Engineering - Volume 1, ICSE ’15, pp 789–800
Glerum K, Kinshumann K, Greenberg S, Aul G, Orgovan V, Nichols G, Grant D, Loihle G, Hunt G (2009) Debugging in the (very) large: Ten years of implementation and experience. In: Proceedings of the ACM SIGOPS 22Nd Symposium on Operating Systems Principles, SOSP ’09, pp 103–116
Gülcü C, Stark S (2003) The complete log4j manual. QOS.CH, Lausanne, Switzerland
Jelihovschi EG, Faria JC, Allaman IB (2014) Scottknott: A package for performing the scott-knott clustering algorithm in R. Trends in Applied and Computational Mathematics 15(1):3–17
Kabinna S, Bezemer CP, Shang W, Hassan AE (2016) Logging library migrations: a case study for the apache software foundation projects. In: Proceedings of the 13th International Conference on Mining Software Repositories, MSR ’16, pp 154–164
Kamei Y, Shihab E, Adams B, Hassan AE, Mockus A, Sinha A, Ubayashi N (2013) A large-scale empirical study of just-in-time quality assurance. IEEE Transactions on Software Engineering 39(6):757–773
Kamei Y, Fukushima T, McIntosh S, Yamashita K, Ubayashi N, Hassan AE (2016) Studying just-in-time defect prediction using cross-project models. Empir Softw Eng 21(5):2072–2106
Kavulya S, Tan J, Gandhi R, Narasimhan P (2010) An analysis of traces from a production mapreduce cluster. In: Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, CCGRID ’10, pp 94–103
Levenshtein VI (1966) Binary codes capable of correcting deletions, insertions, and reversals. Soviet physics doklady 10(8):707–710
Liaw A, Wiener M (2002) Classification and regression by randomforest. R news 2(3):18–22
Mariani L, Pastore F (2008) Automated identification of failure causes in system logs. In: Proceedings of the 2008 19th International Symposium on Software Reliability Engineering, ISSRE ’08, pp 117–126
Mariani L, Pastore F, Pezze M (2009) A toolset for automated failure analysis. In: Proceedings of the 31st International Conference on Software Engineering, ICSE ’09, pp 563–566
McIntosh S, Kamei Y, Adams B, Hassan AE (2014) The impact of code review coverage and code review participation on software quality: A case study of the qt, vtk, and itk projects. In: Proceedings of the 11th Working Conference on Mining Software Repositories, MSR ’14, pp 192–201
Microsoft-MSDN (2016) Logging an exception. https://msdn.microsoft.com/en-us/library/ff664711(v=pandp.50).aspx
Nagappan N, Ball T (2007) Using software dependencies and churn metrics to predict field failures: An empirical case study. In: Proceedings of the First International Symposium on Empirical Software Engineering and Measurement, ESEM ’07, pp 364–373
Nagappan N, Ball T, Zeller A (2006) Mining metrics to predict component failures. In: Proceedings of the 28th International Conference on Software Engineering, ICSE ’06, pp 452–461
Oliner A, Ganapathi A, Xu W (2012) Advances and challenges in log analysis. Commun ACM 55(2):55–61
Scott A, Knott M (1974) A cluster analysis method for grouping means in the analysis of variance. Biometrics 30(3):507–512
Shang W, Jiang ZM, Adams B, Hassan AE, Godfrey MW, Nasser M, Flora P (2011) An exploratory study of the evolution of communicated information about the execution of large software systems. In: Proceedings of the 18th Working Conference on Reverse Engineering, WCRE ’11, pp 335–344
Shang W, Jiang ZM, Adams B, Hassan AE, Godfrey MW, Nasser M, Flora P (2014a) An exploratory study of the evolution of communicated information about the execution of large software systems. J Soft: Evolution and Process 26(1):3–26
Shang W, Nagappan M, Hassan AE, Jiang ZM (2014b) Understanding log lines using development knowledge. In: Proceedings of the 30th IEEE International Conference on Software Maintenance and Evolution, ICSME ’14, pp 21–30
Shang W, Nagappan M, Hassan AE (2015) Studying the relationship between logging characteristics and the code quality of platform software. Empirical Softw Engg 20(1):1–27
Sharma B, Chudnovsky V, Hellerstein JL, Rifaat R, Das CR (2011) Modeling and synthesizing task placement constraints in google compute clusters. In: Proceedings of the 2Nd ACM Symposium on Cloud Computing, SOCC ’11, pp 3:1–3:14
Syer MD, Jiang ZM, Nagappan M, Hassan AE, Nasser M, Flora P (2013) Leveraging performance counters and execution logs to diagnose memory-related performance issues. In: Proceedings of the 29th IEEE International Conference on Software Maintenance, ICSM ’13:, pp 110–119
Tantithamthavorn C, McIntosh S, Hassan A, Matsumoto K (2016) An empirical comparison of model validation techniques for defect prediction models. IEEE Trans Softw Eng PP(99):1–1
Tourani P, Adams B (2016) The impact of human discussions on just-in-time quality assurance: An empirical study on openstack and eclipse. In: Proceedings of the 23rd International Conference on Software Analysis, Evolution, and Reengineering, SANER ’16, pp 189–200
Xu W, Huang L, Fox A, Patterson D, Jordan MI (2009) Detecting large-scale system problems by mining console logs. In: Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, SOSP ’09, pp 117–132
Yuan D, Mai H, Xiong W, Tan L, Zhou Y, Pasupathy S (2010) Sherlog: Error diagnosis by connecting clues from run-time logs. In: Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS, vol XV, pp 143–154
Yuan D, Zheng J, Park S, Zhou Y, Savage S (2011) Improving software diagnosability via log enhancement. In: Proceedings of the 16th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS, vol XVI, pp 3–14
Yuan D, Park S, Huang P, Liu Y, Lee MM, Tang X, Zhou Y, Savage S (2012a) Be conservative: Enhancing failure diagnosis with proactive logging. In: Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation, OSDI ’12, vol 12, pp 293–306
Yuan D, Park S, Zhou Y (2012b) Characterizing logging practices in open-source software. In: Proceedings of the 34th International Conference on Software Engineering, ICSE ’12, pp 102–112
Zhang S, Cohen I, Symons J, Fox A (2005) Ensembles of models for automated diagnosis of system performance problems. In: Proceedings of the 2005 International Conference on Dependable Systems and Networks, DSN ’05, pp 644–653
Zhu J, He P, Fu Q, Zhang H, Lyu MR, Zhang D (2015) Learning to log: Helping developers make informed logging decisions. In: Proceedings of the 37th International Conference on Software Engineering - Volume 1, ICSE ’15, pp 415–425
Zimmermann T, Weisgerber P, Diehl S, Zeller A (2004) Mining version histories to guide software changes. In: Proceedings of the 26th International Conference on Software Engineering, ICSE ’04, pp 563–572
Author information
Authors and Affiliations
Corresponding author
Additional information
Communicated by: Arie van Deursen
Rights and permissions
About this article
Cite this article
Li, H., Shang, W., Zou, Y. et al. Towards just-in-time suggestions for log changes. Empir Software Eng 22, 1831–1865 (2017). https://doi.org/10.1007/s10664-016-9467-z
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10664-016-9467-z