Abstract
Traceability recovery captures trace links among different software artifacts (e.g., requirements and code) when two artifacts cover the same part of a system's functionality. These trace links provide important support for developers in software maintenance and evolution tasks. Information Retrieval (IR) is now the mainstream technique for semi-automatic approaches that recover candidate trace links based on textual similarities among artifacts. The performance of IR-based traceability recovery is evaluated by the ranking of relevant traces in the generated lists of candidate links. Unfortunately, this performance is greatly hindered by the vocabulary mismatch problem between different software artifacts. To address this issue, a growing body of enhancing strategies based on user feedback has been proposed to adjust the calculated IR values of candidate links after the user verifies part of these links. However, the improvement brought by this kind of strategy requires a large amount of user feedback, which could be infeasible in practice. In this paper, we propose to improve IR-based traceability recovery by propagating a small amount of user feedback through closeness analysis on call and data dependencies in the code. Specifically, our approach first iteratively asks users to verify a small set of candidate links. The collected frugal feedback is then combined with the quantified functional similarity for each code dependency (called closeness) and the generated IR values to improve the ranking of unverified links. An empirical evaluation based on nine real-world systems with three mainstream IR models shows that our approach can outperform five baseline approaches while using only a small amount of user feedback.
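The idea of propagating a few user verdicts through code dependencies can be illustrated with a minimal sketch. The abstract does not state how feedback, closeness, and IR values are composed, so the multiplicative update, the `bonus` weight, the function name `propagate_feedback`, and the sample data below are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def propagate_feedback(ir_scores, closeness, feedback, bonus=0.5):
    """Re-rank the unverified candidate links of one requirement.

    ir_scores: {code_element: IR similarity to the requirement}
    closeness: {(elem_a, elem_b): value in [0, 1]} per code dependency
    feedback:  {code_element: True/False} user verdicts on verified links
    bonus:     hypothetical weight of the propagated feedback (assumption)
    """
    # Treat dependencies as undirected for this sketch (assumption).
    neighbours = defaultdict(dict)
    for (a, b), value in closeness.items():
        neighbours[a][b] = value
        neighbours[b][a] = value

    adjusted = {}
    for elem, score in ir_scores.items():
        if elem in feedback:
            continue  # verified links are not re-ranked
        delta = 0.0
        for other, value in neighbours[elem].items():
            if other in feedback:
                # Confirmed neighbours raise the score, rejected ones lower it,
                # in proportion to the closeness of the dependency.
                delta += value if feedback[other] else -value
        adjusted[elem] = score * (1.0 + bonus * delta)

    # Highest adjusted value first.
    return sorted(adjusted.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: the user confirms the link to element "A".
ir_scores = {"A": 0.9, "B": 0.4, "C": 0.5, "D": 0.3}
closeness = {("A", "B"): 0.8, ("A", "C"): 0.1, ("D", "C"): 0.6}
ranked = propagate_feedback(ir_scores, closeness, {"A": True})
print([elem for elem, _ in ranked])  # ['B', 'C', 'D']
```

In this toy data, "B" starts below "C" on IR value alone but moves ahead after the single confirmation of "A", because its dependency on "A" has the higher closeness.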
Acknowledgements
This work is funded by the National Natural Science Foundation of China (Grant Nos. 61690204 and 61802173), the general program of the State Key Laboratory for Novel Software Technology (Grant No. ZZKT2021B05), the Collaborative Innovation Center of Novel Software Technology and Industrialization, the German Ministry of Education and Research (BMBF) grant 01IS16003B, DFG grant MA 5030/3-1, the Austrian Science Fund (FWF) grant no. P31989, and the Austrian COMET K1-Centre Pro2Future of the Austrian Research Promotion Agency (FFG) with funding from the Austrian ministries BMVIT and BMDW and the Province of Upper Austria.
Ethics declarations
Conflicts of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Communicated by: Federica Sarro and Foutse Khomh
This article belongs to the Topical Collection: International Conference on Program Comprehension (ICPC)
About this article
Cite this article
Gao, H., Kuang, H., Ma, X. et al. Propagating frugal user feedback through closeness of code dependencies to improve IR-based traceability recovery. Empir Software Eng 27, 41 (2022). https://doi.org/10.1007/s10664-021-10091-5
DOI: https://doi.org/10.1007/s10664-021-10091-5