Inferring extended finite state machine models from software executions

Walkinshaw, Neil; Taylor, Ramsay; Derrick, John

doi:10.1007/s10664-015-9367-7

Published: 17 March 2015

Empirical Software Engineering, volume 21, pages 811–853 (2016)

Abstract

The ability to reverse-engineer models of software behaviour is valuable for a wide range of software maintenance, validation and verification tasks. Current reverse-engineering techniques focus either on control-specific behaviour (e.g., in the form of Finite State Machines) or on data-specific behaviour (e.g., as pre-/post-conditions or invariants). However, typical software behaviour is usually a product of the two; models must combine both aspects to fully represent the software’s operation. Extended Finite State Machines (EFSMs) provide such a model. Although attempts have been made to infer EFSMs, these attempts have been problematic: the inferred models can be non-deterministic, and the inference algorithms can be inflexible and applicable only to traces with specific characteristics. This paper presents a novel EFSM inference technique that addresses the problems of inflexibility and non-determinism. It also adapts an experimental technique from the field of Machine Learning to evaluate EFSM inference techniques, and applies it to three diverse software systems.
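
To make the idea of an EFSM concrete, the following is a minimal illustrative sketch in Python, not the inference technique of the paper. It shows how a transition can combine a control aspect (an event label between states) with a data aspect (a guard over the event's data), and why overlapping guards make a model non-deterministic. The class names, the `walk` method, the pump controller example and its thresholds (30 and 10) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Data = Dict[str, float]           # data values attached to one event
Event = Tuple[str, Data]          # (label, data)


@dataclass(frozen=True)
class Transition:
    source: str
    label: str
    guard: Callable[[Data], bool]  # data condition under which the transition fires
    target: str


class EFSM:
    def __init__(self, initial: str, transitions: List[Transition]) -> None:
        self.initial = initial
        self.transitions = list(transitions)

    def enabled(self, state: str, label: str, data: Data) -> List[Transition]:
        # Guards are only evaluated for transitions matching state and label.
        return [t for t in self.transitions
                if t.source == state and t.label == label and t.guard(data)]

    def walk(self, trace: List[Event]) -> Optional[List[str]]:
        """Return the visited states, or None if the trace is rejected."""
        state = self.initial
        visited = [state]
        for label, data in trace:
            candidates = self.enabled(state, label, data)
            if not candidates:
                return None  # no transition is enabled: trace rejected
            if len(candidates) > 1:
                # Overlapping guards: the model is non-deterministic here.
                raise ValueError(f"non-deterministic choice in state {state!r}")
            state = candidates[0].target
            visited.append(state)
        return visited


# Hypothetical pump controller: switch on at high water, off at low water.
pump = EFSM("off", [
    Transition("off", "reading", lambda d: d["water"] >= 30, "on"),
    Transition("off", "reading", lambda d: d["water"] < 30, "off"),
    Transition("on", "reading", lambda d: d["water"] < 10, "off"),
    Transition("on", "reading", lambda d: d["water"] >= 10, "on"),
])

trace = [("reading", {"water": 5.0}), ("reading", {"water": 40.0}),
         ("reading", {"water": 20.0}), ("reading", {"water": 3.0})]
print(pump.walk(trace))
```

In this sketch the same `reading` event leads to different states depending on the water level, which a plain Finite State Machine over event labels alone could not express. Because the guards leaving each state are mutually exclusive, at most one transition is enabled for any input.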

Notes

https://bitbucket.org/nwalkinshaw/efsminferencetool
Here, k refers to the parameter that determines when we merge states, not to k-folds.
http://www.cs.le.ac.uk/people/nwalkinshaw/efsm/
http://sourceforge.net/projects/columba/
http://www.project-voldemort.com/voldemort/
http://www.jedit.org/
Socket timed out for higher values of k, and is therefore not included in this particular aspect of the discussion.

Acknowledgments

We thank Ivo Krka at the University of Southern California for providing us with some of the traces for the evaluation. Ramsay Taylor and John Derrick are supported by the EU FP7 PROWESS projects. Neil Walkinshaw is grateful for the support provided by the UK Ministry of Defence (MOD) through the BATS and HASTE projects. Information contained in this document should not be interpreted as representing the views of the MOD, nor should it be assumed that it reflects any current or future MOD policy. The information cannot supersede any statutory or contractual requirements or liabilities and is offered without prejudice or commitment.

Author information

Authors and Affiliations

Department of Computer Science, The University of Leicester, University Road, Leicester, LE1 7RH, UK
Neil Walkinshaw
Department of Computer Science, The University of Sheffield, 211 Portobello, Sheffield, S1 4DP, UK
Ramsay Taylor & John Derrick

Corresponding author

Correspondence to Neil Walkinshaw.

Additional information

Communicated by: Romain Robbes, Massimiliano Di Penta and Rocco Oliveto

Appendices

Appendix A: Mine Pump Model Inferred by GK-Tails

Appendix B: Full results

Table 2

About this article

Cite this article

Walkinshaw, N., Taylor, R. & Derrick, J. Inferring extended finite state machine models from software executions. Empir Software Eng 21, 811–853 (2016). https://doi.org/10.1007/s10664-015-9367-7

Issue Date: June 2016
