Abstract
Frequent episode discovery framework is a popular framework in temporal data mining with many applications. Over the years, many different notions of frequencies of episodes have been proposed along with different algorithms for episode discovery. In this paper, we present a unified view of all the apriori-based discovery methods for serial episodes under these different notions of frequencies. Specifically, we present a unified view of the various frequency counting algorithms. We propose a generic counting algorithm such that all current algorithms are special cases of it. This unified view allows one to gain insights into different frequencies, and we present quantitative relationships among different frequencies. Our unified view also helps in obtaining correctness proofs for various counting algorithms as we show here. It also aids in understanding and obtaining the anti-monotonicity properties satisfied by the various frequencies, the properties exploited by the candidate generation step of any apriori-based method. We also point out how our unified view of counting helps to consider generalization of the algorithm to count episodes with general partial orders.
Similar content being viewed by others
References
Achar A, Laxman S, Raajay V, Sastry PS (2009) Discovering general partial orders from event streams. Technical Report arXiv: 0902.1227v2 [cs.AI] http://arxiv.org, Dec 2009
Bouqata Bouchra, Caraothers Christopher D, Szymanski Boleslaw K, Zaki Mohammed J. (2006) Vogue: a novel variable order-gap state machine for modeling sequences. In: Proceedings of European conference principles and practice of knowledge discovery in databases (PKDD’06), pp 42–54, Sep 2006
Casas-Garriga G (2003) Discovering unbounded episodes in sequential data. In: Proceedings of European conference principles and practice of knowledge discovery in databases (PKDD’03), pp 83–94, Sep 2003
Das G, Fleischer R, Ga̧sieniec L, Gunopulos D, Kärkkäinen J (1997) Episode matching. In: Proceedings of annual symposium combinatorial pattern matching(CPM’97), pp 12–27, June–July 1997
Gwadera R, Atallah MJ, Szpankowski W (2003) Reliable detection of episodes in event sequences. In: Proceedings of IEEE international conference data mining (ICDM’03), pp 67–74, Nov 2003
Gwadera R, Atallah MJ, Szpankowski W (2005) Reliable detection of episodes in event sequences. Knowl Inf Syst 7(4): 415–437
Huang K-Y, Chang C-H (2008) Efficient mining of frequent episodes from complex sequences. Inf Syst 33(1): 96–114
Iwanuma K, Takano Y, Nabeshima H (2004) On anti-monotone frequency measures for extracting sequential patterns from a single very-long sequence. In: Proceedings of IEEE conference cybernetics and intelligent systems, pp 213–217, Dec 2004
Laxman S (2006) Discovering frequent episodes: fast algorithms, connections with HMMs and generalizations. PhD thesis, Bangalore, India, Sep 2006
Laxman S, Sastry PS, Unnikrishnan KP (2005) Discovering frequent episodes and learning hidden Markov models: a formal connection. IEEE Trans Knowl Data Eng 17(11): 1505–1517
Laxman S, Sastry PS, Unnikrishnan KP (2007) Discovering frequent generalized episodes when events persist for different durations. IEEE Trans Knowl Data Eng 19(9): 1188–1201
Laxman S, Sastry PS, Unnikrishnan KP (2007) A fast algorithm for finding frequent episodes in event streams. In: Proceedings of ACM SIGKDD international conference knowledge discovery and data mining (KDD’07), pp 410–419, Aug 2007
Laxman S, Tankasali V, White RW (2008) Stream prediction using a generative model based on frequent episodes in event sequences. In: Proceedings of ACM SIGKDD international conference knowledge discovery and data mining (KDD’08), pp 453–461, Jul 2008
Luo J, Bridges SM (2000) Mining fuzzy association rules and fuzzy frequent episodes for intrusion detection. Int J Intell Syst 15(8): 687–703
Karypis G, Joshi MV, Kumar V (1999) Universal formulation of sequential patterns. Technical Report TR 99-021, Department of Computer Science, University of Minnesota, Minneapolis
Mannila H, Toivonen H (1996) Discovering generalized episodes using minimal occurrences. In: Proceedings of international conference knowledge discovery and data mining (KDD’96), pp 146–151, Aug 1996
Mannila H, Toivonen H, Verkamo AI (1997) Discovery of frequent episodes in event sequences. Data Mining Knowl Discov 1(3): 259–289
Meger N, Rigotti C (2004) Constraint-based mining of episode rules and optimal window sizes. In: Proceedings of European conference principles and practice of knowledge discovery in databases (PKDD’04), Sep 2004
Morchen F (2007) Unsupervised pattern mining from symbolic temporal data. In: SIGKDD explorations, vol 9, pp 41–55, Jun 2007
Nag A, A Wai-Chee Fu (2003) Mining freqeunt episodes for relating financial events and stock trends. In: Proceedings of Pacific-Asia conference knowledge discovery and data mining, (PAKDD 2003), pp 27–39
Orlando S, Perego R, Silvestri C (2004) A new algorithm for gap constrained sequence mining. In: Proceedings of ACM symposium applied computing, pp 540–547, Mar 2004
Papapetrou P, Kollios G, Sclaroff S, Gunopulos D (2009) Mining frequent arrangements of temporal intervals. Knowl Inf Syst 21: 133–171
Patnaik D, Laxman S, Ramakrishnan N (2010) Discovering excitatory relationships using dynamic bayesian networks. Knowl Inf Syst, pp 1–31. doi:10.1007/s10115-010-0344-6
Patnaik D, Sastry PS, Unnikrishnan KP (2008) Inferring neuronal network connectivity from spike data: a temporal data mining approach. Sci Programm 16(1): 49–77
Sastry PS, Unnikrishnan KP (2010) Conditinal probability based significance tests for sequential patterns in multi-neuronal spike trains. Neural Comput 22(4): 1025–1059
Srikanth R, Agrawal R (1996) Mining sequential patterns: generalizations and performance improvements. In: Proceedings of international conference extending database technology (EDBT), Mar 1996
Tatti N (2008) Maximum entropy based significance of itemsets. Knowl Inf Syst 17(1): 57–77
Tatti N (2009) Significance of episodes based on minimal windows. In: Proceedings of IEEE international conference data mining (ICDM’09), Dec 2009
Unnikrishnan KP, Shadid BQ, Sastry PS, Laxman S (2009) Root cause diagnostics using temporal datamining. U.S.Patent no. 7509234, 24 Mar 2009
Wang M-F, Wu Y-C, Tsai M-F (2008) Exploiting frequent episodes in weighted suffix tree to improve intrusion detection system. In: Proceedings of international conference advanced information networking and applications(AINA’08), pp 1246–1252, Mar 2008
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Achar, A., Laxman, S. & Sastry, P.S. A unified view of the apriori-based algorithms for frequent episode discovery. Knowl Inf Syst 31, 223–250 (2012). https://doi.org/10.1007/s10115-011-0408-2
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10115-011-0408-2