Abstract
Subsequence matching is one of the most important issues in the field of data mining. The existing subsequence matching algorithms use windows of the fixed size to construct only one index. The algorithms have a problem that their performance gets worse as the difference between the query sequence length and the window size increases. In this paper, we propose a new subsequence matching method based on index interpolation, which is a technique that constructs the indexes for multiple window sizes and chooses an index most appropriate for a given query sequence for subsequence matching.We first examine the performance change due to the window size effect through preliminary experiments, and devise a cost function for subsequence matching that reflects the distribution of query sequence lengths in the view point of physical database design. Next, we propose a new subsequence matching method to improve search performance, and present an algorithm based on the cost function to construct the multiple indexes to maximize the performance. Finally, we verify the superiority of the proposed method through a series of experiments using the real and the synthetic data sequences.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Agrawal, R., et al.: Efficient Similarity Search in Sequence DataBases. In: Proc. Int’l Conf. on Foundations of Data Organization and Algorithms (FODO), Chicago, Illinois, October 1993, pp. 69–84 (1993)
Agrawal, R., et al.: Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Database. In: Proc. Int’l Conf. on Very Large Data Bases (VLDB), Zurich, Switzerland, September 1995, pp. 490–501 (1995)
Chan, K.P., Fu, A.W.C.: Efficient Time Series Matching by Wavelets. In: Proc. Int’l Conf. on Data Engineering (ICDE), Sydney, Australia, March 1999, pp. 126–133. IEEE, Los Alamitos (1999)
Faloutsos, C., et al.: Fast Subsequence Matching in Time-series Databases. In: Proc. Int’l Conf. on Management of Data, ACM SIGMOD, Minneapolis, Minnesota, May 1994, pp. 419–429 (1994)
Loh, W.K., et al.: A Subsequence Matching Algorithm that Supports Normalization Transform in Time-Series Databases. Data Mining and Knowledge Discovery 9(1), 5–28 (2004)
Moon, Y.S., et al.: Duality-Based Subsequence Matching in Time-Series Databases. In: Proc. Int’l Conf. on Data Engineering (ICDE), pp. 263–272. IEEE, Heidelberg (2001)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Koh, HG., Loh, WK., Kim, SW. (2005). An Efficient Subsequence Matching Method Based on Index Interpolation. In: Ali, M., Esposito, F. (eds) Innovations in Applied Artificial Intelligence. IEA/AIE 2005. Lecture Notes in Computer Science(), vol 3533. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11504894_66
Download citation
DOI: https://doi.org/10.1007/11504894_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26551-1
Online ISBN: 978-3-540-31893-4
eBook Packages: Computer ScienceComputer Science (R0)