An Online Algorithm for Segmenting Time Series
by Eamonn Keogh, Selina Chu, David Hart, and Michael Pazzani
ABSTRACT: In recent years, there has been an explosion of interest in
mining time series databases. As with most computer science problems, representation
of the data is the key to efficient and effective solutions. One of the
most commonly used representations is piecewise linear approximation. This
representation has been used by various researchers to support clustering,
classification, indexing and association rule mining of time series
data. A variety of algorithms have been proposed to obtain this
representation, and several of them have been independently
rediscovered more than once. In this paper, we undertake the first
extensive review and empirical comparison of all proposed techniques. We
show that all these algorithms have fatal flaws from a data mining
perspective. We introduce a novel algorithm that we empirically show to be
superior to all others in the literature.
Keywords: Time series, dimensionality reduction, segmentation, data mining.
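
For readers unfamiliar with piecewise linear approximation, the following is a minimal sketch of one generic way to obtain it: a greedy sliding-window pass that extends each segment until the straight line joining its endpoints no longer fits within an error budget. It is offered only as an illustration of the representation and is not the novel algorithm introduced in the paper. The function names, the `max_error` parameter, and the choice of sum of squared residuals as the error measure are all assumptions made for this example.

```python
def interpolation_error(series, start, end):
    """Sum of squared residuals when series[start..end] is replaced by the
    straight line joining its two endpoints (an assumed error measure)."""
    y0, y1 = series[start], series[end]
    slope = (y1 - y0) / (end - start)
    return sum(
        (series[i] - (y0 + slope * (i - start))) ** 2
        for i in range(start + 1, end)
    )


def sliding_window_segments(series, max_error):
    """Greedy sliding-window segmentation (illustrative only).

    Grows each segment one point at a time and closes it just before the
    interpolation error would exceed max_error. Returns a list of
    (start, end) index pairs; adjacent segments share an endpoint.
    """
    segments = []
    n = len(series)
    start = 0
    while start < n - 1:
        end = start + 1
        # Extend the segment while the next point still fits the budget.
        while end + 1 < n and interpolation_error(series, start, end + 1) <= max_error:
            end += 1
        segments.append((start, end))
        start = end
    return segments


if __name__ == "__main__":
    # A small triangular series: one rising line, then one falling line.
    print(sliding_window_segments([0, 1, 2, 3, 2, 1, 0], max_error=0.0))
```

With this input the sketch splits the series at its peak, so seven points are summarized by two line segments. A larger `max_error` gives fewer, coarser segments.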
(Available in PS and PDF formats.)