
Prof. Eamonn Keogh, Department of Computer Science and Engineering, University of California, Riverside

Algorithms and Representations for Mining Massive Collections of Time Series and Shapes

February 14, 2007 - Cambridge - Massachusetts - USA




This is one of the seminars organized by the Initiative in Innovative Computing @ Harvard.

Abstract

To date, the vast majority of research on time series and shape data mining has focused on similarity search and clustering. I believe that these problems should now be regarded as essentially solved. In particular, there are now fast exact techniques for searching and clustering patterns under both the Euclidean distance and Dynamic Time Warping, the two most useful distance measures. However, from a knowledge discovery viewpoint, there are much more interesting problems: the detection of previously unknown patterns and relationships in time series and shape databases. Two concrete examples are finding the most unusual objects (discord discovery) and finding repeated objects (motif discovery).
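To make the two problems named above concrete, here is a minimal brute-force sketch in Python. It is not the speaker's algorithm: it assumes fixed-length sliding windows, plain Euclidean distance, and a common working definition in which the discord is the window farthest from its nearest non-overlapping neighbor and the motif is the closest non-overlapping pair. The function names are hypothetical, and the quadratic cost is exactly what scalable methods aim to avoid.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbors(series, m):
    """Map each window start i to (distance, start j) of its nearest
    non-overlapping window of length m (hypothetical helper)."""
    starts = range(len(series) - m + 1)
    result = {}
    for i in starts:
        best, best_j = math.inf, None
        for j in starts:
            if abs(i - j) >= m:  # skip overlapping, trivial matches
                d = euclidean(series[i:i + m], series[j:j + m])
                if d < best:
                    best, best_j = d, j
        result[i] = (best, best_j)
    return result

def find_discord(series, m):
    """Start index of the window farthest from its nearest neighbor."""
    nn = nearest_neighbors(series, m)
    return max(nn, key=lambda i: nn[i][0])

def find_motif(series, m):
    """Start indices of the closest pair of non-overlapping windows."""
    nn = nearest_neighbors(series, m)
    i = min(nn, key=lambda i: nn[i][0])
    return i, nn[i][1]

# Toy data (made up for illustration): a repeating pattern with one spike.
series = [0, 1, 2, 1] * 6
series[13] = 9
discord_start = find_discord(series, 4)
motif_pair = find_motif(series, 4)
```

On this toy series the discord window covers the injected spike at index 13, while the motif is a pair of identical copies of the repeating pattern.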

While there are many representations that can be used to solve these problems (e.g. wavelets and Fourier methods), in this talk I argue that solutions which are scalable to massive datasets will require symbolic representations. The talk will be illustrated with examples from anthropology, law enforcement, biology and mining of historical texts.
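The abstract does not say which symbolic representation is meant. As one possible illustration, the sketch below assumes a SAX-style discretization: z-normalize the series, average it over equal-width frames, then map each frame mean to a letter using breakpoints that split a standard normal distribution into equiprobable regions. The function names, the alphabet, and the requirement that the series length be divisible by the number of frames are all assumptions made for brevity.

```python
import bisect
import statistics

def znormalize(series):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    if std == 0:
        return [0.0] * len(series)
    return [(x - mean) / std for x in series]

def frame_means(series, segments):
    """Mean of each of `segments` equal-width frames.
    Assumes len(series) is divisible by `segments`."""
    width = len(series) // segments
    return [statistics.fmean(series[k * width:(k + 1) * width])
            for k in range(segments)]

def symbolize(series, segments, alphabet="abcd"):
    """Turn a numeric series into a short string (hypothetical helper)."""
    n = len(alphabet)
    # Breakpoints cut the standard normal into n equiprobable regions.
    normal = statistics.NormalDist()
    breakpoints = [normal.inv_cdf(k / n) for k in range(1, n)]
    means = frame_means(znormalize(series), segments)
    return "".join(alphabet[bisect.bisect_left(breakpoints, v)]
                   for v in means)

# Made-up inputs for illustration.
step_word = symbolize([0, 0, 0, 0, 10, 10, 10, 10], segments=2)
ramp_word = symbolize([1, 2, 3, 4, 5, 6, 7, 8], segments=4)
```

Once a series is a string, it can be hashed, indexed, or compared with standard text-processing structures, which is one reason a symbolic form may scale better than a real-valued one.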

Some Pointers