Keith A. Pray
Department of Computer Science
Worcester Polytechnic Institute
Worcester, MA 01609 USA
Advisor: Professor Carolina Ruiz
Reader: Professor Matt Ward
We introduce an algorithm for mining expressive temporal relationships from
complex temporal data. The algorithm, Apriori Sets And Sequences, extends
the Apriori algorithm introduced in [AS94]. It takes as input a data
set in which a single instance may contain many attributes with
values that are sequences of numeric or symbolic
literals. Furthermore, each value in a sequence occurs at a specific
time relative to the instance. All time sequence attributes in a
single instance share the same time line. These attributes are in
addition to traditional single-valued attributes. Apriori Sets
And Sequences finds sets of events and ordinary attribute-value pairs
that occur frequently in the data set. From these frequent sets, it builds
association rules that meet a specified minimum confidence.
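As a rough illustration of the support and confidence measures that govern which sets and rules are kept, the following minimal sketch treats each instance as a flat set of items. The item names (such as "bp_rising") and the helper functions support and confidence are hypothetical, chosen only for this example; the sketch shows the standard association-rule definitions, not the implementation of Apriori Sets And Sequences.

```python
from typing import FrozenSet, List

Instance = FrozenSet[str]


def support(itemset: FrozenSet[str], instances: List[Instance]) -> float:
    """Fraction of instances that contain every item in itemset."""
    hits = sum(1 for inst in instances if itemset <= inst)
    return hits / len(instances)


def confidence(antecedent: FrozenSet[str],
               consequent: FrozenSet[str],
               instances: List[Instance]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    ante = support(antecedent, instances)
    if ante == 0:
        return 0.0
    return support(antecedent | consequent, instances) / ante


# Hypothetical data: each instance mixes ordinary attribute-value pairs
# ("age=old") with events derived from time sequences ("bp_rising").
instances = [
    frozenset({"age=old", "bp_rising", "hr_high"}),
    frozenset({"age=old", "bp_rising"}),
    frozenset({"age=young", "hr_high"}),
    frozenset({"age=old", "bp_rising", "hr_high"}),
]

sup = support(frozenset({"age=old", "bp_rising"}), instances)
conf = confidence(frozenset({"bp_rising"}), frozenset({"hr_high"}), instances)
print(sup, conf)
```

With this toy data, the set {age=old, bp_rising} appears in three of four instances, and the rule bp_rising => hr_high holds in two of the three instances that contain bp_rising.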
The data sets described occur naturally in many domains including
health care, stock market analysis, complex system diagnostics, and
computer system performance. These are domains in which data is
collected or observed over time. The values collected can constitute
events, things that happen over a specific period of time. These events
can relate to each other in any of thirteen ways
(cite please cite me). Apriori Sets And Sequences produces rules that
express these temporal relationships and thereby describe the activity
observed in the data set.
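To make the thirteen possible relationships between two events concrete, the sketch below classifies a pair of time intervals. It assumes that the thirteen relationships are the interval relations commonly attributed to Allen, and that each event is a (start, end) pair with start < end; the relation names and the function relation are illustrative choices for this example, not taken from the text above.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end), assumed start < end


def relation(a: Interval, b: Interval) -> str:
    """Name the relation of interval a to interval b.

    Assumption: the names follow the usual labels for Allen's
    thirteen interval relations.
    """
    (a_s, a_e), (b_s, b_e) = a, b
    if a_e < b_s:
        return "before"
    if b_e < a_s:
        return "after"
    if a_e == b_s:
        return "meets"
    if b_e == a_s:
        return "met-by"
    if a_s == b_s and a_e == b_e:
        return "equals"
    if a_s == b_s:
        return "starts" if a_e < b_e else "started-by"
    if a_e == b_e:
        return "finishes" if a_s > b_s else "finished-by"
    if b_s < a_s and a_e < b_e:
        return "during"
    if a_s < b_s and b_e < a_e:
        return "contains"
    # Only proper partial overlap remains at this point.
    return "overlaps" if a_s < b_s else "overlapped-by"


print(relation((1, 5), (3, 8)))
print(relation((3, 5), (1, 8)))
```

Each pair of intervals falls into exactly one of the thirteen cases: six relations, their six inverses, and equality.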