Temporal

Intro ] [ Data Mining Introduction and Advanced Topics-Notes ] [ Detecting Complex Dependencies in Categorical Data-Notes ] [ Detecting Complex Dependencies in Categorical Data.pdf ]

Up: References ]

Notes on:

      "Detecting Complex Dependencies in Categorical Data",
      Tim Oates, Matthew D. Schmill, Dawn E. Gregory, Paul R. Cohen,
      University of Massachusetts, Amherst, MA, 1995

The goal of this work was to find dependencies between tokens in
multiple streams. We can interpret tokens as instantaneous events. A
stream is a sequence of events produced over time.

A dependency takes the form A => t B with some probability.
A is a set of tokens that happen in the streams at time i.
B is a set of tokens that happen in the streams at time i+t.

A and B have a fixed size and t is fixed during a single run of their
algorithm. 

The search algorithm used to find dependencies takes advantage of an
observation resembling the Apriori Principle. Dependencies that are
more specific have the same or less number of occurrences in the data
streams. The results are not deterministic, there is no guarantee to
find all dependencies.

A test domain in which the authors run their algorithm is that of a
boat shipping network simulation. A demon which changes schedules in
order to avoid bottlenecks at ports and such based on expert domain
knowledge is replaced with a demon that uses rules produced from
dependencies found from data produced in a previous run of the
simulator. The rules proved better at avoiding bottlenecks than the
original demon. This application is somewhat akin to the original
motivating problem of ASAS.

The authors allude to work being done on an incremental version of
their algorithm that can learn rules as new data is made
available. They are also considering ways to remove the fixed set size
and t value from the current algorithm.

In regards to ASAS, the data set here can be translated in a single
instance. ASAS produces more expressive rules. ASAS produces all the
association rules fitting the criteria specified by the user.
 

by: Keith A. Pray
Last Modified: July 4, 2004 7:20 AM
© 2004 - 1975 Keith A. Pray.
All rights reserved.