Apriori Sets And Sequences : ASAS Home : Documents : References : Temporal : Discovering Frequent Event Patterns with Multiple Granularities in Time Sequences-Notes

Temporal

[ Intro ] [ Discovering Calendar-based Temporal Association Rules.pdf ] [ Discovering Frequent Event Patterns with Multiple Granularities in Time Sequences-Notes ] [ Discovering Frequent Event Patterns with Multiple Granularities in Time Sequences.pdf ]

[ Up: References ]

Notes on:

      "Discovering Frequent Event Patterns with Multiple Granularities
      in Time Sequences",
      Claudio Bettini, X. Sean Wang, Sushil Jajodia, Jai-Ling Lin,
      IEEE Transactions on Knowledge and Data Engineering, Vol. 10,
      No. 2, March/April 1998.

Read 2003-11-05
      
Wang and Jajodia also authored  Discovering Calendar-based Temporal
Association Rules.

Diagrams and some discussion of this work can be seen in the
presentation Related located in the Document - Presentation section of
this web site.

What follows are notes taken from the paper.

"temporal patterns or relationships that involve multiple
granularities" By multiple granularities the authors mean that a
single sequence of events occurring in time using some measure, such as
seconds can be mapped to another measure such as minutes. The multiple
granularities define a view of the events through different measures
or units of time.

The authors define an "event structure" which is a template for
patterns of events. This template defines the temporal relationship
between events in the pattern, what the first occurring event is and
possibly what other events are. This structure contains variables for
the unspecified event types that may be used to satisfy the event
structure during mining.

An example from the paper:

   "For example, we can constrain two events to occur between four and
   six hours after the first but within the same business day."

The event structure is a user specified parameter to their algorithm.
The event structure deals with the time interval between events. The
events are instantaneous.

Candidate instantiation is the substitution of actual event types for
the event variables in an event structure.

An "event type" is defined simply as a descriptive string. 
An event type in ASAS may be comprised of two parts. The first is the
original sequence being referenced by the event type. The second is
the kind of event detected in the sequence. Events defined as simple a
descriptive string may also be used in ASAS.

An event is a pairing of an event type and a positive integer designated as a
time stamp. In ASAS and event is a paring of event type and an interval
defined by a begin and end time which may take any numeric form.

Confidence of a frequent pattern is defined as the percentage of
occurrences of the reference event, the first occurring event in the
event structure specified by the user, which are followed by the other
events specified in the candidate instantiation for a specific
pattern. This is similar to confidence in ASAS where the percentage of
antecedent occurrences forms the base of the percentage.

In the method for finding frequent patterns there are a number of
heuristics in use to prune the number of candidate instantiations that
have to be explicitly counted in the data set. Most of these pruning
heuristics are based on:

   "the observation that if a discovery problem has a solution, then
   part of this solution is a solution for a 'subproblem' of the
   considered one."

This strikes me as the same as the Apriori principle although the
authors do not mention it.

For the fourth heuristic the authors do say that the basic idea:

   "If a complex event type occurs frequently, then any of its subtype
   should also occur frequently."

is similar to "Discovering Frequent Episodes in Sequences" by Heikki
Mannila, Hannu Toivonen, and A. Inkeri Verkamo. This also strikes me
the same as the Apriori principle.

Note that the data set in this work is a single sequence of
events. This would be a single instance for ASAS. The example data set
described in this paper could be translated. In ASAS data sets where
instances that are not contiguous as in computer performance domain
this does not work. A translation to the author's system could not be made. I
think this algorithm could be modified to handle multiple instances though.

The authors note that a run time of five hours is usual. Considering
the input is a single instance and the machine the experiments were
run on was an Alpha Server with three 250 MHz processors and 2 GB of
ram this time is not impressive even by today's standards.

by: Keith A. Pray
Last Modified: July 4, 2004 7:20 AM