next up previous contents
Next: Pattern Matching Up: Background Previous: Association Rules   Contents

The Apriori Algorithm

In order to motivate the development of an algorithm that successfully mines association rules from time sequence attributes, it is helpful to examine how Apriori works and why it is unsuited for this data set. Apriori was introduced by [AS94] with an application to market basket analysis, and with a transactional perspective of the input data.

The Apriori algorithm builds frequent item sets as sets of items (an item corresponds to an attribute-value pair in the case of tabular data). It takes the data set as input, along with minimum support and minimum confidence thresholds. As an intermediate step, it produces frequent item sets, and its final output is a list of all association rules whose confidence and support are above the minimum thresholds. An item set is frequent if its support is greater than some user-specified minimum support. An item set of size k is called a k-item set.

The Apriori algorithm is based on the Apriori principle, which states that every subset of a frequent item set is also frequent. It first calculates the support of every individual item by counting the instances that contain the item. Apriori then finds larger and larger frequent item sets, step by step. The general form of the Apriori algorithm is successive iterations of these steps:


next up previous contents
Next: Pattern Matching Up: Background Previous: Association Rules   Contents
Keith A. Pray 2003-06-17