Traditionally, confidence is defined for a rule A → B as the
percentage of instances that contain A that also contain B. This is
usually calculated as S(AB) / S(A), where S is support.
Assume a data set that contains a single time sequence, {a:b:a:a:a:a}.
Consider a rule A → B where A and B each contain one event item,
a(0:1) and b(2:3) respectively. The event item a begins at time 0 and
ends at time 1; the event item b begins at time 2 and ends at time 3.
Since there is one instance in the data set and it contains the
itemset {A, B} as described, the support of each of the itemsets {A},
{B}, and {A, B} is 1. If support were used to calculate the confidence
of the rule A → B, the result would be 1. This implies that, in the
data set from which the rule was mined, b follows a 100 percent of the
time a appears. Looking at the time sequence, however, only one of the
five occurrences of a (20 percent) is followed by b.
Instead of support, use event weight. The confidence of a rule
containing event items in both the antecedent and the consequent is
defined here as EV(AB) / EV(A), where EV is event weight. The event
weight of the itemset {A} is 5, of {B} is 1, and of {A, B} is 1. The
confidence for A → B is therefore 1 / 5 = 0.2, or 20 percent, which
represents the data set more accurately.
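
The contrast between the two measures can be illustrated with a minimal
Python sketch. It rests on assumptions that are not stated in the text:
event weight is taken to be the number of occurrences of an event item
across all sequences, and the weight of {A, B} is taken to be the number
of occurrences of a that are immediately followed by b. The function
names support and event_weight are hypothetical, and the timing
information in a(0:1) and b(2:3) is reduced to adjacency in the sequence.

```python
def support(sequences, antecedent, consequent=None):
    """Count the sequences that contain the pattern at least once.

    Assumption: "A followed by B" means the antecedent item is
    immediately followed by the consequent item.
    """
    count = 0
    for seq in sequences:
        if consequent is None:
            hit = antecedent in seq
        else:
            hit = any(x == antecedent and y == consequent
                      for x, y in zip(seq, seq[1:]))
        count += int(hit)
    return count


def event_weight(sequences, antecedent, consequent=None):
    """Count every occurrence of the pattern across all sequences.

    Assumption: this occurrence count stands in for the event weight
    used in the text; the text's own definition may differ.
    """
    total = 0
    for seq in sequences:
        if consequent is None:
            total += seq.count(antecedent)
        else:
            total += sum(1 for x, y in zip(seq, seq[1:])
                         if x == antecedent and y == consequent)
    return total


# The single time sequence from the example: {a:b:a:a:a:a}
data = [["a", "b", "a", "a", "a", "a"]]

# Support-based confidence: S(AB) / S(A) = 1 / 1
support_conf = support(data, "a", "b") / support(data, "a")

# Event-weight confidence: EV(AB) / EV(A) = 1 / 5
weight_conf = event_weight(data, "a", "b") / event_weight(data, "a")

print(support_conf, weight_conf)
```

Under these assumptions the sketch reproduces the figures in the
example: support counts the single sequence once no matter how often a
occurs in it, while the occurrence count separates the five appearances
of a from the one that is followed by b.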