Project 1: Concept Learning, Ordering

Exercise 2.2

- Give the sequence of S and G boundary sets computed by the
  Candidate-Elimination algorithm if it is given the sequence of
  training examples from Table 2.1 in reverse order.
Table 2.1: Training Examples

Example | Sky   | AirTemp | Humidity | Wind   | Water | Forecast | EnjoySport
--------|-------|---------|----------|--------|-------|----------|-----------
   1    | Sunny | Warm    | Normal   | Strong | Warm  | Same     | Yes
   2    | Sunny | Warm    | High     | Strong | Warm  | Same     | Yes
   3    | Rainy | Cold    | High     | Strong | Warm  | Change   | No
   4    | Sunny | Warm    | High     | Strong | Cool  | Change   | Yes
Initial State
S0: {< ∅, ∅, ∅, ∅, ∅, ∅ >}
G0: {< ?, ?, ?, ?, ?, ? >}

Training Example #4 (positive)
S1: {< Sunny, Warm, High, Strong, Cool, Change >}
G1: {< ?, ?, ?, ?, ?, ? >}

Training Example #3 (negative)
S2: {< Sunny, Warm, High, Strong, Cool, Change >}
G2: {< Sunny, ?, ?, ?, ?, ? >
     < ?, Warm, ?, ?, ?, ? >
     < ?, ?, ?, ?, Cool, ? >}

Training Example #2 (positive)
S3: {< Sunny, Warm, High, Strong, ?, ? >}
G3: {< Sunny, ?, ?, ?, ?, ? >
     < ?, Warm, ?, ?, ?, ? >}

Training Example #1 (positive)
S4: {< Sunny, Warm, ?, Strong, ?, ? >}
G4: {< Sunny, ?, ?, ?, ?, ? >
     < ?, Warm, ?, ?, ?, ? >}
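The trace can be reproduced with the following minimal Python sketch
(not part of the original exercise; the helper names such as
candidate_elimination and generalize are our own). The attribute domains
are assumed to contain only the values appearing in Table 2.1; Mitchell's
space also allows Sky = Cloudy, but that changes nothing here, because
such specializations are pruned against S.

    # Candidate-Elimination for the EnjoySport hypothesis space.
    # A hypothesis is a 6-tuple whose entries are an attribute value,
    # '?' (any value acceptable), or None (no value acceptable).

    def matches(h, x):
        """True if hypothesis h covers instance x."""
        return all(a == '?' or a == b for a, b in zip(h, x))

    def more_general(h1, h2):
        """True if h1 is more general than or equal to h2."""
        return all(a == '?' or a == b for a, b in zip(h1, h2))

    def generalize(s, x):
        """Minimally generalize s to cover the positive instance x."""
        return tuple(b if a is None else (a if a == b else '?')
                     for a, b in zip(s, x))

    def specializations(g, x, domains):
        """Minimal specializations of g that exclude negative instance x."""
        return [g[:i] + (v,) + g[i + 1:]
                for i, a in enumerate(g) if a == '?'
                for v in domains[i] if v != x[i]]

    def candidate_elimination(examples, domains):
        """Return the (S, G) boundary sets at each step of processing."""
        n = len(domains)
        S, G = [(None,) * n], [('?',) * n]
        trace = [(S, G)]
        for x, positive in examples:
            if positive:
                G = [g for g in G if matches(g, x)]          # prune G
                S = [h for h in (generalize(s, x) for s in S)
                     if any(more_general(g, h) for g in G)]  # generalize S
            else:
                S = [s for s in S if not matches(s, x)]      # prune S
                G = [h for g in G                            # specialize G
                     for h in ([g] if not matches(g, x) else
                               [sp for sp in specializations(g, x, domains)
                                if any(more_general(sp, s) for s in S)])]
                G = [g for g in G                  # keep only maximal members
                     if not any(h != g and more_general(h, g) for h in G)]
            trace.append((S, G))
        return trace

    # Table 2.1 in its original order; label True = Yes, False = No.
    examples = [
        (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), True),
        (('Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same'), True),
        (('Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change'), False),
        (('Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change'), True),
    ]
    domains = [sorted({x[i] for x, _ in examples}) for i in range(6)]

    # Present the examples in reverse order, as in the trace above.
    for step, (S, G) in enumerate(candidate_elimination(examples[::-1],
                                                        domains)):
        print(f'S{step}: {S}')
        print(f'G{step}: {G}')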
- Although the final version space will be the same regardless of the
  sequence of examples (why?), the S and G sets computed at intermediate
  stages will, of course, depend on this sequence.

The final version space is the same because, by definition, it contains
exactly those hypotheses consistent with the set of training examples
used. Consistency with a set of examples does not depend on the order in
which they are presented, so there is only one set of hypotheses
consistent with any given set of training examples, and the algorithm
must arrive at that one set regardless of the order.
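Reusing the candidate_elimination sketch above, this can be checked
directly: the forward and reversed presentations end with identical
boundary sets, and hence the same version space.

    S_fwd, G_fwd = candidate_elimination(examples, domains)[-1]
    S_rev, G_rev = candidate_elimination(examples[::-1], domains)[-1]
    assert set(S_fwd) == set(S_rev)   # same final S boundary
    assert set(G_fwd) == set(G_rev)   # same final G boundary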
- Can you come up with ideas for ordering the training examples to
  minimize the sum of the sizes of these intermediate S and G sets for
  the H used in the EnjoySport example?

The sum for the reverse order traced above is 14, counting the initial
state. Since every state contributes at least one hypothesis to S and one
to G, the minimum this sum can be for any set of training examples is
2 * (#examples + 1). For our four examples, the minimum would be
2 * 5 = 10.
Since the hypothesis representation in use can only generalize an
attribute to ?, there can be only one hypothesis in S at any one time.
Hypotheses are never added to S, because the hypothesis space is
restricted to conjunctive expressions; in other words, S cannot be
generalized with an or condition. Consider an attribute A with possible
values b, c, and d. If b and c were consistent with all positive
training examples but d was not, the hypothesis in S must represent A as
? instead of splitting into two hypotheses, one specifying b and the
other c. This is the inductive bias of the representation, which is what
allows it to classify instances not previously encountered.
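A tiny illustration of this collapse, using the generalize helper from
the sketch above on a hypothetical one-attribute space with values b, c,
and d:

    # After positives 'b' and 'c', S must hold ('?',), which necessarily
    # also covers the unseen value 'd'.
    s = (None,)
    for x in [('b',), ('c',)]:
        s = generalize(s, x)
    print(s)   # -> ('?',)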
This leads to considering how to minimize G. Hypotheses are only ever
added to G when a negative example is presented, as stated in the
algorithm [1]; a positive example can only remove members of G.
Therefore, a general strategy is to present all positive training
examples before any negative ones.
Following this strategy (for instance, presenting examples 1, 2, and 4
before example 3), we arrive at a sum of 11. This is the minimum for
this set of training examples, since in the final state G contains 2
hypotheses, one more than the absolute minimum calculated above.
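Again reusing the sketch above, the two orderings can be compared
directly:

    def boundary_sum(order):
        return sum(len(S) + len(G)
                   for S, G in candidate_elimination(order, domains))

    print(boundary_sum(examples[::-1]))                       # -> 14
    print(boundary_sum([examples[i] for i in (0, 1, 3, 2)]))  # -> 11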
[1] Tom M. Mitchell, Machine Learning, McGraw-Hill, 1997, Table 2.5,
p. 33.