Project 1: Concept Learning, Ordering
Exercise 2.1
-
Explain why the size of the hypothesis space in the EnjoySport
learning task is 973.
The hypothesis space cardinality is the number of different
combinations of the possible values for each attribute represented
in the hypothesis space.
The values in the hypothesis space for each attribute
include the specific values that can be assigned in addition to the "?"
(any value is acceptable) symbol and "Ø" (no value is
acceptable) symbol.
Since a "Ø" symbol in any attribute makes the hypothesis classify
every instance as negative, all hypotheses containing it are
equivalent, so we count them only once, as the "never" hypothesis.
| Attributes:                    | Sky | AirTemp | Humidity | Wind | Water | Forecast | Total |
| # Values:                      | 3   | * 2     | * 2      | * 2  | * 2   | * 2      | = 96  |
| + 1 ("?")                      | 4   | * 3     | * 3      | * 3  | * 3   | * 3      | = 972 |
| + 1 for the "never" hypothesis |     |         |          |      |       |          | + 1   |
| Total:                         |     |         |          |      |       |          | 973   |
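The count above can be checked mechanically. The following is a minimal sketch, not part of the original solution: it computes both totals from the per-attribute value counts, then confirms 973 by brute force, enumerating every syntactic hypothesis and counting how many distinct sets of instances they cover. The attribute value names (Sunny, Warm, and so on) are assumed from the textbook's EnjoySport example, and the function names are illustrative.

```python
from itertools import product
from math import prod

# Attribute domains; the value names are assumed from the textbook example,
# only the domain sizes (3, 2, 2, 2, 2, 2) come from the solution above.
DOMAINS = {
    "Sky": ["Sunny", "Cloudy", "Rainy"],
    "AirTemp": ["Warm", "Cold"],
    "Humidity": ["Normal", "High"],
    "Wind": ["Strong", "Weak"],
    "Water": ["Warm", "Cool"],
    "Forecast": ["Same", "Change"],
}


def count_instances(domains):
    """Number of distinct instances: product of the domain sizes."""
    return prod(len(values) for values in domains.values())


def count_hypotheses(domains):
    """Semantically distinct hypotheses: each attribute is a specific value
    or "?", plus a single "never" hypothesis for everything containing "Ø"."""
    return prod(len(values) + 1 for values in domains.values()) + 1


def brute_force_hypotheses(domains):
    """Enumerate every syntactic hypothesis (values, "?" and "Ø" per slot)
    and count the distinct sets of instances they classify as positive."""
    instances = list(product(*domains.values()))
    slots = [values + ["?", "Ø"] for values in domains.values()]
    extensions = set()
    for hypothesis in product(*slots):
        covered = frozenset(
            x for x in instances
            if all(c == "?" or c == v for c, v in zip(hypothesis, x))
        )
        extensions.add(covered)
    return len(extensions)


if __name__ == "__main__":
    print(count_instances(DOMAINS))         # 96
    print(count_hypotheses(DOMAINS))        # 973
    print(brute_force_hypotheses(DOMAINS))  # 973
```

The brute-force pass visits 5 * 4^5 = 5120 syntactic hypotheses; every one that contains "Ø" covers the empty set, which is why they collapse into one.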
-
How would the number of possible instances and possible hypotheses
increase with the addition of the attribute WaterCurrent,
which can take on the values Light, Moderate, or Strong?
Instances:
Multiply the earlier count by 3, the number of values the new
attribute can assume.
3^2 * 2^5 = 288
instances, up from 96.
Possible Hypotheses:
Multiply the earlier product by 4, the number of values the new
attribute can assume plus the "?" symbol, then add 1 for the
"never" hypothesis as before.
4^2 * 3^5 + 1 = 3889
possible hypotheses, up from 973.
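As a quick arithmetic check, the two totals for the extended task can be recomputed from the list of domain sizes. This is an illustrative sketch; the helper names `instances` and `hypotheses` are not from the original solution.

```python
from math import prod

# Domain sizes for Sky, AirTemp, Humidity, Wind, Water, Forecast.
base_sizes = [3, 2, 2, 2, 2, 2]
# WaterCurrent adds one attribute with 3 values (Light, Moderate, Strong).
extended_sizes = base_sizes + [3]


def instances(sizes):
    """Product of the domain sizes."""
    return prod(sizes)


def hypotheses(sizes):
    """Product of (size + 1) to allow "?", plus 1 for the "never" hypothesis."""
    return prod(s + 1 for s in sizes) + 1


if __name__ == "__main__":
    print(instances(base_sizes), instances(extended_sizes))    # 96 288
    print(hypotheses(base_sizes), hypotheses(extended_sizes))  # 973 3889
```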
-
More generally, how does the number of possible instances and
hypotheses grow with the addition of a new attribute A that
takes on k possible values?
Instances:
The new number of instances is simply:
current # * k
Possible Hypotheses:
This is a little more complicated, but not much.
( current # * ( k + 1 ) ) - k
The ( k + 1 ) takes into account the "?" symbol.
The ( - k ) removes the extra copies of the "never" hypothesis:
multiplying by ( k + 1 ) counts it k + 1 times, but it should be
counted only once.
Equivalently: ( ( current # - 1 ) * ( k + 1 ) ) + 1
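The update rule can be sketched as a small function and compared against the direct count. This is a hedged illustration; `add_attribute` and `direct_hypotheses` are hypothetical names introduced here.

```python
from math import prod


def add_attribute(num_instances, num_hypotheses, k):
    """Update both counts after adding an attribute with k possible values."""
    new_instances = num_instances * k
    # Multiplying by (k + 1) counts the "never" hypothesis k + 1 times,
    # so subtract k to leave exactly one copy.
    new_hypotheses = num_hypotheses * (k + 1) - k
    return new_instances, new_hypotheses


def direct_hypotheses(sizes):
    """Direct count: product of (size + 1), plus 1 for "never"."""
    return prod(s + 1 for s in sizes) + 1


if __name__ == "__main__":
    sizes = [3, 2, 2, 2, 2, 2]
    for k in range(1, 6):
        _, updated = add_attribute(prod(sizes), direct_hypotheses(sizes), k)
        # The incremental rule should agree with recounting from scratch.
        assert updated == direct_hypotheses(sizes + [k])
    print(add_attribute(96, 973, 3))  # (288, 3889)
```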