Project 1: Concept Learning, Ordering
Exercise 2.4
-
Consider the instance space consisting of integer points in
the x, y plane and the set of hypotheses H consisting
of rectangles.
More precisely, hypotheses are of the form
a <= x <= b,
c <= y <= d,
where a, b, c, and d can be any integers.
-
Consider the version space with respect to the set of
positive ( + ) and negative ( - ) training examples shown below.
What is the S boundary of the version space in this case?
Write out the hypotheses and draw them in on the diagram.
Please see attached sheet for drawing.
S:
{< 4 <= x <= 6,
3 <= y <= 5 >}
This assumes that a rectangle is at minimum 1 x 1.
It also assumes that a hypothesis is generalized by decreasing
a or c, or by increasing b or d, and specialized by the reverse.
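For illustration, S can be computed directly: it is the tightest
rectangle enclosing the positive examples. Below is a minimal Python
sketch; since the actual training examples are on the attached sheet,
the points used here are hypothetical ones chosen to be consistent
with the S boundary above.

    def s_boundary(positives):
        # Tightest rectangle a <= x <= b, c <= y <= d covering all positives.
        xs = [x for x, _ in positives]
        ys = [y for _, y in positives]
        return (min(xs), max(xs), min(ys), max(ys))

    # Hypothetical positive examples consistent with the S boundary above.
    print(s_boundary([(4, 3), (6, 5)]))  # -> (4, 6, 3, 5)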
-
What is the G boundary of this version space? Write out
the hypotheses and draw them in.
Please see attached sheet for drawing.
G:
{< 3 <= x <= 8,
2 <= y <= 7 > }
Note: my original answer was missing
{ < 2 <= x <= 8,
2 <= y <= 5 > }
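Because hypotheses are integer rectangles, the version space can also
be enumerated by brute force on a small grid, with S and G recovered
as its minimal and maximal members under the more-general-than
ordering. A sketch, again with hypothetical example points (the real
ones are on the attached sheet); the grid bounds stand in for "any
integer":

    from itertools import product

    def contains(h, p):
        a, b, c, d = h
        return a <= p[0] <= b and c <= p[1] <= d

    def consistent(h, positives, negatives):
        return (all(contains(h, p) for p in positives)
                and not any(contains(h, p) for p in negatives))

    def more_general(h1, h2):
        # h1's rectangle contains h2's rectangle.
        return (h1[0] <= h2[0] and h1[1] >= h2[1]
                and h1[2] <= h2[2] and h1[3] >= h2[3])

    def version_space(positives, negatives, lo=0, hi=10):
        vs = [h for h in product(range(lo, hi + 1), repeat=4)
              if h[0] <= h[1] and h[2] <= h[3]
              and consistent(h, positives, negatives)]
        # S: most specific members; G: most general members.
        s = [h for h in vs if not any(h2 != h and more_general(h, h2) for h2 in vs)]
        g = [h for h in vs if not any(h2 != h and more_general(h2, h) for h2 in vs)]
        return s, g

    # Hypothetical examples; substitute the points from the attached sheet.
    s, g = version_space([(4, 3), (6, 5)], [(2, 1), (9, 8)])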
-
Suppose the learner may now suggest a new x, y instance
and ask
the trainer for its classification.
Suggest a query guaranteed to reduce the size of the version space,
regardless of how the trainer classifies it. Suggest one that
will not.
The learner could request ( 7, 4 ) for classification. In fact,
any point in ( 4 <= x <= 7, y = 6 ) or ( x = 7, 3 <= y <= 6 )
will work, because the points along these two lines lie
between the version space bounds identified by S and G.
Since S and G must eventually converge on one hypothesis,
such a query forces S to generalize (if the trainer labels it
positive) or G to specialize (if negative).
Selecting ( 5, 4 ), ( 4, 5 ), ( 5, 5 ), ( 6, 3 ), or ( 6, 4 )
will not reduce the version space.
Since these points are already included by S, every hypothesis
in the version space classifies them as positive, so there will
be no change in the space. This is a result of the bias imposed
by the hypothesis representation.
Note: this only works with my original answer to the previous problem.
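This test can also be written down directly: a query is guaranteed
to shrink the version space exactly when the hypotheses in it
disagree on the point, i.e., the point lies outside some member of S
but inside some member of G. A small Python sketch, using the S and
G from my original answers above:

    def contains(h, p):
        a, b, c, d = h
        return a <= p[0] <= b and c <= p[1] <= d

    def informative(point, S, G):
        # All hypotheses say + iff the point is inside every S member;
        # all say - iff it is outside every G member. Disagreement
        # guarantees some hypotheses are eliminated either way.
        all_pos = all(contains(h, point) for h in S)
        all_neg = not any(contains(h, point) for h in G)
        return not (all_pos or all_neg)

    S = [(4, 6, 3, 5)]
    G = [(3, 8, 2, 7)]
    print(informative((7, 4), S, G))  # True: outside S but inside G
    print(informative((5, 4), S, G))  # False: already covered by S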
-
Now assume you are a teacher, attempting to teach a particular
target concept
(e.g., 3 <= x <= 5, 2 <= y <= 9 ).
What is the smallest number of training examples you can provide so
that the
Candidate-Elimination algorithm will perfectly
learn the target concept?
For the target concept to be learned exactly, G and S
must converge, hence both positive and negative examples must be given.
One pair of opposite corners of the concept rectangle should be given
as positive examples, forcing S to generalize to exactly the
target rectangle
(e.g., ( 3, 9 ) and ( 5, 2 ) ).
Negative examples must then specialize G until it equals S.
A corner point just outside the rectangle, such as ( 2, 1 ), is
not enough: it can be excluded by tightening either the x bound
or the y bound, so G branches into two minimal specializations
instead of converging. Each negative example can pin down only
one of the four bounds, so one negative is needed just beyond
each side of the rectangle, placed so that only a single minimal
specialization of G excludes it
(e.g., ( 2, 5 ), ( 6, 5 ), ( 4, 1 ), and ( 4, 10 ) ).
This requires that 6 training examples be given: 2 positive
and 4 negative.
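This convergence can be checked with the brute-force version_space()
sketch from the earlier part (the function and the 0..12 grid bound
are that sketch's assumptions): the two corner positives plus one
mid-side negative per edge collapse S and G to the single target
hypothesis.

    # Reuses version_space() from the brute-force sketch above.
    s, g = version_space([(3, 9), (5, 2)],
                         [(2, 5), (6, 5), (4, 1), (4, 10)],
                         lo=0, hi=12)
    assert s == g == [(3, 5, 2, 9)]  # converged on the target concept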