
Project 5: Evaluating Hypotheses
Exercise 5.1
Suppose you test a hypothesis h and find that it commits
r = 300 errors on a sample S of n = 1000
randomly drawn test examples.

What is the standard deviation in $error_s(h)$?

$$error_s(h) = \frac{r}{n} = \frac{300}{1000} = 0.3$$

The variance in this estimate arises entirely from the variance in r, since n is a fixed constant. Because r is Binomially distributed, its variance is

$$\mathrm{Var}(r) = n p (1 - p)$$

Since p is unknown, substitute the estimate r / n = 0.3:

$$\mathrm{Var}(r) \approx 1000 \, (0.3)(1 - 0.3) = 210$$

The standard deviation of r is

$$\sigma_r = \sqrt{\mathrm{Var}(r)} = \sqrt{210} \approx 14.49$$

The standard deviation of $error_s(h)$ is

$$\sigma_{error_s(h)} = \frac{\sigma_r}{n} = \frac{14.49}{1000} = 0.01449$$
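The arithmetic above can be checked with a short Python sketch. The function name `sample_error_std` is hypothetical, chosen only for this illustration, and the code assumes the same plug-in estimate of p (r / n) used in the derivation.

```python
import math

def sample_error_std(r, n):
    """Estimated standard deviation of error_s(h) = r / n, assuming r ~ Binomial(n, p)."""
    p_hat = r / n                      # plug-in estimate of the unknown p
    var_r = n * p_hat * (1 - p_hat)    # binomial variance of the error count r
    return math.sqrt(var_r) / n        # dividing r by n scales the std by 1 / n

print(round(sample_error_std(300, 1000), 5))  # 0.01449
```

Dividing the standard deviation of r by n is equivalent to computing the square root of p(1 - p) / n directly.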


How does this compare to the standard deviation in the example
at the end of Section 5.3.4?
This is much smaller than the standard deviation of 0.07 in that example.
Even though error _{s} ( h ) is the same (0.3) in both cases, the standard
deviations differ because the example used only 40 test examples, while
this exercise (5.1) uses 1000.
Because the standard deviation shrinks in proportion to 1 / square root ( n ),
using 25 times as many test examples reduces it by a factor of 5.
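As a rough illustration of the effect of sample size, the sketch below evaluates the same binomial-based estimate for n = 40 and n = 1000 with the sample error held at 0.3. The helper `error_std` is a hypothetical name introduced here, not something defined in the text.

```python
import math

def error_std(p_hat, n):
    """Estimated standard deviation of a sample error p_hat measured on n examples."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

small = error_std(0.3, 40)     # the 40-example case from Section 5.3.4
large = error_std(0.3, 1000)   # this exercise

# Standard deviation for each sample size, and the ratio between them
print(round(small, 3), round(large, 5), round(small / large, 1))  # 0.072 0.01449 5.0
```

The value for n = 40 rounds to the 0.07 quoted for the example, and the ratio of 5 is the square root of 1000 / 40 = 25.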

