Project 5: Evaluating Hypotheses


Exercise 5.1

Suppose you test a hypothesis h and find that it commits r = 300 errors on a sample S of n = 1000 randomly drawn test examples.


  1. What is the standard deviation in error_S(h)?

    error_S(h) = r / n
    = 300 / 1000
    = 0.3
    The variance in this estimate arises entirely from the variance in r, because n is a constant.
    Because r is Binomially distributed
    variance(r) = np(1 - p)
    Since p is unknown, substitute the estimate r / n = 0.3
    = 1000 (0.3)(1 - 0.3)
    = 210
    standard deviation(r)
    = square root(variance(r))
    = square root(210)
    = 14.49
    standard deviation(error_S(h))
    = standard deviation(r) / n
    = 14.49 / 1000
    = 0.01449
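
    The same arithmetic can be checked with a short Python sketch. The function name sample_error_std is an assumption made for illustration and is not part of the exercise. Like the derivation, it substitutes r / n for the unknown p.

```python
import math

def sample_error_std(r, n):
    """Estimate the standard deviation of error_S(h) = r / n.

    Assumes r is Binomially distributed and substitutes the
    observed r / n for the unknown true error rate p.
    """
    p_hat = r / n                      # sample error, r / n
    var_r = n * p_hat * (1 - p_hat)    # variance of r: n p (1 - p)
    std_r = math.sqrt(var_r)           # standard deviation of r
    return std_r / n                   # standard deviation of r / n

print(round(sample_error_std(300, 1000), 5))  # prints 0.01449
```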

  2. How does this compare to the standard deviation in the example at the end of Section 5.3.4?

    This is much smaller than the standard deviation of 0.07 in that example. Although error_S(h) is the same in both cases, the standard deviations differ because the example used only 40 test examples, whereas this problem (5.1) uses 1000. The standard deviation of error_S(h) shrinks in proportion to 1 / square root(n), so 25 times as many test examples gives a standard deviation about 5 times smaller.
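
    A minimal Python sketch of this comparison follows. It assumes the sample error is 0.3 in both cases and uses the form square root(p (1 - p) / n), which is algebraically the same as standard deviation(r) / n. The helper name std_of_sample_error is hypothetical.

```python
import math

def std_of_sample_error(error_rate, n):
    # Binomial approximation sqrt(p (1 - p) / n), with the
    # sample error standing in for the unknown true error p.
    return math.sqrt(error_rate * (1 - error_rate) / n)

std_small = std_of_sample_error(0.3, 40)     # 40 test examples (Section 5.3.4 example)
std_large = std_of_sample_error(0.3, 1000)   # 1000 test examples (this exercise)

# Show both standard deviations and how many times larger the first is.
print(round(std_small, 4), round(std_large, 4), round(std_small / std_large, 2))
```

    The ratio of the two values equals square root(1000 / 40), which is 5.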

 

by: Keith A. Pray
Last Modified: July 4, 2004 8:59 AM
© 2004 - 1975 Keith A. Pray.
All rights reserved.