
Keith A. Pray - Professional and Academic Site


Project 5: Evaluating Hypotheses


Exercise 5.1

Suppose you test a hypothesis h and find that it commits r = 300 errors on a sample S of n = 1000 randomly drawn test examples.

What is the standard deviation in error _s ( h )?

error _s ( h )	=	r / n
	=	300 / 1000
	=	0.3
The variance in this estimate arises entirely from the variance in r, since n is a constant. Because r is Binomially distributed:
variance ( r )	=	np ( 1 - p )
Since p is unknown, substitute the estimate r / n = 0.3 for p
	=	1000 ( 0.3 )( 1 - 0.3 )
	=	210
standard deviation ( r )
	=	square root ( variance ( r ) )
	=	square root ( 210 )
	=	14.49
standard deviation ( error _s ( h ) )
	=	standard deviation ( r ) / n
	=	14.49 / 1000
	=	0.01449
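
As a quick check on the arithmetic above, here is a minimal Python sketch of the same calculation. The function name sample_error_std is only an illustrative choice, and the code assumes, as the derivation does, that r / n can stand in for the unknown p.

```python
import math


def sample_error_std(r: int, n: int) -> float:
    """Estimated standard deviation of error_s(h) = r / n.

    Assumes r is Binomially distributed and substitutes the
    estimate r / n for the unknown error probability p.
    """
    p_hat = r / n                        # sample error, 0.3 in Exercise 5.1
    var_r = n * p_hat * (1 - p_hat)      # Binomial variance of r
    std_r = math.sqrt(var_r)             # standard deviation of r
    return std_r / n                     # scale from r to r / n


print(round(sample_error_std(300, 1000), 5))  # 0.01449
```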

How does this compare to the standard deviation in the example at the end of Section 5.3.4?

This is much smaller than the standard deviation of 0.07 in that example. Although error _s ( h ) is 0.3 in both cases, the standard deviations differ because the example used only 40 test examples, while this exercise (5.1) uses 1000. For a fixed sample error, the standard deviation shrinks in proportion to 1 / square root ( n ).
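
The comparison can be sketched in Python as follows. This is only an illustration: the helper name std_of_sample_error is hypothetical, and the sample error of 0.3 with n = 40 for the Section 5.3.4 example is taken from the description above rather than recomputed from the textbook.

```python
import math


def std_of_sample_error(p_hat: float, n: int) -> float:
    """Estimated standard deviation of the sample error.

    square root ( n p (1 - p) ) / n simplifies to
    square root ( p (1 - p) / n ).
    """
    return math.sqrt(p_hat * (1 - p_hat) / n)


small = std_of_sample_error(0.3, 40)     # Section 5.3.4 example, n = 40
large = std_of_sample_error(0.3, 1000)   # Exercise 5.1, n = 1000

# With the same sample error, the ratio depends only on the sample sizes.
ratio = small / large
print(small, large, ratio)
```

With the same sample error in both cases, the ratio of the two standard deviations is square root ( 1000 / 40 ) = 5, which matches 0.07 against 0.014.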

by: Keith A. Pray
Last Modified: July 4, 2004 8:59 AM
