Slides

Normalization :

All numeric attribute values were rescaled to lie in the interval [0, 1].
The training and test data sets were normalized at the same time, so the same scaling applies to both and attribute values are consistent between them.
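
A minimal sketch of joint normalization follows. It assumes min-max scaling, which the text above does not name; it says only that values end up in [0, 1] and that both sets were normalized together. The function name min_max_normalize and the sample rows are hypothetical.

```python
def min_max_normalize(train_rows, test_rows):
    """Rescale each numeric column to [0, 1] using the min and max over both sets."""
    combined = train_rows + test_rows
    n_cols = len(combined[0])
    # Column-wise extremes are taken over training and test rows together.
    lows = [min(row[j] for row in combined) for j in range(n_cols)]
    highs = [max(row[j] for row in combined) for j in range(n_cols)]

    def scale(row):
        return [
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0
            for v, lo, hi in zip(row, lows, highs)
        ]

    return [scale(r) for r in train_rows], [scale(r) for r in test_rows]


# Hypothetical data: two numeric attributes per example.
train = [[2.0, 10.0], [4.0, 30.0]]
test = [[6.0, 20.0]]
norm_train, norm_test = min_max_normalize(train, test)
print(norm_train)  # [[0.0, 0.0], [0.5, 1.0]]
print(norm_test)   # [[1.0, 0.5]]
```

Because the extremes come from the combined data, a test value can never fall outside [0, 1], which would be possible if the scaling were computed from the training set alone.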

Discretization :

The same method used for Project 2: Decision Trees was used here. The only difference is that continuous attributes could be split into more than two bins.
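
The Project 2 method is not described on this slide, so the sketch below does not reproduce it. It uses plain equal-width binning as a stand-in, only to illustrate what splitting a continuous attribute into more than two bins looks like. The function name equal_width_bin and its parameters are hypothetical.

```python
def equal_width_bin(value, low, high, n_bins):
    """Return the index (0 .. n_bins - 1) of the equal-width bin containing value."""
    if high <= low:
        return 0  # degenerate range: everything goes in one bin
    width = (high - low) / n_bins
    index = int((value - low) / width)
    # Clamp so that value == high lands in the last bin, not one past it.
    return min(max(index, 0), n_bins - 1)


# A normalized attribute in [0, 1] split into four bins (stand-in technique).
bins = [equal_width_bin(v, 0.0, 1.0, 4) for v in [0.0, 0.3, 0.5, 0.99, 1.0]]
print(bins)  # [0, 1, 2, 3, 3]
```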

Missing Values :

These were simply skipped when generating the probabilities for the classifier. This is equivalent to making each missing value contribute a factor of 1 to the probability, in effect changing nothing.
The same approach was used when classifying the test examples.
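
The skipping described above can be sketched as follows for discrete attributes. This is an illustration under stated assumptions, not the project's code: the MISSING marker, the function names fit_counts and classify, and the sample data are hypothetical, and no smoothing is applied to the estimates.

```python
from collections import defaultdict

MISSING = None  # hypothetical marker for a missing attribute value


def fit_counts(rows, labels):
    """Count classes and per-attribute values, skipping missing values."""
    class_counts = defaultdict(int)
    # value_counts[(attribute_index, label)][value] -> count
    value_counts = defaultdict(lambda: defaultdict(int))
    for row, label in zip(rows, labels):
        class_counts[label] += 1
        for j, value in enumerate(row):
            if value is MISSING:
                continue  # a missing value adds nothing to the counts
            value_counts[(j, label)][value] += 1
    return class_counts, value_counts


def classify(row, class_counts, value_counts):
    """Return the class with the largest prior times product of likelihoods."""
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for j, value in enumerate(row):
            if value is MISSING:
                continue  # same as multiplying the score by 1
            seen = value_counts[(j, label)]
            known = sum(seen.values())  # examples with this attribute present
            score *= seen.get(value, 0) / known if known else 0.0
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data with one missing value in the second example.
rows = [["sunny", "hot"], ["sunny", MISSING], ["rainy", "cool"], ["rainy", "hot"]]
labels = ["no", "no", "yes", "yes"]
class_counts, value_counts = fit_counts(rows, labels)

print(classify(["sunny", MISSING], class_counts, value_counts))  # no
print(classify([MISSING, "cool"], class_counts, value_counts))   # yes
```

Each conditional probability is estimated only from the examples in which that attribute is present, so a missing value neither raises nor lowers any estimate.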

by: Keith A. Pray
Last Modified: January 10, 2007 3:22 PM

© 2007 - 1975 Keith A. Pray.
All rights reserved.