
Summary of Results

      The most accurate Bayes Classifier built so far, as measured on the test data, was the one trained on the discretized data. The classifier and its results can be found at that link. (See attached sheet if this is a hard copy.) The accuracy achieved with the resulting classifier was 84.27%.

      This is the second best overall result compared to the other learning methods explored. The most successful was the Decision Tree built with C4.5 (which should have been ID3, so this is actually the best of the planned learning techniques so far) from Project 2.

      Compared to the Neural Network method used in Project 3, Naive Bayes appears much better: it has a significantly shorter training time and produced better initial results. I suspect that, given time to tune the Neural Network implementation for the Census-Income application, it could outperform the Naive Bayes Classifier, but at the expense of significantly longer training times. Another thing to consider is the arbitrary nature of neural networks trained with backpropagation: depending on where in the search space training begins, the resulting classifier can vary a great deal in accuracy. The Naive Bayes algorithm, on the other hand, produces the same classifier from the same training data.

      One of the main weaknesses of this system is the lack of tunable parameters for adapting the method to different applications. This is inherent in the simple nature of the algorithm, which consists mainly of assumptions and counting.
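
      To make the "assumptions and counting" concrete, here is a minimal sketch of the training step. The data layout (tuples of categorical attribute values plus a parallel list of labels) is an assumption for illustration, not the project's actual code.

    from collections import Counter, defaultdict

    def train_naive_bayes(examples, labels):
        # examples: list of tuples of categorical attribute values
        # labels:   parallel list of class labels
        class_counts = Counter(labels)
        # cond_counts[(attribute index, value, label)] = co-occurrence count
        cond_counts = defaultdict(int)
        for example, label in zip(examples, labels):
            for i, value in enumerate(example):
                cond_counts[(i, value, label)] += 1
        total = len(labels)
        # Class priors are just ratios of observed label counts.
        priors = {c: n / total for c, n in class_counts.items()}
        return priors, cond_counts, class_counts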

      This leads to experimenting with different methods of pre-processing the training and testing data sets. Discretization seemed to help the most. This is most likely due to, again, the counting nature of the system: with more attribute values to count over, the system was able to glean more information from the training data. My method of using the standard deviation and mean to handle continuous values might have caused problems as well. Since the meaningful groupings of ranges for the continuous attributes in this application are not very regular but are significant (like age), they might not be properly represented by summarizing them into a mean and standard deviation.
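
      For reference, a sketch of the mean-and-standard-deviation style of discretization described above. The exact bin boundaries (one standard deviation either side of the mean) and bin names are assumptions for illustration; the project's actual cut points may have differed.

    import statistics

    def discretize_by_std(values):
        # Bin a continuous attribute by its distance from the mean,
        # measured in standard deviations.
        mean = statistics.mean(values)
        std = statistics.stdev(values)
        def bin_value(x):
            if x < mean - std:
                return "low"
            if x < mean:
                return "below_mean"
            if x < mean + std:
                return "above_mean"
            return "high"
        return [bin_value(x) for x in values]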

      A very nice feature of this system is the speed at which training is done. While this system is one of the fastest to train, it still generates very good results in comparison to the others explored. The speed is due to the low computational complexity of counting actual examples and deriving a ratio for the values observed.
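
      Classification is just as cheap: multiply the class prior by the per-attribute count ratios and take the largest product. A minimal sketch, reusing the counts from the training sketch above. (Note that an attribute value never seen with a class zeroes out that class's score; smoothing the counts would avoid this, but the plain counting behavior is kept here.)

    def classify(example, priors, cond_counts, class_counts):
        # Score each class as P(class) * product of P(value | class),
        # where each conditional probability is a simple count ratio.
        best_class, best_score = None, -1.0
        for c, prior in priors.items():
            score = prior
            for i, value in enumerate(example):
                score *= cond_counts[(i, value, c)] / class_counts[c]
            if score > best_score:
                best_class, best_score = c, score
        return best_class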

      Things to try:

  • Discretizing techniques
  • Other continuous attribute handling (one alternative is sketched below)
  • Other ways to handle missing attributes
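
      As one candidate for the continuous attribute item above, a sketch of equal-width binning. The bin count k=5 is an arbitrary illustrative choice, not something the experiments settled on.

    def equal_width_bins(values, k=5):
        # Alternative discretization: split the observed range into
        # k equal-width intervals and label each value by its bin index.
        lo, hi = min(values), max(values)
        width = (hi - lo) / k or 1  # guard against a constant attribute
        return [min(int((x - lo) / width), k - 1) for x in values]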


by: Keith A. Pray
Last Modified: July 4, 2004 8:59 AM
