Summary of Results

      The most accurate neural network built so far, as measured on the test data, was the Default settings net. The network and results can be found at that link. (See the attached sheet if this is a hard copy.) I was disappointed with an accuracy of only 76.96%. Only half of the planned tests have been completed; the rest are currently running. An update to this report will be made if better results are obtained.

      This pales in comparison to the ~86% accuracy of the decision tree from Project 2. Unfortunately, since the input data was so heavily transformed from the original data, it was difficult to see which inputs the neural network considered most significant. It will be very interesting to see how those inputs compare with the attributes the decision tree relied on.

      One of the main weaknesses of this system is the inability to follow the logic the neural network derives. Many hours were spent trying to verify that weight updates were actually taking place in the network, in light of the poor results compared to the decision tree. One issue was the nature of the data itself, which was largely nominal; neural nets seem especially adept at working with numeric data.

      To overcome this problem with the data, nominal-to-binary conversion was used. This in itself might have been a downfall of the system: it turned the 14-attribute (+ class attribute) data set into one with 104 attributes (+ class attribute), which made the network much more time consuming to train and debug. Without 500 or more iterations over the learning examples, very little change in classification behavior was observed. Curiously, the accuracy of the net did not increase much even when classification behavior did change.
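
To illustrate, the following is a minimal sketch of such a nominal-to-binary conversion, assuming a simple attribute list rather than the project's actual ARFF-handling code; the attribute names and values are hypothetical examples.

    # Minimal sketch of nominal-to-binary conversion (hypothetical attributes).
    # Each nominal attribute with k possible values becomes k binary inputs;
    # numeric attributes (e.g. age) are passed through unchanged.
    def nominal_to_binary(attributes, instance):
        """attributes: list of (name, values); values is None for a numeric attribute."""
        expanded = []
        for (name, values), raw in zip(attributes, instance):
            if values is None:                # numeric attribute: keep as-is
                expanded.append(float(raw))
            else:                             # nominal: one binary input per value
                expanded.extend(1.0 if raw == v else 0.0 for v in values)
        return expanded

    # Hypothetical example: 2 attributes become 1 + 3 = 4 network inputs.
    attrs = [("age", None), ("workclass", ["Private", "Self-emp", "Government"])]
    print(nominal_to_binary(attrs, [39, "Private"]))   # [39.0, 1.0, 0.0, 0.0]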

When the error sums for the hidden layer and output layer were monitored, they stabilized after only a short drop in value. It may be that the odd search space created by the majority of binary inputs (some values, such as age, were kept numeric) made the training especially likely to fall into a local minimum. Some preliminary results showed that increasing the momentum setting yielded better results, but time did not allow extensive testing in this regard. It could also be that the values used for the learning rate and momentum were too large or too small by orders of magnitude. In hindsight, given the extremely low values of the majority of the inputs, this should have been explored further.
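
As a rough illustration of the training dynamics discussed above, the sketch below shows a generic gradient step with a momentum term; it is not the project's code, and the learning rate and momentum values are placeholders.

    # Generic sketch of a weight update with momentum (placeholder settings).
    learning_rate = 0.1
    momentum = 0.9

    def update_weights(weights, gradients, prev_deltas):
        """One gradient step with momentum; returns new weights and the deltas used."""
        new_weights, new_deltas = [], []
        for w, g, prev in zip(weights, gradients, prev_deltas):
            delta = -learning_rate * g + momentum * prev   # momentum carries the previous step
            new_weights.append(w + delta)
            new_deltas.append(delta)
        return new_weights, new_deltas

    # In training, the summed error is recorded each iteration; if it stabilizes
    # after only a short drop, a larger momentum (or a very different learning
    # rate) may help the search escape a local minimum.
    print(update_weights([0.5, -0.2], [0.1, -0.3], [0.0, 0.0]))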

A very nice feature of this system is its ability to dynamically build a neural network from any data set presented in the ARFF format (described in the Project 2 Report). This was the major limitation of the face-training example presented in the book, though for educational purposes a simpler example was in order. The set of options the system supports was very useful in testing: the number of iterations, the learning rate, the random generator seed, the number of hidden nodes, and the momentum factor could all be specified on the command line.
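
The sketch below shows what such a command-line interface might look like; the flag names are illustrative stand-ins, not the project's actual options.

    import argparse

    # Hypothetical command-line options mirroring those described above.
    parser = argparse.ArgumentParser(description="Train a neural net from an ARFF data set")
    parser.add_argument("arff_file", help="data set in ARFF format")
    parser.add_argument("--iterations", type=int, default=500)
    parser.add_argument("--learning-rate", type=float, default=0.1)
    parser.add_argument("--momentum", type=float, default=0.9)
    parser.add_argument("--hidden-nodes", type=int, default=10)
    parser.add_argument("--seed", type=int, default=0, help="random generator seed")

    if __name__ == "__main__":
        args = parser.parse_args()
        print(args)   # the network would be built and trained from these settings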

One feature that was planned but that time did not allow for was an alternative to the nominal-to-binary conversion. It is possible to represent the values of a nominal attribute by enumerating them as integers. This approach was not taken at first since it was unclear how an arbitrary numeric representation of the values would bias the network. It would have been interesting to compare the two approaches.
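
A minimal sketch of that alternative, enumerating nominal values as integers instead of expanding them into binary inputs, is shown below; the attribute values are hypothetical.

    # Alternative encoding: map each nominal value to an arbitrary integer
    # instead of one binary input per value. The values here are hypothetical.
    def enumerate_nominal(values):
        return {v: i for i, v in enumerate(values)}

    workclass = ["Private", "Self-emp", "Government"]
    mapping = enumerate_nominal(workclass)
    print(mapping["Government"])   # 2 -- the ordering is arbitrary and may bias the net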


by: Keith A. Pray
Last Modified: July 4, 2004 8:59 AM