Keith A. Pray - Professional and Academic Site
Project 2 Report
Code Description
The code used in this project is from the Weka project (www.cs.waikato.ac.nz/ml/weka/). Weka is a collection of machine learning algorithms for solving real-world data mining problems. It is written in Java and runs on almost any platform. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka is also well suited for developing new machine learning schemes. Weka is open source software issued under the GNU General Public License.
Weka provides some basic learning algorithms for educational
purposes, among other reasons. One of these is ID3.
ID3 forms the basis of C4.5, which will be used in this project.
The source is provided for both algorithms, but for illustrative
purposes, ID3 will be examined here. Weka's implementation
of C4.5 (actually C4.5 Version 8) handles some issues with ID3.
Weka ID3 implements three major
methods,
The only adaptation needed was to specify that pruning not be used.
This was possible through command line arguments.
The Weka package assumes data be in
It is now time to construct the tree.
This is a straightforward implementation of the formula

$$\mathrm{Gain}(S, A) \equiv \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} \, \mathrm{Entropy}(S_v)$$

It calculates the entropy of the entire data set and then
uses the same method for splitting a data set as
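
The gain formula above can be sketched in a few lines of Java. This is a minimal illustration and not Weka's code: the class name InfoGainSketch, the methods entropy and infoGain, and the toy "wind"/"play" data are all assumptions made for this example. It also assumes the usual definition of entropy over class labels, Entropy(S) = -sum of p_c * log2(p_c), which the text here does not spell out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; names and data are hypothetical, not taken from Weka.
public class InfoGainSketch {

    /** Entropy(S) = -sum over classes c of p_c * log2(p_c), in bits. */
    static double entropy(List<String> labels) {
        // Count how often each class label occurs in S.
        Map<String, Integer> counts = new HashMap<>();
        for (String label : labels) {
            counts.merge(label, 1, Integer::sum);
        }
        double result = 0.0;
        for (int count : counts.values()) {
            double p = (double) count / labels.size();
            result -= p * Math.log(p) / Math.log(2);
        }
        return result;
    }

    /** Gain(S, A) = Entropy(S) - sum over v of (|S_v| / |S|) * Entropy(S_v). */
    static double infoGain(List<String> attributeValues, List<String> labels) {
        // Split the class labels into one subset S_v per observed attribute value v.
        Map<String, List<String>> subsets = new HashMap<>();
        for (int i = 0; i < labels.size(); i++) {
            subsets.computeIfAbsent(attributeValues.get(i), v -> new ArrayList<>())
                   .add(labels.get(i));
        }
        // Start from the entropy of the whole set and subtract the weighted subset entropies.
        double gain = entropy(labels);
        for (List<String> subset : subsets.values()) {
            gain -= ((double) subset.size() / labels.size()) * entropy(subset);
        }
        return gain;
    }

    public static void main(String[] args) {
        // Hypothetical toy data: 14 examples with one nominal attribute and a yes/no class.
        List<String> wind = Arrays.asList(
            "weak", "strong", "weak", "weak", "weak", "strong", "strong",
            "weak", "weak", "weak", "strong", "strong", "weak", "strong");
        List<String> play = Arrays.asList(
            "no", "no", "yes", "yes", "yes", "no", "yes",
            "no", "yes", "yes", "yes", "yes", "yes", "no");

        System.out.println("Entropy(S)    = " + entropy(play));
        System.out.println("Gain(S, wind) = " + infoGain(wind, play));
    }
}
```

For this toy data (9 "yes" and 5 "no" labels), the entropy of the full set is about 0.940 bits and the gain for the attribute is about 0.048. An ID3-style learner would compute this gain for every candidate attribute and split on the one with the largest value.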
The main difference between this ID3 algorithm and the one used in Weka's C4.5 is in the data model used to represent the training instances. The model decides exactly how each attribute is determined for the information gain function and how to split continuous-valued attributes.

by: Keith A. Pray
Last Modified: July 4, 2004 8:58 AM