Thursday, February 19, 2009

Naive Bayesian Classifiers in Java - I

Introduction

The goals of this paper are first to give a short overview to naive bayesian classifiers and second to document an implementation of naive bayesian classifiers in the programing language Java using XML for the data input.
This paper and the associated application are subject of an seminar work for the course “ Intelligent Systems and Intelligent Agents” by Matjaz Gams at the Kaiserslautern University of Applied Sciences in march 2002.

Bayesian Learning

Other learning algorithms eliminate those hypotheses, which are not consistent to an training example. Whereas the Bayesian Learning just reduces the probability of an inconsistent hypothesis. This gives the Bayesian Learning a bigger flexibility. The Bayesian Learning Algorithms combine training data with a priori knowledge to get the a posteriori probability of an hypothesis. So it is possible to figure out the most probable hypothesis according to the training data. The basis for all Bayesian Learning Algorithms is the Bayes Rule.






P(h) = prior probability of hypothesis h
P(D) = prior probability of training data D
P(hD) = probability of h given D
P(Dh) = probability of D given h

Naive Bayes Algorithm

One of these algorithms is the Naive Bayes Algorithm, which uses Naive Bayesian Classifiers and which is subject of this seminar work.
Up to now we wondered which hypothesis is the most probable for a given dataset resp. a new instance.


But the question what Naive Bayesian Classifiers are about is which classification is the most probable for this new instance if we have a look at the training data.
For example an instance of an patient = (age,sex,weight) could be (45,male,120kg). An Naive Bayes System could calculate values for the following two classifications “cardiac insufficiency” and “no cardiac insufficiency” according to the available training data. Then the classifier with the biggest value rules the hypothesis. Whereby the conditional independence of the attributes of the instances is required for the use of Naive Bayesian Classifiers.

How does it look brought into a formula?

X be a set of instances xi = (a1,a2,…,an)
V be a set of classifications vj
Naive Bayes assumption:













This leads to the following algorithm:

Naive_Bayes_Learn (examples)
for each target value vj
estimate P(vj)
for each attribute value ai of each attribute a
estimate P(ai vj )

Classify_New_Instance (x)

In my implementation I chose an algorithm, that is a little different as you will see.


The Implementation

When I started to design this little application, I made three decisions. 1. the application should be useable for different kinds of training data, whereby only two targets (classifications) are allowed. 2. It should use a GUI for the communication with the user. 3. the application should be platform independent.
That’s why I chose Java as programming language for this project and XML for the representation of the training data.

The Java Classes

NaiveBayesApplication:
This class is responsible for the GUI and inherits from the javax.swing.JFrame class. It has the following important attributes:
- NaiveBayesLearner learner
- String domain (stores the name of the training data domain)
- Vector stringkeys (in this vector keywords were stored according to the instance requested by the user)
-
Thera are some other attributes which are only important for the GUI and the interaction with the user, but not for the algorithm.


The methods are:
- main() (instantiates an object of NaiveBayesApplication)
- loadTrainingData() (loads an XML-training data file with a DOMParser, calls the method getGUIPanelFromXML(), instantiates the attribut learner and calls its method learn() )
- getGUIPanelFromXML(Node,Node) (creates the GUI out of two Node objects according to the org.w3c.dom)
- calculate() (calls the calculate(stringkeys) method of the attribute learner with the vector stringkeys as parameter and actualizes the result on the screen from a Panel which is delivered by the called method)
- createActions() (creates AbstractActions objects, which are triggered when the user clicks the according buttons)

NaiveBayesLearner:
This class actually includes the algorithm. It has the following attributes:
- int numberofdatasets (used to store the number of training data examples)
- int numberofyes (used to store the number of training data examples which have positive result)

- int numberofno (used to store the number of training data examples which have negative result)
- String domain (stores the name of the training data domain)
- Hashtable attributevalues (This hashtable is filled with AttributeValue objects according to all possible values that the different attributes of the current training data can have. String objects where used as keys. AttributeValue objects count how often they appear in the training data with a positive and how often with a negative result.)

… and the following methods:
- learn(Node,Node) (fills up the attributevalues hashtable with AttributeValues objects and brings those to count their number of appeareances depending on the result.)
- calculate(Vector) (calculates, according to the Strings in the given Vector, the two classifiers, for the according AttributeValue objects, and the hypothesis. These classifiers and the hypothesis were put into a Panel and returned)

note: This is the typical estimation of P(aivj):




Where
n: examples with v=v;
p is prior estimate for P(aivj)
nc: examples with a=ai,
m is an empirical smoothing factor depending on the training data

Since I liked to make the application useable for different kinds of training data, I chose the following estimation, which is sufficient, if the number of training datasets is not very small.
Used Estimation of P(aivj):
P(ai vj) <= nc / n if (nc =0) => take 1 instead of nc

AttributeValue:
This class has these attributes:
- String name (stores the String which describes this AttributeValue object)
- int counteryes (stores the number of appearances of this AttributeValue object on positve results)
- int counterno (stores the number of apperances of this AttributeValue object on negative results)
and these methods:
- countyes() (increments counteryes)

- countno() (increments counterno)
- as well as get methods for counteryes and counterno.

No comments:

Post a Comment