http://www.markwatson.com/. There is a lot of good material in it, but what really caught my attention was how clean and simple the explanation of embedding Weka in a Java application is.
If you are not familiar with Weka, I highly recommend that you go here to learn more about it: http://www.cs.waikato.ac.nz/ml/weka/
In summary, the steps to use and modify the example were as follows:
- Create a project
- Add weka.jar to the build path
- Copy and tweak the code from the example
- Examine results
Here is a screenshot of adding weka.jar to the project:
Here is the Java code (modified only slightly from the example referenced above):
import weka.classifiers.meta.FilteredClassifier;
import weka.classifiers.trees.ADTree;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.filters.unsupervised.attribute.Remove;

import java.io.BufferedReader;
import java.io.FileReader;

public class main {
    public static void main(String[] args) throws Exception {
        // Load the training data; the last attribute is the class to predict.
        Instances training_data = new Instances(new BufferedReader(
                new FileReader("test_data/weather.arff")));
        training_data.setClassIndex(training_data.numAttributes() - 1);

        // The same file is reused as test data, so the output below only
        // demonstrates the API and is not a measure of real accuracy.
        Instances testing_data = new Instances(new BufferedReader(
                new FileReader("test_data/weather.arff")));
        testing_data.setClassIndex(testing_data.numAttributes() - 1);

        // Print a short description of the data set.
        String summary = training_data.toSummaryString();
        int number_samples = training_data.numInstances();
        int number_attributes_per_sample = training_data.numAttributes();
        System.out.println("Number of attributes in model = "
                + number_attributes_per_sample);
        System.out.println("Number of samples = " + number_samples);
        System.out.println("Summary: " + summary);
        System.out.println();

        // Alternative classifier: J48 j48 = new J48();
        ADTree adt = new ADTree();

        // Remove the first attribute (indices are 1-based) before classifying.
        Remove rm = new Remove();
        rm.setAttributeIndices("1");

        // FilteredClassifier applies the filter during training and prediction.
        FilteredClassifier fc = new FilteredClassifier();
        fc.setFilter(rm);
        fc.setClassifier(adt);
        fc.buildClassifier(training_data);

        // Print the given and predicted class label for every test instance.
        for (int i = 0; i < testing_data.numInstances(); i++) {
            double pred = fc.classifyInstance(testing_data.instance(i));
            System.out.print("given value: "
                    + testing_data.classAttribute().value(
                            (int) testing_data.instance(i).classValue()));
            System.out.println(". predicted value: "
                    + testing_data.classAttribute().value((int) pred));
        }
    }
}
Here are the results:
I used the ADTree classifier on the weather.arff demo data. This is a very simple example, and I hope in the future to go into more detail about how machine learning tools like Weka can be used as part of an agent-based programming approach.
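The loop in the example prints each given and predicted value side by side, so one small extension is to count how often the two agree. The following is only a sketch under stated assumptions: it uses plain Java with no Weka dependency, it assumes the labels have already been collected into two string arrays, and the class name PredictionTally, the method accuracy, and the sample labels are all hypothetical and not taken from the original example.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of Weka or the original example.
class PredictionTally {

    // Returns the fraction of positions where the predicted label equals the given label.
    static double accuracy(String[] given, String[] predicted) {
        if (given.length != predicted.length) {
            throw new IllegalArgumentException("label arrays must have the same length");
        }
        if (given.length == 0) {
            return 0.0; // nothing to compare, so report zero rather than divide by zero
        }
        int correct = 0;
        for (int i = 0; i < given.length; i++) {
            if (given[i].equals(predicted[i])) {
                correct++;
            }
        }
        return (double) correct / given.length;
    }

    public static void main(String[] args) {
        // Made-up labels for illustration only; these are not results from the post.
        String[] given = {"yes", "no", "yes", "no"};
        String[] predicted = {"yes", "no", "no", "no"};
        // Three of the four labels match, so this prints: accuracy = 0.75
        System.out.println(String.format(Locale.ROOT, "accuracy = %.2f",
                accuracy(given, predicted)));
    }
}
```

In the Weka program, the two arrays could be filled inside the prediction loop from the same classAttribute().value(...) calls that are printed there. Weka also ships its own weka.classifiers.Evaluation class, which reports much richer statistics than this tally.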