Decision Trees and Data Mining Software

There is quite a lot of commercial and open-source decision tree and data mining software out there. The code used in the tutorials on this site is basically the ID3 decision tree algorithm (plus a few extras) written in JavaScript and DHTML. It isn't automated: the idea is that the user follows the steps of the algorithm themselves in order to see how it works.
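
For anyone who wants to see the core of ID3 outside the tutorials, here is a minimal Python sketch of the attribute selection step: it computes the entropy of the class labels and picks the attribute with the highest information gain. This is not the JavaScript used on this site. The function names and the toy weather rows are made up for illustration only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting `rows` on `attribute`."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        # Weight each branch by the fraction of rows that fall into it.
        remainder += (len(subset) / len(rows)) * entropy(subset)
    return base - remainder


# Hypothetical toy data, invented for this example.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "yes"},
]

# ID3 splits on the attribute with the highest information gain.
best = max(["outlook", "windy"], key=lambda a: information_gain(rows, a, "play"))
print(best)
```

A full ID3 implementation would apply this selection recursively to each branch until the rows in a branch all share one class or no attributes are left.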

Please add to this page if you have information on data mining software. I'm particularly interested in the commercial stuff, if anyone has experience of using it.

WEKA

http://www.cs.waikato.ac.nz/ml/weka/

Waikato Environment for Knowledge Analysis. A sophisticated software suite produced to accompany a comprehensive introductory-level book on data mining (see Books section).
A wide range of data mining and machine learning techniques is available in WEKA, including an implementation of the last public version of the influential C4.5 decision tree learner.
This can be used via the graphical interface, which gives access to many features, including visualisation and analysis of the many data mining algorithms available. The decision trees are constructed automatically once parameters have been specified and are displayed as ASCII text. Several sample datasets are included, and custom data can also be used. The functionality of the suite is fully documented, as is the available source code, and the suite is intended (although not required) to be used with the textbook. WEKA is a useful package that allows comparison of different machine learning techniques. It is also open source and is written in Java.

C4.5

http://www.cse.unsw.edu.au/~quinlan/

Link to Ross Quinlan's home page, where the C source code for C4.5 (and also FOIL, inductive logic programming software) can be downloaded. C4.5 is a powerful, state-of-the-art decision tree learning algorithm. It can handle numerical data and uses sophisticated splitting and pruning techniques.
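
To give a rough idea of what handling numerical data involves, here is a simplified Python sketch that finds a binary threshold split on one numeric attribute. It is not Quinlan's code: it assumes candidate thresholds are the midpoints between adjacent sorted values and scores them with plain information gain, whereas C4.5 itself uses a gain ratio criterion and adds pruning. The function names and the temperature data are made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the best split of the form value <= threshold."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no boundary between equal values
        threshold = (lo + hi) / 2  # assumption: midpoint candidates
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - weighted
        if gain > best[1]:
            best = (threshold, gain)
    return best


# Hypothetical temperature readings and class labels, invented for this example.
temps = [64, 68, 70, 75, 80, 85]
play = ["yes", "yes", "yes", "no", "no", "no"]

threshold, gain = best_threshold(temps, play)
print(threshold, gain)
```

In a tree learner the chosen threshold would become a test such as "temperature <= threshold" at a node, and the same search would be repeated on each branch.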

See5

http://rulequest.com

The commercial successor to C4.5. A demo version can easily be obtained from the website; it is limited in that it cannot process more than 400 training or test instances. With a very clean, usable interface, See5 is extremely easy for a newcomer to the field to use, yet sufficiently powerful for an expert to obtain excellent results. A range of sample data is available, and user-specified data can be loaded from a file of comma-separated values. Test data, misclassification costs and a range of options can be specified before a decision tree is constructed. These options include pruning, winnowing, boosting and the ability to set fuzzy thresholds. The decision tree is generated as an ASCII representation with misclassification rates. The tree can then be tested for its prediction accuracy using new (user-defined) instances and can also be cross-examined with instances from the training or test sets. Output can also be converted to a set of rules.

Orange

http://magix.fri.uni-lj.si/orange

Data mining software suite. Similar to WEKA, this has several machine learning algorithms available. As far as decision trees go, there is C4.5. You can add your own extensions using the Python scripting language. Works on both Windows and Linux/Unix.

C# id3 code

http://www.codeproject.com/csharp/id3.asp

LISP id3 code

http://www.cs.cmu.edu/afs/cs/project/theo-11/www/decision-trees.html

OC1

http://www.cs.jhu.edu/~salzberg/announce-oc1.html

A decision tree classifier written in C, intended for non-commercial use with data that has numeric attribute values.

ITI (Incremental Tree Inducer)

http://www-lrn.cs.umass.edu/iti/index.html

An incremental decision tree inducer written in C.

See5

What's the difference between See5 and C4.5? Has anyone used See5 much?

Re:See5

I think there are some minor differences, like the rule generation being faster. Of course it's commercial, so I only tried the demo version, which is limited to 500 rows of data, I think. It's very easy to use; give it a try.

Source code for Fuzzy Decision Tree

Can you give me a reference to free source code for a fuzzy decision tree?

I only try one text mining

I have only tried one text mining tool. It is a freeware program for extracting text from files of the following types: pdf, doc, rtf, chm, html, without needing any other programs installed.
