MEGA TOOL Dataset Selection

Balloons data, 4 attributes, 20 data instances ( more info… )

Cars data, 6 attributes, 1728 data instances ( more info… )

Balance data, 4 attributes, 625 data instances ( more info… )

To use your own data, the file must be a CSV file in the following format:

Attribute1NAME : Attribute1VAL1, Attribute1VAL2
Attribute2NAME : Attribute2VAL1, Attribute2VAL2
TargetAttributeNAME : TargetAttributeVAL1, TargetAttributeVAL2

Attribute1DataVal1, Attribute2DataVal1, TargetAttributeDataVal1
Attribute1DataVal2, Attribute2DataVal2, TargetAttributeDataVal2

The file starts with the attribute definitions. Each definition is the attribute name, then a colon, then the possible values separated by commas. The data rows follow after all the attribute definitions. Whitespace is stripped out, so don’t worry about that. An example of a dataset would be:

Att1: 1,2,3,4,5
Att2: 1,2,3,4,5
Att3: 1,2,3,4,5
Att4: false, true
Target: L, B, R


This is a sample file with 4 attributes and 1 target attribute ( named “Target” ), followed by 5 data rows.
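To make the format concrete, here is a rough Python sketch of how a file like this could be read and checked before uploading it. This is not the tool's own parser, just my reading of the format described above. The function names `parse_dataset` and `check_rows` and the tiny sample data are made up for illustration, and the sketch assumes that a line containing a colon is an attribute definition, that data values never contain colons, and that the data columns come in the same order as the definitions.

```python
# Hypothetical helpers, not part of the tool: a sketch of reading the format
# described in this post (definitions first, then comma-separated data rows).

def parse_dataset(text):
    """Return (attributes, rows) parsed from the file contents."""
    attributes = {}  # attribute name -> list of declared values, in file order
    rows = []        # each data row as a list of strings
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines only separate the two parts of the file
        if ":" in line:
            # Assumption: only definition lines contain a colon.
            name, values = line.split(":", 1)
            attributes[name.strip()] = [v.strip() for v in values.split(",")]
        else:
            rows.append([v.strip() for v in line.split(",")])
    return attributes, rows


def check_rows(attributes, rows):
    """List problems: wrong column counts or values that were never declared."""
    names = list(attributes)  # assumed to match the column order of the rows
    problems = []
    for number, row in enumerate(rows, start=1):
        if len(row) != len(names):
            problems.append(
                f"row {number}: expected {len(names)} values, got {len(row)}"
            )
            continue
        for name, value in zip(names, row):
            if value not in attributes[name]:
                problems.append(
                    f"row {number}: {value!r} is not a declared value of {name}"
                )
    return problems


# Invented sample data, only here to exercise the functions.
SAMPLE = """
Colour : red, blue
Size : small, large
Inflated : T, F

red, small, T
blue, large, F
"""

attributes, rows = parse_dataset(SAMPLE)
print(attributes)
print(check_rows(attributes, rows))
```

An empty list from `check_rows` means every row has the right number of columns and only uses declared values. Everything is kept as a string, so a value like 3 is simply the category "3".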

All data must be categorical. The tool does not understand numerical data and will simply treat each distinct number as a category.

To use a data file with the tool, upload it somewhere on the web, then go to the tool page and pass the URL of the data file in the “file” parameter.

For example, each of the links above takes you to the tool with the corresponding datafile loaded.


Here we use the “file” parameter on the URL to pass in the location of the cars datafile.
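If you would rather build such a link in code, something like the sketch below should work. The addresses `http://example.com/tool` and `http://example.com/data/cars.csv` are placeholders, not the real locations, and `tool_link` is a made-up helper name. The sketch also assumes the tool page decodes a percent-encoded “file” value in the usual way.

```python
from urllib.parse import urlencode


def tool_link(tool_page, data_url):
    """Build a link to the tool page with the datafile URL in the "file" parameter."""
    # urlencode percent-encodes the datafile URL so it is safe inside a query string.
    return f"{tool_page}?{urlencode({'file': data_url})}"


# Placeholder addresses: swap in the real tool page and your own datafile URL.
link = tool_link("http://example.com/tool", "http://example.com/data/cars.csv")
print(link)
```

Encoding the datafile URL matters mostly when it has its own query string or special characters, which would otherwise be mixed up with the parameters of the tool page itself.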

At some point I may add some more datafiles or allow direct upload of a file to the tool. Thank you!
