Data Sets

Data_2_2:

                This simple data set is in 2-d space and has 2 clusters. The total number of points is 10.
                    The format of the data file is
                            # points     #dimensions     
                            feature1  feature2 ...
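
A minimal sketch of a reader for this layout, assuming that "# points" and "#dimensions" stand for two integer counts on the first line and that all values are separated by whitespace. The function name parse_points and the sample values are hypothetical; they are not part of the data set or of any tool distributed with it.

```python
def parse_points(text):
    """Parse a header line followed by one whitespace-separated point per line.

    Assumed layout (an interpretation of the format description, not a spec):
        <number of points> <number of dimensions>
        feature1 feature2 ...
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows or len(rows[0]) < 2:
        raise ValueError("missing header line")
    # Only the first two header fields are read, so a trailing third count
    # (as in the other data sets) is tolerated and ignored here.
    n_points, n_dims = int(rows[0][0]), int(rows[0][1])
    points = [[float(tok) for tok in row] for row in rows[1:]]
    if len(points) != n_points:
        raise ValueError(f"header promises {n_points} points, found {len(points)}")
    for i, point in enumerate(points):
        if len(point) != n_dims:
            raise ValueError(f"point {i} has {len(point)} features, expected {n_dims}")
    return points


# Made-up values for illustration only; these are not taken from Data_2_2.
sample = """3 2
1.0 2.0
3.5 4.5
5.0 6.0
"""
points = parse_points(sample)
```

Checking the row count and row width against the header catches truncated or mis-delimited files early, before any clustering is attempted.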

                           If you use this data, please cite the following references:

                           S. Bandyopadhyay and U. Maulik, ``An Evolutionary Technique Based on K-Means for
                           Optimal Clustering in R^N'', Information Sciences, vol. 146, pp. 221-237, 2002

                           U. Maulik and S. Bandyopadhyay, ``Genetic Algorithm-based Clustering Technique'',
                           Pattern Recognition, vol. 33, pp. 1455-1465, 2000

                           S. Bandyopadhyay and S. K. Pal, ``Classification and Learning Using Genetic
                           Algorithms: Applications in Bioinformatics and Web Intelligence'', Springer,
                           Heidelberg, 2007


Data_3_2:

                This data is in 2-d space and has 3 clusters. The total number of points is 76.
                    The format of the data file is
                            # points     #dimensions    #clusters
                            feature1  feature2 ...

                           If you use this data, please cite the following references:


                           S. Bandyopadhyay and U. Maulik, ``Genetic Clustering for Automatic Evolution
                           of Clusters and Application to Image Classification'',  Pattern Recognition, vol.35,
                           pp. 1197-1208, 2002


                           U. Maulik and S. Bandyopadhyay, ``Genetic Algorithm-based Clustering Technique'',
                           Pattern Recognition, vol. 33, pp. 1455-1465, 2000

                           S. Bandyopadhyay and S. K. Pal, ``Classification and Learning Using Genetic
                           Algorithms: Applications in Bioinformatics and Web Intelligence'', Springer,
                           Heidelberg, 2007

 

Data_5_2 or AD_5_2: This data is in 2-d space and has 5 clusters. The total number of points is 250.
                    The format of the data file is
                            # points     #dimensions    #clusters
                            feature1  feature2 ...  class_number
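
A minimal sketch of a reader for this labeled layout, assuming the first line holds three integer counts (points, dimensions, clusters) and each following line holds the features and then the class number, all separated by whitespace. The function name parse_labeled_points and the sample values are hypothetical. Whether class numbers start at 0 or 1 is not stated here, so the sketch makes no assumption about it.

```python
def parse_labeled_points(text):
    """Parse '<points> <dimensions> <clusters>' followed by labeled rows.

    Assumed row layout (an interpretation of the format description):
        feature1 feature2 ... class_number
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows or len(rows[0]) < 3:
        raise ValueError("header must hold points, dimensions and clusters")
    n_points, n_dims, n_clusters = (int(tok) for tok in rows[0][:3])

    features, labels = [], []
    for i, row in enumerate(rows[1:]):
        if len(row) != n_dims + 1:
            raise ValueError(f"row {i} has {len(row)} fields, expected {n_dims + 1}")
        features.append([float(tok) for tok in row[:n_dims]])
        # Going through float() also accepts a label written as "1.0".
        labels.append(int(float(row[n_dims])))

    if len(features) != n_points:
        raise ValueError(f"header promises {n_points} points, found {len(features)}")
    if len(set(labels)) > n_clusters:
        raise ValueError("more distinct class numbers than clusters in the header")
    return features, labels, n_clusters


# Made-up values for illustration only; these are not taken from Data_5_2.
sample = """4 2 2
0.0 0.1 1
0.2 0.0 1
5.0 5.1 2
5.2 4.9 2
"""
features, labels, n_clusters = parse_labeled_points(sample)

# Group the points by class number, e.g. to compare against a clustering result.
by_class = {}
for point, label in zip(features, labels):
    by_class.setdefault(label, []).append(point)
```

Keeping the class numbers separate from the features makes it harder to feed the ground-truth labels to a clustering algorithm as an extra dimension by accident.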

                           If you use this data, please cite the following references:

                           S. Bandyopadhyay and U. Maulik, ``Nonparametric genetic clustering: Comparison of
                           validity indices'', IEEE Transactions on Systems, Man and Cybernetics, Part C,
                           vol. 31, no. 1, pp. 120-125, 2001


                           S. Bandyopadhyay and U. Maulik, ``Genetic Clustering for Automatic Evolution
                           of Clusters and Application to Image Classification'',  Pattern Recognition, vol.35,
                           pp. 1197-1208, 2002

                           S. Bandyopadhyay and S. K. Pal, ``Classification and Learning Using Genetic
                           Algorithms: Applications in Bioinformatics and Web Intelligence'', Springer,
                           Heidelberg, 2007

 

Data_6_2: This data is in 2-d space and has 6 clusters. The total number of points is 300.
                    The format of the data file is
                            # points     #dimensions    #clusters
                            feature1  feature2 ...  class_number

                           If you use this data, please cite the following reference:


                           S. Bandyopadhyay and U. Maulik, ``Genetic Clustering for Automatic Evolution
                           of Clusters and Application to Image Classification'',  Pattern Recognition, vol.35,
                           pp. 1197-1208, 2002

Data_4_3 or AD_4_3: This data is in 3-d space and has 4 clusters. The total number of points is 400.
                    The format of the data file is
                            # points     #dimensions    #clusters
                            feature1  feature2 ...  class_number

                           If you use this data, please cite the following references:

                           S. Bandyopadhyay and U. Maulik, ``Nonparametric genetic clustering: Comparison of
                           validity indices'', IEEE Transactions on Systems, Man and Cybernetics, Part C,
                           vol. 31, no. 1, pp. 120-125, 2001


                           S. Bandyopadhyay and U. Maulik, ``Genetic Clustering for Automatic Evolution
                           of Clusters and Application to Image Classification'',  Pattern Recognition, vol.35,
                           pp. 1197-1208, 2002

                           S. Bandyopadhyay and S. K. Pal, ``Classification and Learning Using Genetic
                           Algorithms: Applications in Bioinformatics and Web Intelligence'', Springer,
                           Heidelberg, 2007

 

Data_10_2 or AD_10_2: This data is in 2-d space and has 10 clusters. The total number of points is 500.
                    The format of the data file is
                            # points     #dimensions    #clusters
                            feature1  feature2 ... 

                           If you use this data, please cite the following references:

                           S. Bandyopadhyay and U. Maulik, ``Nonparametric genetic clustering: Comparison of
                           validity indices'', IEEE Transactions on Systems, Man and Cybernetics, Part C,
                           vol. 31, no. 1, pp. 120-125, 2001


                           S. Bandyopadhyay and U. Maulik, ``Genetic Clustering for Automatic Evolution
                           of Clusters and Application to Image Classification'',  Pattern Recognition, vol.35,
                           pp. 1197-1208, 2002

                           S. Bandyopadhyay and S. K. Pal, ``Classification and Learning Using Genetic
                           Algorithms: Applications in Bioinformatics and Web Intelligence'', Springer,
                           Heidelberg, 2007

 

 

Data_9_2: Also sometimes referred to as st900_2_9. This data is in 2-d space and has 9 clusters. The total number of points is 900.
                        The format of the data file is
                            # points     #dimensions    #clusters
                            feature1  feature2 ... 

                         If you use this data, please cite the following references:

                           S. Bandyopadhyay, C. A. Murthy and S. K. Pal, ``Pattern Classification Using
                           Genetic Algorithms'', Pattern Recognition Letters, vol. 16, pp. 801-808, August 1995

                           S. Bandyopadhyay and S. K. Pal, ``Classification and Learning Using Genetic
                           Algorithms: Applications in Bioinformatics and Web Intelligence'', Springer,
                           Heidelberg, 2007