1-1hit |
Sombut FOITONG Ouen PINNGERN Boonwat ATTACHOO
Feature selection (FS) plays an important role in pattern recognition and machine learning. FS is applied to dimensionality reduction and its purpose is to select a subset of the original features of a data set which is rich in the most useful information. Most existing FS methods based on rough set theory focus on dependency function, which is based on lower approximation as for evaluating the goodness of a feature subset. However, by determining only information from a positive region but neglecting a boundary region, most relevant information could be invisible. This paper, the maximal lower approximation (Max-Certainty) – minimal boundary region (Min-Uncertainty) criterion, focuses on feature selection methods based on rough set and mutual information which use different values among the lower approximation information and the information contained in the boundary region. The use of this idea can result in higher predictive accuracy than those obtained using the measure based on the positive region (certainty region) alone. This demonstrates that much valuable information can be extracted by using this idea. Experimental results are illustrated for discrete, continuous, and microarray data and compared with other FS methods in terms of subset size and classification accuracy.