Dr. Sumeet Dua

Max P. & Robbie L. Watson Eminent Scholar Chair


Feifei Xu (2007)


Unsupervised Feature Selection Filter Method Based on Information Gain; MS-CS Thesis, Student: Feifei Xu (2007).

Rapidly maturing microarray technologies allow scientists to monitor and measure the expression levels of thousands of genes in a single experiment. However, the high dimensionality of microarray data has become a challenge for discriminant analysis. We need ways to reduce the dimensionality while preserving the characteristics of the dataset. To this end, various feature selection methods have been developed. Most of these methods, however, can remove only irrelevant features and cannot remove redundant ones, so the redundant features that remain reduce prediction accuracy.
We propose a novel unsupervised filter method, information gain based measurement (IGM), for feature selection. The method reduces redundancy during the selection process while retaining more of the information in the original dataset. Clustering results improve when the K-means method is run on the selected features, and the selected features also yield high accuracy. Extensive experiments demonstrate the effectiveness of our method compared with existing methods.
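
The abstract does not give the IGM procedure itself, so the following is only a minimal sketch of the general idea of an unsupervised, entropy-based filter that penalizes redundancy. It is not the thesis's algorithm. The function names (discretize, entropy, joint_entropy, select_features), the equal-width binning, and the greedy rule are all assumptions made for illustration: the first feature chosen has the highest entropy, and each later feature is the one that adds the most new information, measured as its conditional entropy given the closest already-selected feature.

```python
import numpy as np

def discretize(X, bins=4):
    """Equal-width binning of each column into integer codes (assumed preprocessing)."""
    X = np.asarray(X, dtype=float)
    codes = np.empty(X.shape, dtype=int)
    for j in range(X.shape[1]):
        col = X[:, j]
        # Interior bin edges only; np.digitize maps values to 0..bins-1.
        edges = np.linspace(col.min(), col.max(), bins + 1)[1:-1]
        codes[:, j] = np.digitize(col, edges)
    return codes

def entropy(codes):
    """Shannon entropy (in bits) of a vector of integer codes."""
    _, counts = np.unique(codes, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def joint_entropy(a, b):
    """Entropy of the pair (a, b), encoded as a single integer per sample."""
    return entropy(a * (b.max() + 1) + b)

def select_features(X, k, bins=4):
    """Greedy, label-free selection of up to k column indices.

    Hypothetical scoring rule: a candidate j is scored by
    min over selected s of H(j | s) = H(j, s) - H(s),
    so a copy of an already-selected feature scores about zero.
    """
    codes = discretize(X, bins)
    n_features = codes.shape[1]
    h = [entropy(codes[:, j]) for j in range(n_features)]
    selected = [int(np.argmax(h))]
    while len(selected) < min(k, n_features):
        best, best_gain = None, -1.0
        for j in range(n_features):
            if j in selected:
                continue
            gain = min(
                joint_entropy(codes[:, j], codes[:, s]) - h[s]
                for s in selected
            )
            if gain > best_gain:
                best, best_gain = j, gain
        selected.append(best)
    return selected
```

In this sketch a feature that duplicates an already-selected one contributes no additional information and is passed over in favor of an independent feature, which is the kind of redundancy removal the abstract describes. The selected columns could then be passed to any K-means implementation for evaluation.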
