Feature selection using Expected Attainable Discrimination

D. R. Lovell, C. R. Dance, M. Niranjan, R. W. Prager, K. J. Dalton and R. Derom

We propose expected attainable discrimination (EAD) as a measure for selecting discrete-valued features for reliable discrimination between two classes of data. EAD is the average area under the ROC curves obtained when a simple histogram probability density model is trained and tested on many random partitions of a data set. EAD can be incorporated into various stepwise search methods to identify promising subsets of features, particularly when misclassification costs are difficult or impossible to specify. An experimental application to the problem of risk prediction in pregnancy is described.

Keywords: Receiver operating characteristic (ROC); Area under the ROC curve; Feature selection; Risk prediction in pregnancy; Failure to progress.
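
The EAD computation summarised in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the number of partitions, the 50/50 train/test split, and the smoothing constant `alpha` are all assumptions made for illustration, as is the choice to score each histogram cell by a smoothed class-1 proportion; the abstract specifies only a histogram model, random partitions, and averaging of the area under the ROC curve.

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 1/2)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))

def histogram_scores(X_train, y_train, X_test, alpha=1.0):
    """Score each test row by a smoothed class-1 proportion in its histogram cell.

    A cell is one distinct combination of discrete feature values.
    alpha is an assumed smoothing weight that pulls sparse or unseen cells
    towards the class-1 proportion of the training set.
    """
    counts = {}
    for row, label in zip(map(tuple, X_train), y_train):
        counts.setdefault(row, [0, 0])[label] += 1
    prior1 = float(np.mean(y_train))
    scores = []
    for row in map(tuple, X_test):
        c0, c1 = counts.get(row, (0, 0))
        scores.append((c1 + alpha * prior1) / (c0 + c1 + alpha))
    return np.array(scores)

def expected_attainable_discrimination(X, y, n_partitions=50,
                                       test_fraction=0.5, seed=0):
    """Average test-set AUC over random train/test partitions (assumed settings)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(test_fraction * n))
    aucs = []
    for _ in range(n_partitions):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        if len(np.unique(y[test])) < 2:
            continue  # AUC is undefined without both classes in the test set
        s = histogram_scores(X[train], y[train], X[test])
        aucs.append(auc(s, y[test]))
    return float(np.mean(aucs))

# Synthetic illustration (made-up data): one informative and one noise feature.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=400)
informative = np.where(rng.random(400) < 0.2, 1 - y, y)  # agrees with y ~80% of the time
noise = rng.integers(0, 3, size=400)                     # unrelated to y
ead_informative = expected_attainable_discrimination(informative[:, None], y)
ead_noise = expected_attainable_discrimination(noise[:, None], y)
```

In a stepwise search, a candidate feature subset would be passed as the columns of `X`, and the subset with the highest returned value would be preferred at each step.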