Design, construction and evaluation of systems to predict risk in obstetrics

D. R. Lovell, B. Rosario, M. Niranjan, R. W. Prager, K. J. Dalton, R. Derom and J. Chalmers

We present a systematic, practical approach to developing risk prediction systems, suitable for use with large databases of medical information. An important part of this approach is a novel feature selection algorithm that uses the area under the receiver operating characteristic (ROC) curve to measure the expected discriminative power of different sets of predictor variables. We describe this algorithm and use it to select variables to predict risk of a specific adverse pregnancy outcome: failure to progress in labour. We construct neural network, logistic regression and hierarchical Bayesian risk prediction models, all of which perform close to the limit attainable on this prediction task. We show that better prediction performance requires more discriminative clinical information rather than improved modelling techniques. We also show that better diagnostic criteria in clinical records would greatly assist the development of systems to predict risk in pregnancy.

Keywords: Receiver operating characteristic (ROC); Feature selection; Risk prediction in pregnancy; Failure to progress; Neural networks.
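
As a rough illustration of how ROC area can serve as a measure of discriminative power, the sketch below computes the area under the ROC curve for a single candidate predictor using the Mann-Whitney rank statistic, then orders candidates by how far their area lies from 0.5 (chance). This is not the feature selection algorithm summarised in the abstract, which scores sets of variables; the function names, variable names and toy data here are all hypothetical and chosen only for the example.

```python
def roc_area(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic.

    labels: 1 = adverse outcome, 0 = otherwise. Tied scores count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    # Fraction of (positive, negative) pairs ranked correctly.
    return wins / (len(pos) * len(neg))


def rank_variables(columns, labels):
    """Order candidate predictors by distance of their ROC area from 0.5."""
    scored = {name: roc_area(vals, labels) for name, vals in columns.items()}
    return sorted(scored.items(), key=lambda kv: abs(kv[1] - 0.5), reverse=True)


# Hypothetical toy data: six cases, two made-up candidate variables.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_columns = {
    "var_a": [3, 4, 1, 2, 5, 2],
    "var_b": [1, 0, 1, 0, 1, 0],
}
ranking = rank_variables(toy_columns, toy_labels)
```

An area of 0.5 means the variable separates the two outcome groups no better than chance, while values near 0 or 1 indicate strong separation. Ranking by distance from 0.5 therefore treats a strongly inverse relationship as just as informative as a strongly direct one.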