
Choice of database

 

The QAMC project is fortunate to have access to a variety of large perinatal databases from across Europe. Databases from Scotland, Baden-Württemberg, Wales and Finland provide over 1.2 million case records. However, while it is universally agreed that collecting perinatal data is a good thing, there is considerably less agreement about what information to collect, and no agreement whatsoever about how that data should be stored. Consequently, these databases could not be usefully amalgamated and we had to decide on a particular dataset to work with.

The experiments described in [5,6,7] and the QAMC web page make use of the largest dataset available to us: the Scottish Morbidity Record (known as the SMR2), which contains 771 571 singleton births that occurred between 1980 and 1991. Cases in which elective Caesarean section or breech presentation occurred were removed from the dataset[*]. For the training and testing of various risk prediction systems, these records were partitioned into a learning set of births occurring between 1980 and 1988 (540 905), and a testing set of births occurring between 1989 and 1991 (176 812). Because we only have access to retrospective data, this partition by year was used to simulate a prospective trial of the prediction systems: each system is tested only on births that occurred after all of the births it learned from.
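The exclusion and year-based partition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the project: the record layout and the field names `birth_year`, `elective_caesarean` and `breech` are hypothetical and do not correspond to actual SMR2 field names.

```python
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def exclude_cases(records: Iterable[Record]) -> List[Record]:
    """Drop elective Caesarean and breech cases.

    The field names are hypothetical placeholders, not SMR2 fields.
    """
    return [
        r for r in records
        if not r["elective_caesarean"] and not r["breech"]
    ]


def temporal_split(
    records: Iterable[Record],
    last_learning_year: int = 1988,
) -> Tuple[List[Record], List[Record]]:
    """Partition records by year of birth.

    Births up to and including last_learning_year go to the learning set;
    later births go to the testing set, so testing data always postdates
    learning data, as it would in a prospective trial.
    """
    learning: List[Record] = []
    testing: List[Record] = []
    for r in records:
        if r["birth_year"] <= last_learning_year:
            learning.append(r)
        else:
            testing.append(r)
    return learning, testing
```

Splitting on the year of birth, rather than sampling records at random, keeps every testing case later in time than every learning case, which is what allows retrospective data to stand in for a prospective trial.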



D.R. Lovell
Mon Sep 15 18:08:31 BST 1997