Case Studies

EMERSE and Child Behavioral Health

Investigators in Child Behavioral Health at the University of Michigan have been working to identify an unbiased sample of patients with disorders of sex development (DSD) at four investigation sites: the University of Michigan, Children's Hospital of Philadelphia, Albany Medical College, and the University of Miami. To avoid the selection bias that case ascertainment based on idiosyncratic search algorithms could introduce, site co-investigators were supplied with a list of ICD-9 diagnostic codes to guide their search. The electronic billing or record-keeping systems at each site were used to construct a preliminary list of patients. This list was then cross-matched with demographic variables (e.g., patient date of birth), which are also often stored in such electronic databases.

Following this strategy, the investigators quickly learned that ICD-9 diagnostic codes combined with demographic information could not, without extensive review of dictated notes, deliver a sample frame of eligible patients. For example, the ICD-9 code 752.61 (hypospadias) returned a vast number of 46,XY patients, only a subset of whom would be categorized as DSD according to the consensus statement that introduced the new nomenclature. In another example, the investigators were uncertain whether providers would use a series of other ICD-9 codes at all for patients classified as DSD, including 752.4 (Abnormalities of cervix, vagina, and external female genitalia), 752.40 (Unspecified anomaly of cervix, vagina, and external female genitalia), 752.49 (Other anomalies of cervix, vagina, and external female genitalia), or 752.69 (Other penile anomalies). Culling the substantial number of ultimately ineligible cases was immensely time-consuming.

The investigators thus learned that although the ICD-9 codes helped narrow the pool of candidate patients, the codes were crude: they did not capture the nuances of a patient's diagnosis. Appreciating how labor-intensive this stage of the research plan had turned out to be, the principal investigator at the University of Michigan turned to EMERSE to search the dictated notes and other electronic medical record (EMR) elements of the patients identified using a combination of ICD-9 codes and demographic qualifiers. EMERSE validated the accuracy of the cases identified by research staff by generating an almost identical list of patients. It accomplished this task, however, in approximately one-third of the time.
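
The two-stage screening described in this case study (a coarse ICD-9 code filter, followed by a text search of dictated notes) can be illustrated with a minimal Python sketch. This is not how EMERSE is implemented. The patient records, the note text, and the search terms below are all hypothetical and invented for illustration; only the ICD-9 codes come from the case study.

```python
import re

# Hypothetical records: (patient_id, set of ICD-9 codes, dictated note text).
# Every value here is invented for illustration only.
RECORDS = [
    ("P1", {"752.61"}, "Isolated distal hypospadias, repaired without complication."),
    ("P2", {"752.61"}, "Proximal hypospadias; evaluation for disorder of sex development."),
    ("P3", {"752.49"}, "Follow-up visit in the DSD clinic."),
    ("P4", {"493.90"}, "Asthma follow-up."),
]

# ICD-9 codes named in the case study.
CANDIDATE_CODES = {"752.61", "752.4", "752.40", "752.49", "752.69"}

# Hypothetical search terms; a real study would define its own list.
SEARCH_TERMS = ["disorder of sex development", "disorders of sex development", "DSD"]


def code_screen(records, codes):
    """Stage 1: keep records that carry at least one candidate ICD-9 code."""
    return [r for r in records if r[1] & codes]


def note_screen(records, terms):
    """Stage 2: keep records whose note text mentions any search term."""
    pattern = re.compile(
        "|".join(r"\b" + re.escape(t) + r"\b" for t in terms),
        re.IGNORECASE,
    )
    return [r for r in records if pattern.search(r[2])]


stage1 = code_screen(RECORDS, CANDIDATE_CODES)
stage2 = note_screen(stage1, SEARCH_TERMS)

print("After ICD-9 screen:", [r[0] for r in stage1])
print("After note search: ", [r[0] for r in stage2])
```

In this toy data, the code screen alone retains the isolated hypospadias case (P1), which mirrors the over-inclusion the investigators observed with code 752.61; the note search narrows the list further. Records flagged this way would still need review by research staff, since a keyword match does not establish eligibility.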