Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence


Computer Sciences

Publication Details

Studies in Health Technology and Informatics


Introduction: We describe an analysis that modulates the likelihood of a particular condition occurring in an individual, initially derived from the simple population prevalence, by matching the individual with other individuals who have similar clinical histories and determining the prevalence of the condition within the matched group.

Methods: We took clinical event codes and dates from anonymised longitudinal primary care records for 25,979 patients with 749,053 recorded clinical events. Using a nearest neighbour approach, we adjusted each patient's likelihood of a condition occurring from the population prevalence to the prevalence of the condition among those patients with the closest matching clinical histories.
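The adjustment described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the similarity measure or the number of neighbours, so the Jaccard similarity over sets of event codes, the parameter k, the function names, and the toy patients and codes below are all assumptions made for illustration. Event dates, which the study also used, are ignored here, and each history is assumed to exclude the code for the condition being predicted.

```python
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two sets of clinical event codes (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def population_prevalence(has_condition: Dict[str, bool]) -> float:
    """Fraction of all patients who have the condition."""
    return sum(has_condition.values()) / len(has_condition)


def adjusted_likelihood(
    patient_id: str,
    histories: Dict[str, Set[str]],
    has_condition: Dict[str, bool],
    k: int = 3,  # hypothetical neighbourhood size
) -> float:
    """Prevalence of the condition among the k most similar other patients."""
    target = histories[patient_id]
    others = [pid for pid in histories if pid != patient_id]
    if not others:
        raise ValueError("need at least one other patient to match against")
    # Rank by similarity, most similar first; ties are broken by patient id
    # so that the result is deterministic.
    ranked = sorted(others, key=lambda pid: (-jaccard(target, histories[pid]), pid))
    neighbours = ranked[:k]
    return sum(has_condition[pid] for pid in neighbours) / len(neighbours)


# Toy data: hypothetical patients and event codes, not taken from the study.
histories = {
    "p1": {"A", "B", "C"},
    "p2": {"A", "B", "D"},
    "p3": {"A", "B", "C", "E"},
    "p4": {"X", "Y"},
    "p5": {"X", "Y", "Z"},
    "p6": {"X", "Z"},
}
has_condition = {
    "p1": False, "p2": True, "p3": True,
    "p4": False, "p5": False, "p6": False,
}

baseline = population_prevalence(has_condition)
p1_adjusted = adjusted_likelihood("p1", histories, has_condition, k=2)
p4_adjusted = adjusted_likelihood("p4", histories, has_condition, k=2)
```

In this toy data the population prevalence is one in three. Patient p1's two closest matches (p3 and p2) both have the condition, so the adjusted likelihood rises above the baseline, while p4's closest matches (p5 and p6) do not, so it falls below it.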

Results: For the conditions investigated, the nearest neighbour method performed well in comparison with standard logistic regression.

Conclusions: The results indicate that it may be possible to use clinical histories to identify 'similar' patients and thus to modulate the future likelihood of a condition occurring.