Help & Documentation

Phenodigm scores

Phenodigm scores quantify how closely the phenotypes of a knockout mouse resemble those of a human disease. The algorithm is described in detail in the associated publication.

In brief, the algorithm proceeds in two stages. The first stage compares individual disease phenotypes to individual mouse phenotypes using a cross-ontology measure over Human Phenotype Ontology (HP) and Mammalian Phenotype Ontology (MP) terms. The outcome of each comparison depends on the semantic similarity of the two terms, as well as on how prevalent they are. Thus, pairs of phenotypes that are biologically similar and relatively rare (e.g. human ‘Ataxia’ and mouse ‘ataxia’) obtain a high score, while pairs that refer to different biological entities and are quite vague (e.g. human ‘Abnormality of the cardiovascular system’ and mouse ‘adipose tissue phenotype’) obtain a very low score. Scores from this stage are non-negative and have no fixed upper bound, i.e. they fall in the range [0, ∞), although typical values lie between 0 and 6.
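As a rough illustration of how similarity and rarity can combine into one pairwise score, the sketch below multiplies a similarity value by an information-content term and takes the square root. This specific formula, the function name `pairwise_score`, and both parameters are assumptions made for illustration only; the publication remains the authoritative description of the measure.

```python
import math


def pairwise_score(similarity: float, frequency: float) -> float:
    """Illustrative score for one (HP term, MP term) pair.

    similarity -- hypothetical semantic similarity of the two terms, in [0, 1]
    frequency  -- hypothetical fraction of annotations that fall under the
                  terms' shared ancestor, in (0, 1]; rare terms have low values
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be in [0, 1]")
    if not 0.0 < frequency <= 1.0:
        raise ValueError("frequency must be in (0, 1]")
    # Rarer shared ancestors carry more information (assumed: log base 2).
    information_content = math.log2(1.0 / frequency)
    # Geometric mean of similarity and information content (assumed form).
    return math.sqrt(similarity * information_content)


# Similar and rare terms score higher than dissimilar, vague ones.
specific = pairwise_score(similarity=0.9, frequency=0.001)
vague = pairwise_score(similarity=0.05, frequency=0.5)
print(specific > vague)
```

Because the information-content term grows without bound as `frequency` approaches zero, a score of this shape has no fixed maximum, which is consistent with the open-ended range described above.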

When comparing mouse and disease profiles, each disease phenotype is matched with a corresponding phenotype in the mouse. The average and maximum of the resulting pairwise scores are displayed in the summary tables.
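The matching step can be sketched as follows, assuming that the "corresponding" mouse phenotype is simply the highest-scoring one for each disease phenotype. That assumption, the function name `summarize_matches`, and the example scores are all hypothetical.

```python
from statistics import mean


def summarize_matches(scores: dict[str, dict[str, float]]) -> tuple[float, float]:
    """Return (average, maximum) over the best mouse match per disease phenotype.

    scores maps each disease phenotype to its pairwise scores against
    every mouse phenotype in the profile.
    """
    if not scores or any(not row for row in scores.values()):
        raise ValueError("every disease phenotype needs at least one score")
    # Assumed matching rule: keep the best-scoring mouse phenotype per row.
    best = [max(row.values()) for row in scores.values()]
    return mean(best), max(best)


# Made-up pairwise scores for a two-phenotype disease profile.
example = {
    "Ataxia": {"ataxia": 3.0, "adipose tissue phenotype": 0.2},
    "Seizure": {"ataxia": 1.0, "adipose tissue phenotype": 0.1},
}
average_score, maximum_score = summarize_matches(example)
print(average_score, maximum_score)
```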

The second stage of the calculation aggregates pairwise scores into a single value. Conceptually, this is a combination of the average and maximum scores. However, there is an additional normalization step, which is described in detail in the publication. The overall outcome is the Phenodigm score, which, by definition, falls in the range [0, 100].
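To make the aggregation concrete, here is a minimal sketch that rescales the average and maximum scores against reference values and then averages the two percentages. The equal weighting, the clamping, and the reference parameters `best_avg` and `best_max` are assumptions standing in for the normalization step; the exact procedure is the one given in the publication.

```python
def phenodigm_like_score(avg: float, maximum: float,
                         best_avg: float, best_max: float) -> float:
    """Combine average and maximum pairwise scores into a 0-100 value.

    best_avg, best_max -- hypothetical reference values representing the
    highest average and maximum scores attainable for the disease profile.
    """
    if best_avg <= 0 or best_max <= 0:
        raise ValueError("reference values must be positive")
    # Express each score as a fraction of its reference, clamped to [0, 1].
    avg_fraction = min(max(avg / best_avg, 0.0), 1.0)
    max_fraction = min(max(maximum / best_max, 0.0), 1.0)
    # Assumed combination: equal-weight mean, scaled to a percentage.
    return 100.0 * (avg_fraction + max_fraction) / 2.0


print(phenodigm_like_score(avg=2.0, maximum=3.0, best_avg=4.0, best_max=6.0))
```

Clamping both fractions keeps the result within [0, 100] even when an input exceeds its reference value.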

By default, all tables are sorted by the Phenodigm score.
