Innovative research on the human genome paves way for new drugs


Published 15th May 2018

Leading international protein researchers have mapped two large unexplored parts of the human genome. Their work paves the way for the development of new drugs. The study builds on extensive data analysis conducted using supercomputers, a technique called data mining, and has examined huge amounts of literature within the health and medical sciences and other evidence sources in order to identify the most and least studied proteins for drug targets. The study is the first to provide a solid, comprehensive and useful picture of all the proteins that can be used to develop new drugs.

The researchers included 20,000 proteins in the study and are now able to conclude that 8,000 of these more or less have not been mapped and studied by researchers or pharmaceutical companies. This paves the way for new drug research with great untapped potential. The research was published in the journal Nature Reviews Drug Discovery.

‘We have used highly advanced computer analysis of data to shed light on the parts of the human genome that are rarely researched. We can see that they hold great potential, and we hope the analysis can motivate drug researchers to do some pioneer work. This may prove significant to future drug innovation’, says Professor Søren Brunak from the Novo Nordisk Foundation Center for Protein Research.

Many diseases are caused by dysfunctional proteins that have been damaged by genetic flaws. The vast majority of drugs try to prevent these proteins from being active and thus to reduce their impact on the disease in question.

It is therefore vital to drug development to be able to study and identify the proteins that are instrumental in diseases. Proteins with great potential are often referred to as drug targets and may after extensive clinical trials be approved for use as drugs.

The development of new drugs focusses on only 60 percent of the potential drug targets

This new study shows that 40 percent of all potential drug proteins have not been subjected to thorough and prioritised study. The researchers have therefore divided the 20,000 proteins into four categories and ranked their potential as future drugs.

According to the analysis, the mapping also paves the way for new so-called re-positioning opportunities, where already approved drugs can be tested on new factors. This means that proteins in drugs only approved for one therapy area can now be tested for treatment of other diseases.

The combination of categorisation and rankings works almost like a treasure map for drugs, and therefore the project has also received funding for stage two.

‘In stage two of the project we aim to improve our tools for studying the biological functions of drug targets based both on scientific texts and on large experiments’, says Professor Lars Juhl Jensen, who is responsible for the sequencing of millions of articles using advanced data mining techniques.

Since the 1990s researchers affiliated with the Human Genome Project have tried to map the human genome, and in 2014 the National Institutes of Health Common Fund took steps towards mapping the genes in the human genome that code for proteins through the project Illuminating the Druggable Genome.

The researchers initially believed that more than 100,000 genes were able to code for proteins, but the mapping showed that there are only around 20,000. The drugs available today relate to fewer than 1,000 drug targets. According to Søren Brunak, the potential of drug design based on these proteins is almost exhausted, and exploring new territory is therefore important.

The objective of sequencing the human genome is typically to determine which genes are related to a given disease. Here the researchers look at specific gene patterns in families and entire population groups to determine what causes certain diseases.


The IMPC has been directly involved in this research, with production centres actively prioritising knockout mouse strains for druggable genes. Through this collaboration, a total of 568 new knockout strains have been produced. Of these strains, 80% have phenotype data available. This study discusses some of the applications of these strains, but also highlights that in the coming years with further production of relevant IMPC strains, along with phenotype data, further insight will be possible into drug target genes.

Terry Meehan, a member of IMPC and an author on this study, said: “This is an exciting example of NIH initiatives working together to provide new insights to benefit medicine. The IMPC prioritised making knockout strains for druggable genes that the IDG identified as being poorly characterised. By seeing what phenotypes result when a gene is turned off, we can help researchers predict the beneficial and harmful effects that might result from a drug that neutralises the product of the same gene.”

Most of this post is re-published under Creative Commons from the Faculty of Health and Medical Sciences, University of Copenhagen:

