What’s an orphan?

An orphan enzyme is an enzyme activity that has been experimentally characterized but for which we lack amino-acid or nucleotide sequence data.

When we say that an enzyme activity has been “experimentally characterized,” we mean that, at a minimum, its reactants and products are known. In most cases, researchers have also determined a wealth of other data, such as molecular weights, isoelectric points, and reaction kinetics.

Orphan enzymes lack amino-acid or nucleotide sequence data in any of the major sequence resources. Sometimes we can find sequence data buried in old papers or patents. This has been true for about 25% of the orphans we’ve looked at so far. However, until those sequences are recovered, these orphans are just as lost to modern biology as the enzymes that have never been sequenced (the other 75%).
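The definition above can be written down as a small check. This is only a minimal sketch: the `EnzymeActivity` record, its field names, and the status labels are hypothetical names chosen for illustration, not part of any real database schema or of the project's own tooling.

```python
from dataclasses import dataclass


@dataclass
class EnzymeActivity:
    """Hypothetical record for one enzyme activity (illustration only)."""
    name: str
    reactants_and_products_known: bool    # minimum bar for "experimentally characterized"
    sequence_in_major_resources: bool     # sequence present in a major sequence resource
    sequence_in_literature: bool = False  # sequence buried in an old paper or patent


def is_orphan(activity: EnzymeActivity) -> bool:
    """An orphan is characterized but has no sequence in the major resources."""
    return (activity.reactants_and_products_known
            and not activity.sequence_in_major_resources)


def orphan_status(activity: EnzymeActivity) -> str:
    """Separate orphans whose sequence might be recovered from those never sequenced."""
    if not is_orphan(activity):
        return "not an orphan"
    if activity.sequence_in_literature:
        return "orphan (sequence recoverable from literature)"
    return "orphan (never sequenced)"


# Made-up examples, one per case.
examples = [
    EnzymeActivity("example A", True, True),
    EnzymeActivity("example B", True, False, sequence_in_literature=True),
    EnzymeActivity("example C", True, False),
]
statuses = [orphan_status(e) for e in examples]
```

Note that "example B" still counts as an orphan: a sequence printed in an old paper does not help annotation until it has been recovered and deposited.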

Why do orphan enzymes matter?

We live in an exciting era where whole genomes can be fully sequenced within days after an organism or pathogen is isolated. However, the lab research needed to assign functions to the genes in these sequences cannot keep pace with this sequencing output. As a result, a significant fraction of the genes in many new genomes have no function assigned to them.

We try to make up for this problem by guessing what the genes in a newly sequenced genome do, based on all the proteins and genes for which we already have sequences. Because these guesses depend on having a known sequence to compare against, each orphan enzyme is a knowledge gap, a place where we cannot use sequence to predict the function of genes. The quickest way to improve our genome annotations is to fill in these gaps in our knowledge by finding sequences for orphan enzymes.

When orphan enzymes are left unresolved, they lead to incomplete annotations and wasteful repetition of work in the lab. After all, each orphan enzyme already represents several hundred thousand dollars in research (in 2014 dollars).


How many orphan enzymes are there?

We don’t know the full scale of the orphan enzyme problem – but it’s pretty big.

When the Orphan Enzymes Project (OEP) started in 2009, we first looked at “just” the 4,400 enzymes with assigned Enzyme Commission (EC) numbers. Of those, 1,122 were orphan enzymes!
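As a quick check on those two figures, the share of orphans among EC-classified enzymes can be worked out directly. The variable names below are ours, and 4,400 is taken as given in the text, so treat the result as approximate.

```python
ec_enzymes = 4400  # enzymes with assigned EC numbers, as stated in the text
orphans = 1122     # of those, the number found to be orphan enzymes

orphan_fraction = orphans / ec_enzymes
print(f"{orphan_fraction:.1%} of EC-classified enzymes were orphans")  # 25.5%
```

That is roughly one enzyme in four.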

However, there are many enzymes that have not yet been classified into the EC system. It is likely that, just as within the EC system, about a quarter (or more!) of these unclassified enzymes are also orphans.