-
Genome sequencing projects across a wide range of organisms have generated a
large amount of gene and protein sequence information. The focus is now on the
identification and functional characterization of the proteins encoded by these
genomes.
-
The Enzyme Commission number (EC number) is a numerical classification scheme
for enzymes, based on the chemical reactions they catalyze.
-
EC numbers represent enzymes and enzyme genes (genomic information), but they
are also utilized as identifiers of enzymatic reactions (chemical information).
-
The scheme organizes enzyme reactions hierarchically into six main classes
(oxidoreductases, transferases, hydrolases, lyases, isomerases, and ligases),
which are then further subdivided at three hierarchical levels.
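To make the hierarchy concrete, the following is a minimal sketch of how a four-field EC number could be split into its levels. The function and type names (`parse_ec`, `ECNumber`, `main_class_name`) are hypothetical and not part of ECpred. The field names follow the usual nomenclature (class, subclass, sub-subclass, serial number), and the class table covers only the six main classes listed above.

```python
from typing import NamedTuple


class ECNumber(NamedTuple):
    """One field per hierarchical level of an EC number."""
    enzyme_class: int   # main class, 1 to 6 in the scheme described here
    subclass: int
    sub_subclass: int
    serial: int


# The six main classes named in the text, keyed by the first EC field.
MAIN_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
}


def parse_ec(ec: str) -> ECNumber:
    """Parse a complete EC number such as 'EC 3.4.21.4' or '3.4.21.4'."""
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = text.split(".")
    # Require exactly four plain-digit fields (no partial numbers like 3.4.-.-).
    if len(parts) != 4 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError(f"not a complete four-field EC number: {ec!r}")
    number = ECNumber(*(int(p) for p in parts))
    if number.enzyme_class not in MAIN_CLASSES:
        raise ValueError(f"unknown main class in {ec!r}")
    return number


def main_class_name(ec: str) -> str:
    """Return the name of the main class an EC number belongs to."""
    return MAIN_CLASSES[parse_ec(ec).enzyme_class]
```

For example, `parse_ec("EC 3.4.21.4")` yields the fields 3, 4, 21 and 4, and `main_class_name("1.1.1.1")` maps the first field to the oxidoreductases.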
-
As a result of recent structural genomics initiatives, a large and growing
number of enzymes have no functional annotation, while experimental functional
characterization is time-consuming and expensive.
-
High-precision EC number assignment is essential for studies such as metabolic
pathway reconstruction, the analysis of evolutionary relationships in pathways,
and metabolite prediction.
-
Hence, there is a pressing need for improved computational techniques for the
precise prediction and assignment of EC numbers.
-
Here, we present ECpred, a predictive model that assigns an EC number to an
enzyme of unidentified function using two supervised machine learning
approaches: k-Nearest Neighbor (k-NN) and Probabilistic Neural Network (PNN).
The final prediction is based on a consensus of the predictions made by the
selected algorithms, and a probability is assigned to it.
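The consensus idea can be illustrated with a small, self-contained sketch. It is not the ECpred implementation. The feature vectors, the values of `k` and `sigma`, and the combination rule (averaging the per-class scores of the two models, often called soft voting) are all assumptions made for illustration, since the text does not specify how the consensus or its probability is computed. The PNN is written in its simplest form, as a Gaussian Parzen-window estimate per class.

```python
import math
from collections import Counter


def _sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_scores(train, query, k=3):
    """Fraction of the k nearest training examples carrying each label."""
    nearest = sorted(train, key=lambda item: _sq_dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return {label: n / len(nearest) for label, n in votes.items()}


def pnn_scores(train, query, sigma=0.5):
    """Normalized mean Gaussian kernel activation per class (simple PNN)."""
    sums, counts = Counter(), Counter()
    for features, label in train:
        sums[label] += math.exp(-_sq_dist(features, query) / (2 * sigma ** 2))
        counts[label] += 1
    density = {label: sums[label] / counts[label] for label in counts}
    total = sum(density.values())
    if total == 0.0:  # every kernel underflowed: no class is favoured
        return {label: 1.0 / len(density) for label in density}
    return {label: d / total for label, d in density.items()}


def consensus_predict(train, query, k=3, sigma=0.5):
    """Average both models' scores (assumed rule); return (label, score)."""
    if not train:
        raise ValueError("training set is empty")
    knn = knn_scores(train, query, k)
    pnn = pnn_scores(train, query, sigma)
    combined = {label: (knn.get(label, 0.0) + pnn[label]) / 2 for label in pnn}
    best = max(combined, key=combined.get)
    return best, combined[best]


# Toy 2-D feature vectors labelled with EC numbers (illustrative data only).
TOY_TRAIN = [
    ((0.0, 0.0), "1.1.1.1"), ((0.1, 0.2), "1.1.1.1"), ((0.2, 0.1), "1.1.1.1"),
    ((1.0, 1.0), "3.4.21.4"), ((0.9, 1.1), "3.4.21.4"), ((1.1, 0.9), "3.4.21.4"),
]
```

Calling `consensus_predict(TOY_TRAIN, (0.05, 0.1))` returns the label of the nearby cluster together with a score between 0 and 1. Because both score dictionaries sum to one, the averaged score can be read as a rough probability, although it is not a calibrated one.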
-
ECpred classifies an enzyme into one of 3349 EC numbers.