KEGG


KEGG (Kyoto Encyclopedia of Genes and Genomes) is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances. KEGG is utilized for bioinformatics research and education, including data analysis in genomics, metagenomics, metabolomics and other omics studies, modeling and simulation in systems biology, and translational research in drug development.

The KEGG database project was initiated in 1995 by Minoru Kanehisa, professor at the Institute for Chemical Research, Kyoto University, under the then ongoing Japanese Human Genome Program. Foreseeing the need for a computerized resource that can be used for biological interpretation of genome sequence data, he started developing the KEGG PATHWAY database. It is a collection of manually drawn KEGG pathway maps representing experimental knowledge on metabolism and various other functions of the cell and the organism. Each pathway map contains a network of molecular interactions and reactions and is designed to link genes in the genome to gene products (mostly proteins) in the pathway. This has enabled the analysis called KEGG pathway mapping, whereby the gene content in the genome is compared with the KEGG PATHWAY database to examine which pathways and associated functions are likely to be encoded in the genome.

According to the developers, KEGG is a "computer representation" of the biological system. It integrates building blocks and wiring diagrams of the system—more specifically, genetic building blocks of genes and proteins, chemical building blocks of small molecules and reactions, and wiring diagrams of molecular interaction and reaction networks. This concept is realized in the following databases of KEGG, which are categorized into systems, genomic, chemical, and health information.

Databases


The KEGG PATHWAY database, the wiring diagram database, is the core of the KEGG resource. It is a collection of pathway maps integrating many entities including genes, proteins, RNAs, chemical compounds, glycans, and chemical reactions, as well as disease genes and drug targets, which are stored as individual entries in the other databases of KEGG. The pathway maps are classified into the following sections: metabolism, genetic information processing, environmental information processing, cellular processes, organismal systems, human diseases, and drug development.

The metabolism section contains aesthetically drawn global maps showing an overall picture of metabolism, in addition to regular metabolic pathway maps. The low-resolution global maps can be used, for example, to compare metabolic capacities of different organisms in genomics studies and different environmental samples in metagenomics studies. In contrast, KEGG modules in the KEGG MODULE database are higher-resolution, localized wiring diagrams, representing tighter functional units within a pathway map, such as subpathways conserved among specific organism groups and molecular complexes. KEGG modules are defined as characteristic gene sets that can be linked to particular metabolic capacities and other phenotypic features, so that they can be used for automatic interpretation of genome and metagenome data.
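The idea of using modules as gene sets for automatic interpretation can be illustrated with a minimal Python sketch. It assumes a simplified representation in which a module is a list of steps and each step is a set of alternative gene-family identifiers; this layout, the module names, and the identifiers are hypothetical and are not KEGG's actual module definition format.

```python
from typing import Dict, List, Set

# Hypothetical module definitions (not real KEGG entries): each module is a
# list of steps, and each step is a set of alternative gene-family identifiers.
MODULES: Dict[str, List[Set[str]]] = {
    "module_A": [{"g1"}, {"g2", "g3"}, {"g4"}],
    "module_B": [{"g5"}, {"g6"}],
}


def module_completeness(genome: Set[str], steps: List[Set[str]]) -> float:
    """Return the fraction of steps for which the genome has at least one alternative."""
    satisfied = sum(1 for alternatives in steps if alternatives & genome)
    return satisfied / len(steps)


def interpret(genome: Set[str]) -> Dict[str, float]:
    """Score every module against the gene families found in one genome."""
    return {
        name: module_completeness(genome, steps)
        for name, steps in MODULES.items()
    }


# Hypothetical genome (or metagenome sample) described by its gene families.
sample_genome = {"g1", "g3", "g4", "g5"}
report = interpret(sample_genome)
print(report)
```

In this toy input, every step of `module_A` is covered, while `module_B` lacks one of its two steps, so only the first module would suggest the associated capacity is fully encoded.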

Another database that supplements KEGG PATHWAY is the KEGG BRITE database. It is an ontology database containing hierarchical classifications of various entities including genes, proteins, organisms, diseases, drugs, and chemical compounds. While KEGG PATHWAY is limited to molecular interactions and reactions of these entities, KEGG BRITE incorporates many different types of relationships.

Several months after the KEGG project was initiated in 1995, the first report of a completely sequenced bacterial genome was published. Since then, all published complete genomes have been accumulated in KEGG for both eukaryotes and prokaryotes. The KEGG GENES database contains gene/protein-level information and the KEGG GENOME database contains organism-level information for these genomes. The KEGG GENES database consists of gene sets for the complete genomes, and genes in each set are given annotations in the form of establishing correspondences to the wiring diagrams of KEGG pathway maps, KEGG modules, and BRITE hierarchies.

These correspondences are made using the concept of orthologs. The KEGG pathway maps are drawn based on experimental evidence in specific organisms, but they are designed to be applicable to other organisms as well, because different organisms, such as human and mouse, often share identical pathways consisting of functionally identical genes, called orthologous genes or orthologs. All the genes in the KEGG GENES database are grouped into such orthologs in the KEGG ORTHOLOGY (KO) database. Because the nodes (gene products) of KEGG pathway maps, as well as KEGG modules and BRITE hierarchies, are given KO identifiers, the correspondences are established once genes in the genome are annotated with KO identifiers by the genome annotation procedure in KEGG.
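The way ortholog identifiers connect genome genes to pathway map nodes can be sketched as a two-step lookup. The following Python example is only an illustration under stated assumptions: the gene names, ortholog identifiers, and map names are hypothetical placeholders, and the dictionaries stand in for annotation results and map contents that would come from KEGG itself.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical annotation result: genome gene -> ortholog group identifier.
GENE_TO_KO: Dict[str, str] = {
    "geneA": "KO_1",
    "geneB": "KO_2",
    "geneC": "KO_4",
    "geneD": "KO_9",  # annotated, but not a node on any map below
}

# Hypothetical pathway maps: map name -> ortholog identifiers on its nodes.
PATHWAY_NODES: Dict[str, Set[str]] = {
    "map_X": {"KO_1", "KO_2", "KO_3"},
    "map_Y": {"KO_4", "KO_5"},
    "map_Z": {"KO_6"},
}


def map_genome(
    gene_to_ko: Dict[str, str], pathway_nodes: Dict[str, Set[str]]
) -> Dict[str, Dict[str, List[str]]]:
    """For each map, list the nodes matched by the genome and the genes behind them."""
    # Invert the annotation so each ortholog group lists its genome genes.
    genes_by_ko = defaultdict(set)
    for gene, ko in gene_to_ko.items():
        genes_by_ko[ko].add(gene)

    hits: Dict[str, Dict[str, List[str]]] = {}
    for pathway, nodes in pathway_nodes.items():
        matched = {
            ko: sorted(genes_by_ko[ko])
            for ko in sorted(nodes)
            if ko in genes_by_ko
        }
        if matched:  # keep only maps with at least one matched node
            hits[pathway] = matched
    return hits


mapping = map_genome(GENE_TO_KO, PATHWAY_NODES)
print(mapping)
```

Because the maps are keyed by ortholog identifiers rather than organism-specific gene names, the same `PATHWAY_NODES` table could be reused for any annotated genome in this sketch.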

The KEGG metabolic pathway maps are drawn to represent the dual aspects of the metabolic network: the genomic network of how genome-encoded enzymes are connected to catalyze consecutive reactions and the chemical network of how chemical structures of substrates and products are transformed by these reactions. A set of enzyme genes in the genome will identify enzyme relation networks when superimposed on the KEGG pathway maps, which in turn characterize chemical structure transformation networks, allowing interpretation of biosynthetic and biodegradation potentials of the organism. Alternatively, a set of metabolites identified in the metabolome will lead to the understanding of enzymatic pathways and enzyme genes involved.
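The dual view described above can be made concrete with a small Python sketch. It assumes a heavily simplified model in which each reaction has one enzyme, one substrate, and one product and runs in a single direction; the reaction, enzyme, and compound identifiers are hypothetical.

```python
from collections import deque
from typing import Dict, Set

# Hypothetical reactions: each has one enzyme, one substrate, and one product.
REACTIONS: Dict[str, Dict[str, str]] = {
    "R1": {"enzyme": "E1", "substrate": "C1", "product": "C2"},
    "R2": {"enzyme": "E2", "substrate": "C2", "product": "C3"},
    "R3": {"enzyme": "E3", "substrate": "C3", "product": "C4"},
}


def chemical_network(enzymes: Set[str]) -> Dict[str, Set[str]]:
    """Genomic to chemical view: keep substrate -> product edges the enzymes can catalyze."""
    edges: Dict[str, Set[str]] = {}
    for reaction in REACTIONS.values():
        if reaction["enzyme"] in enzymes:
            edges.setdefault(reaction["substrate"], set()).add(reaction["product"])
    return edges


def reachable(start: str, enzymes: Set[str]) -> Set[str]:
    """Compounds that can be produced from `start` using only the given enzymes."""
    edges = chemical_network(enzymes)
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search over the compound graph
        compound = queue.popleft()
        for nxt in edges.get(compound, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


def enzymes_for(metabolites: Set[str]) -> Set[str]:
    """Chemical to genomic view: enzymes whose reactions involve any given metabolite."""
    return {
        reaction["enzyme"]
        for reaction in REACTIONS.values()
        if reaction["substrate"] in metabolites or reaction["product"] in metabolites
    }


print(reachable("C1", {"E1", "E2"}))
print(enzymes_for({"C3"}))
```

With enzymes `E1` and `E3` only, the chain breaks at the missing `E2` step, so `C4` is not reachable from `C1` even though the enzyme for the last reaction is present.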

The databases in the chemical information category, which are collectively called KEGG LIGAND, are organized by capturing knowledge of the chemical network. In the beginning of the KEGG project, KEGG LIGAND consisted of three databases: KEGG COMPOUND for chemical compounds, KEGG REACTION for chemical reactions, and KEGG ENZYME for reactions in the enzyme nomenclature. Currently, there are additional databases: KEGG GLYCAN for glycans and two auxiliary reaction databases called RPAIR (reactant pair alignments) and RCLASS (reaction class). KEGG COMPOUND has also been expanded to contain various compounds such as xenobiotics, in addition to metabolites.

In KEGG, diseases are viewed as perturbed states of the biological system caused by perturbants of genetic factors and environmental factors, and drugs are viewed as different types of perturbants. The KEGG PATHWAY database includes not only the normal states but also the perturbed states of the biological systems. However, disease pathway maps cannot be drawn for most diseases because molecular mechanisms are not well understood. An alternative approach is taken in the KEGG DISEASE database, which simply catalogs known genetic factors and environmental factors of diseases. These catalogs may eventually lead to more complete wiring diagrams of diseases.

The KEGG DRUG database contains active ingredients of approved drugs in Japan, the US, and Europe. They are distinguished by chemical structures and/or chemical components and associated with target molecules, metabolizing enzymes, and other molecular interaction network information in the KEGG pathway maps and the BRITE hierarchies. This enables an integrated analysis of drug interactions with genomic information. Crude drugs and other health-related substances, which are outside the category of approved drugs, are stored in the KEGG ENVIRON database. The databases in the health information category are collectively called KEGG MEDICUS, which also includes package inserts of all marketed drugs in Japan.