About CORUM
Protein complexes are key molecular entities that integrate multiple gene products to perform
cellular functions. The CORUM database is a collection of experimentally verified mammalian protein
complexes.
COMPLEX NAME
We use the complex names given in the literature, including synonyms. An example is the
eukaryotic chaperonin CCT (chaperonin containing TCP-1), which is also well known as TRiC (TCP-1
ring complex). If no name is found for a protein complex, we define one, usually composed
of the gene names of its subunits, e.g. ‘BRCA1-RAD51 complex’ or ‘Ubiquitin E3 ligase (FBXW7, CUL1,
SKP1A, RBX1)’.
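The two naming patterns in the examples above can be sketched in a few lines of Python. This is only an illustration: the function name `fallback_complex_name` and its parameters are hypothetical, and the sketch mirrors the quoted examples rather than the curators' actual procedure.

```python
from typing import Optional, Sequence


def fallback_complex_name(gene_names: Sequence[str],
                          description: Optional[str] = None) -> str:
    """Build a placeholder name for a complex without a published name.

    Hypothetical helper that mirrors the two example patterns in the text.
    """
    if not gene_names:
        raise ValueError("at least one gene name is required")
    if description:
        # Pattern: functional description followed by the subunit genes.
        return f"{description} ({', '.join(gene_names)})"
    # Pattern: gene names joined by hyphens, followed by the word 'complex'.
    return "-".join(gene_names) + " complex"


print(fallback_complex_name(["BRCA1", "RAD51"]))
print(fallback_complex_name(["FBXW7", "CUL1", "SKP1A", "RBX1"],
                            "Ubiquitin E3 ligase"))
```

The two calls reproduce the example names quoted above.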
ORGANISM
The majority of protein complexes in CORUM originate from human (65%), followed by mouse (14%)
and rat (14%).
SUBUNITS
The subunits of protein complexes are annotated according to the respective UniProt entries.
In CORUM only the primary accessions are stored as identifiers. Associated information like gene
names and protein names is retrieved via the BioRS sequence retrieval system, providing up-to-date
information from the primary data sources.
Frequently, the molecular characterization of the complex composition is limited to the
identification of the subunits. For cases where the stoichiometry of the subunits has been
analysed, the information is given in the ‘Number of subunits’ field (see e.g. complex 960).
For species such as rat, pig or sheep, some proteins are not found in UniProt. In such cases,
orthologs from related organisms are used and the substitutions are noted in the comment
field.
In some articles, the description of certain protein complex subunits is ambiguous. This can
happen if only one variant of the protein was known at the time of the experiments, or if several
very similar proteins exist and the authors did not determine which isoform or variant was part of
the complex. In such cases we collect all possible protein entries and mark them in the status
field with ‘nd’, which stands for ‘not determined’. If variants exist for more than one subunit of a
protein complex, the individual variants are differentiated by nd1, nd2, nd3, etc.
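One way to work with the ‘nd’ status field is to separate determined subunits from groups of alternative candidates. The Python sketch below assumes that each label (nd, nd1, nd2, ...) marks the candidate entries for one ambiguous subunit. The `Subunit` record, the `split_by_status` function and the accessions are hypothetical placeholders, not the CORUM data model or real UniProt entries.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional, Tuple


class Subunit(NamedTuple):
    accession: str         # UniProt primary accession (placeholders below)
    status: Optional[str]  # None if determined, else 'nd', 'nd1', 'nd2', ...


def split_by_status(
    subunits: List[Subunit],
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Return (determined accessions, candidate accessions per nd label)."""
    determined: List[str] = []
    candidates: Dict[str, List[str]] = defaultdict(list)
    for subunit in subunits:
        if subunit.status is None:
            determined.append(subunit.accession)
        elif subunit.status.startswith("nd"):
            # Assumption: entries sharing a label are alternatives
            # for the same ambiguous subunit.
            candidates[subunit.status].append(subunit.accession)
        else:
            raise ValueError(f"unexpected status: {subunit.status!r}")
    return determined, dict(candidates)


# Placeholder accessions, for illustration only.
example = [
    Subunit("ACC_A", None),
    Subunit("ACC_B1", "nd1"),
    Subunit("ACC_B2", "nd1"),
    Subunit("ACC_C1", "nd2"),
    Subunit("ACC_C2", "nd2"),
]
determined, candidates = split_by_status(example)
print(determined)
print(candidates)
```

Grouping by label keeps the ambiguity explicit: a consumer can list every candidate or pick one per group, depending on the analysis.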
Homologous proteins from mouse are also provided for the complex subunits. These are retrieved
from our MfunGD database (http://mips.gsf.de/genre/proj/mfungd/). CORUM and MfunGD are
cross-linked to each other.
PURIFICATION METHOD
The experimental method which was used to purify the protein complex is annotated according to
the PSI-MI standard. The PSI consortium provides a list of methods
(http://www.psidev.info/).
FUNCTIONAL CHARACTERIZATION
We use the Functional Catalogue (FunCat) annotation scheme to characterize protein complex
function. The hierarchical structure of FunCat allows browsing for protein complexes with
particular cellular functions or localizations. Examples of such sub-datasets are presented on the
CORUM home page. Detailed results of such queries are also available via the
Browse FunCat search tool on the search panel of the web pages.
The evidence for assigning a functional category is given in a separate field. There are five
evidence types of differing quality: (i) experimental evidence (exp), (ii)
evidence from literature such as reviews (lit), (iii) known mammalian homolog (kmh), (iv)
high-throughput experiment (htp) and (v) predicted function (pred). For all evidence types except
predicted annotation, the corresponding PubMed references are provided.
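The five evidence codes and the reference rule can be captured in a small Python sketch. The names `Evidence`, `needs_reference` and `check_annotation` are hypothetical, and the FunCat identifier and PMID in the usage lines are placeholders. This illustrates the rule described above and is not CORUM's implementation.

```python
from enum import Enum
from typing import Sequence


class Evidence(Enum):
    EXP = "exp"    # experimental evidence
    LIT = "lit"    # evidence from literature such as reviews
    KMH = "kmh"    # known mammalian homolog
    HTP = "htp"    # high-throughput experiment
    PRED = "pred"  # predicted function


def needs_reference(evidence: Evidence) -> bool:
    """All evidence types except 'pred' come with PubMed references."""
    return evidence is not Evidence.PRED


def check_annotation(funcat_id: str, evidence_code: str,
                     pmids: Sequence[int] = ()) -> Evidence:
    """Validate one functional annotation (hypothetical record layout)."""
    evidence = Evidence(evidence_code)  # ValueError for unknown codes
    if needs_reference(evidence) and not pmids:
        raise ValueError(
            f"{funcat_id}: evidence '{evidence.value}' requires a PubMed reference"
        )
    return evidence


# 'XX.YY' and 12345678 are placeholders, not real FunCat or PubMed identifiers.
print(check_annotation("XX.YY", "exp", [12345678]))
print(check_annotation("XX.YY", "pred"))
```

Using an enumeration rejects unknown codes early, so misspelled evidence values surface at load time rather than during a query.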
COMMENT
Additional information like disease relevance or more detailed information about the cellular
function of protein complexes is given in the comment field.
PMID
This field gives the PMID of the article in which the members of the complex were
characterized as constituents of the complex.
© 2003 GSF - Forschungszentrum für Umwelt und Gesundheit, GmbH, Ingolstädter Landstraße 1, D-85764 Neuherberg