The Negatome is a collection of protein and domain pairs that are unlikely to be engaged in direct physical interactions. The database currently contains experimentally supported non-interacting protein pairs derived in two distinct ways: by manual curation of the literature and by analysing protein complexes from the PDB. More stringent lists of non-interacting pairs were derived from these two datasets by excluding interactions from IntAct. The Negatome can be used to evaluate newly derived experimental interactions. It is much less biased towards functionally dissimilar proteins than negative data derived by randomly selecting proteins from different cellular locations. Thus, the Negatome is complementary to such random data for training protein interaction prediction algorithms.
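As a rough illustration of how such an evaluation might look, the sketch below flags candidate interactions that also appear in a list of non-interacting pairs. It assumes each pair is a tuple of two protein identifiers and that pairs are unordered; the function name and the identifiers are hypothetical placeholders, not part of the Negatome distribution.

```python
def flag_reported_non_interactions(candidates, negatome_pairs):
    """Return the candidate pairs that are listed as non-interacting.

    Pairs are treated as unordered, so (A, B) matches (B, A).
    """
    negative = {frozenset(pair) for pair in negatome_pairs}
    return [pair for pair in candidates if frozenset(pair) in negative]


# Hypothetical identifiers, for illustration only.
negatome_pairs = [("P00001", "P00002"), ("P00003", "P00004")]
candidates = [("P00002", "P00001"), ("P00005", "P00006")]

flagged = flag_reported_non_interactions(candidates, negatome_pairs)
conflict_rate = len(flagged) / len(candidates)
print(flagged, conflict_rate)
```

A high conflict rate would suggest that a newly derived interaction set deserves a closer look, although a single conflicting pair is not proof that the new observation is wrong.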
Negatome 2.0 was created using an advanced text mining procedure to guide the manual annotation process. Potential non-interactions were proposed by a modified version of a text mining tool called Excerbt. Compared to the first version, the contents of the database have grown by over 300%.
The supplement to the Negatome 1.0 paper can be downloaded here.
The supplement to the Negatome 2.0 paper can be downloaded here.
|Dataset||Derived from||Description||Number of Pairs|
|Manual||Manual literature annotation||Manually annotated literature data describing the lack of protein interaction. High-throughput data are not included. The data are restricted to mammalian proteins.||2171|
|Manual-stringent||Manual||The Manual dataset filtered against the IntAct dataset||1991|
|Manual-PFAM||Manual-stringent||PFAM domain pairs found in the Manual dataset filtered using iPFAM and 3did||1453|
|PDB||The PDB database||Protein pairs that are members of at least one structural complex but do not interact directly. The organism of origin is not restricted.||4397|
|PDB-stringent||PDB||The PDB dataset filtered against the IntAct dataset.||4161|
|PDB-PFAM||PDB-stringent||Non-interacting PFAM domains found in the same structural complex filtered using iPFAM and 3did||1234|
|Combined||Manual and PDB||A combined non-interacting protein dataset||6532|
|Combined-stringent||Manual-stringent and PDB-stringent||A combined stringent non-interacting protein dataset||6136|
|Combined-PFAM||Manual-PFAM and PDB-PFAM||A combined non-interacting protein domain dataset||2681|
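The relationship between the base, stringent and combined datasets in the table can be sketched as a filter followed by a merge. This is a minimal illustration, not the procedure actually used to build the Negatome: it assumes unordered pairs of identifiers, treats "filtered against IntAct" as removing every pair that also appears in a set of known interactions, and assumes the combined set is a union without duplicates. All names and identifiers are hypothetical.

```python
def filter_against_known_interactions(non_interacting, known_interactions):
    """Drop pairs that are also reported as interacting (order-independent)."""
    known = {frozenset(pair) for pair in known_interactions}
    return [pair for pair in non_interacting if frozenset(pair) not in known]


def combine(*datasets):
    """Merge datasets, keeping the first occurrence of each unordered pair."""
    seen = set()
    merged = []
    for dataset in datasets:
        for pair in dataset:
            key = frozenset(pair)
            if key not in seen:
                seen.add(key)
                merged.append(pair)
    return merged


# Hypothetical toy data standing in for the Manual, PDB and IntAct sets.
manual = [("A", "B"), ("C", "D")]
pdb = [("D", "C"), ("E", "F")]
intact = [("B", "A")]

manual_stringent = filter_against_known_interactions(manual, intact)
pdb_stringent = filter_against_known_interactions(pdb, intact)
combined_stringent = combine(manual_stringent, pdb_stringent)
print(combined_stringent)
```

Under the union assumption, a pair present in both sources is counted once, which would be consistent with the combined counts in the table being slightly below the sum of their parts.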
This work was partially funded by the Biosapiens Network of Excellence, the DFG International Research Training Group 'Regulation and Evolution of Cellular Systems' (GRK 1563), the Joint Technology Platform within the Helmholtz Alliance for Systems Biology, and the Federal Ministry of Education, Science, Research and Technology (NGFN: 01GR0451, SysMBo, FKZ: 0315494A).