Introduction

Background

Systematic genome sequencing has produced substantial amounts of sequence data that encode the structures of all biological macromolecules in a species. Traditionally, this information has been translated into sequence data representing genetic elements, i.e. the functional segments of the DNA such as coding open reading frames or exons, various kinds of RNAs, and, last but not least, promoter elements responsible for condition-dependent, controlled expression. Annotation, a complex task, associates biological information with sequence data. Much of this information can be inferred by computational methods applied systematically, as in the PEDANT sequence analysis package, but other, more specific information needs careful manual curation. Computational methods can, for example, transfer information from known sequence properties such as membrane-spanning segments, secondary structures, folds, PFAM domains, motifs and the like, whereas annotating the most important information, such as sub-cellular location, protein/protein interaction, co-regulation, and membership in pathways or cellular networks, requires expert skills.

Analysis of high-throughput experimental data needs advanced data structures that let computational methods access information beyond the individual genetic element. On the one hand, basic information as compiled by the generic public data collections must be included; on the other hand, dynamic, inferred data such as homology, fold, and protein/protein data have to be integrated. Here the conceptual challenge meets the technical one. For instance, a system should be able to cope with complex questions such as the following:
It is evident that answering complex biological questions requires more than a few simple databases and CGI scripts. What is required is a system that integrates various data sources with a flexible processing layer and a convenient publishing interface. This is the foundation for GenRE, a Genome Research Environment.
© 2003 GSF - Forschungszentrum für Umwelt und Gesundheit, GmbH, Ingolstädter Landstraße 1, D-85764 Neuherberg