The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. A typical workflow includes the isolation and preparation of material for sequencing, a run of a sequencing machine in which sequencing data are produced, and a subsequent bioinformatic analysis pipeline. ENA records this information in a data model that covers input information (sample, experimental setup, machine configuration), output machine data (sequence traces, reads and quality scores) and interpreted information (assembly, mapping, functional annotation). Data arrive at ENA from a variety of sources. These include submissions of raw data, assembled sequences and annotation from small-scale sequencing efforts; data provision from the major European sequencing centres; and routine and comprehensive exchange with our partners in the International Nucleotide Sequence Database Collaboration (INSDC). ENA is made up of a number of distinct databases, including EMBL-Bank, the newly established Sequence Read Archive (SRA) and the Trace Archive, each with its own data formats and standards. ... [Information of the supplier]
DNAtraffic aims to be a unique, comprehensive and richly annotated database of genome dynamics during the life of the cell. It contains extensive data on the nomenclature, ontology, structure and function of proteins involved in mechanisms that control DNA integrity, such as chromatin remodeling, DNA repair and damage response pathways, from eight model organisms commonly used in DNA-related studies: Homo sapiens (human), Mus musculus (mouse), Drosophila melanogaster (fruit fly), Caenorhabditis elegans (nematode), Saccharomyces cerevisiae (budding yeast), Schizosaccharomyces pombe (fission yeast), Escherichia coli K-12 and Arabidopsis thaliana (mouse-ear cress). DNAtraffic also contains comprehensive information on diseases related to the assembled human proteins. The database is richly annotated with systematic information on the nomenclature, chemistry and structure of DNA damage and of drugs targeting nucleic acids and/or proteins involved in the maintenance of genome stability. One aim of DNAtraffic is to create the first platform for analysing the combinatorial complexity of DNA metabolism pathways. The database includes illustrations of pathways, damage, proteins and drugs. Because DNAtraffic is designed to cover a broad spectrum of scientific disciplines, it has to be extensively linked to numerous external data sources. The database is the result of manual annotation work aimed at making it more useful for a wide range of systems biology applications. DNAtraffic is freely available and can be queried by the name of a DNA network process, DNA damage, protein, disease or drug. ... [Information of the supplier]
The Multi-purpose Automated Genome Project Investigation Environment (MAGPIE, http://magpie.ucalgary.ca/) is a software package for the automated curation and presentation of DNA and protein sequences. MAGPIE distills the results of multiple database searches for each sequence (e.g. expressed sequence tags, ESTs) or subsequence (e.g. open reading frames, ORFs, in bacterial genomic DNA) into a summary page that facilitates interpretation of genes' biological roles and origins. Database search tools used with MAGPIE include the standard NCBI-BLAST set of searches, plus more sensitive Smith-Waterman searches for frameshift-prone data such as ESTs. Hidden Markov Model (HMM) searches are also performed, including searches against an in-house HMM library derived from the NCBI's Clusters of Orthologous Genes (COGs) database, which covers eukaryotic and prokaryotic clusters. HMM searches often provide single, concise, and accurate data for functional assignment. ... [Information of the supplier]
MfunGD provides a resource for annotated mouse proteins and their occurrence in protein networks. Protein function annotation is performed using the Functional Catalogue (FunCat) annotation scheme, which is a hierarchically structured classification system (Ruepp et al., NAR, 2004). To provide up-to-date similarity search results and InterPro domain analyses, the protein entries are interconnected with the SIMAP database (Rattei et al., NAR, 2006). The gene models are based on the RefSeq mouse cDNAs (Pruitt et al., NAR, 2007). The work of our group is focussed on the annotation of biological systems. Therefore, results from the Mammalian Protein-Protein Interaction Database (MPPI; Pagel et al., Bioinformatics, 2005) and the Comprehensive Resource of Mammalian Protein Complexes (CORUM; Ruepp et al., NAR, 2007) are linked to the MfunGD dataset. Links to external resources are also provided (e.g. RefSeq, UniProt, UCSC Genome Browser). MfunGD is implemented in GenRE, a J2EE-based, component-oriented, multi-tier architecture (Mewes et al., NAR, 2006). ... [Information of the supplier]
The Death domain (DD) superfamily is one of the largest classes of protein interaction modules and plays a pivotal role in the apoptosis, inflammation, necrosis, and immune cell signaling pathways. Critical caspase-activating complexes in the apoptosis and inflammation signaling pathways are assembled via the DD superfamily. These domains are also involved in recruiting downstream effectors for immune cell receptor signaling, intracellular pathogen sensing, and response to DNA damage. To stimulate future research among scientists who are interested in DD superfamily-mediated signaling pathways, we have developed the Death Domain Database, a manually curated database that aims to provide comprehensive information on protein-protein interactions (PPIs) of the human DD superfamily. The current version of the Death Domain Database documents 175 PPI pairs among 99 DDS, curated from 295 peer-reviewed publications. The Death Domain Database provides a detailed summary of PPI data, which fits into 3 categories: interaction, characterization, and functional role. Users can find in-depth information specified in the literature on relevant analytical methods and structural information. The Death Domain Database has a user-friendly interface with several helpful features, including a search engine, an interaction map, and a function for cross-referencing useful external databases. The Death Domain Database is intended as a valuable tool to assist in understanding and organizing the molecular interaction network of the DD superfamily. ... [Information of the supplier]
No database other than ESTHER holds all alpha/beta hydrolase fold proteins together: InterPro, Prosite and Pfam have multiple entries for subsets of this structural superfamily. A table named Synthese shows the correspondence between these database entries and the subfamilies in ESTHER. The ESTHER Table is now a little too big to be useful. Each file contains one of the 31219 non-redundant proteins/genes. The tables grouped in the family table, the syntheses table or the structure table may be more useful. The Gene_locus nomenclature for these non-redundant entries is a name with 5 characters for the organism (3 for the genus, 2 for the species), except when a common 5-character name exists; e.g. ratno is for Rattus norvegicus and human for man. This allows us to keep close to the Swiss-Prot nomenclature. The last characters define the protein, e.g. human-acche represents human acetylcholinesterase. ... [Information of the supplier]
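The Gene_locus naming rule described in the ESTHER entry can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ESTHER's own code: the function names, the `COMMON_CODES` lookup table and the lower-casing are hypothetical, and the sketch assumes the organism code takes the first three letters of the genus and the first two of the species, as the "ratno" example suggests.

```python
# Hypothetical sketch of the Gene_locus naming rule; not ESTHER's implementation.

# Assumed lookup of organisms that have a common 5-character name.
# Only "human" is given in the entry above; any other content would be a guess.
COMMON_CODES = {("homo", "sapiens"): "human"}


def organism_code(genus: str, species: str) -> str:
    """Return the 5-character organism code: a common name if one is
    registered, otherwise 3 letters of the genus + 2 of the species."""
    key = (genus.lower(), species.lower())
    if key in COMMON_CODES:
        return COMMON_CODES[key]
    return (genus[:3] + species[:2]).lower()


def gene_locus(genus: str, species: str, protein: str) -> str:
    """Join the organism code and the protein identifier with a hyphen."""
    return f"{organism_code(genus, species)}-{protein.lower()}"


if __name__ == "__main__":
    print(organism_code("Rattus", "norvegicus"))
    print(gene_locus("Homo", "sapiens", "acche"))
```

With these assumptions, Rattus norvegicus maps to the code "ratno" and human acetylcholinesterase to "human-acche", matching the two examples given in the entry.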
SUPERFAMILY is a database of structural and functional annotation for all proteins and genomes. The SUPERFAMILY annotation is based on a collection of hidden Markov models, which represent structural protein domains at the SCOP superfamily level. A superfamily groups together domains which have an evolutionary relationship. The annotation is produced by scanning protein sequences from over 2,414 completely sequenced genomes against the hidden Markov models. For each protein you can: a) submit sequences for SCOP classification; b) view domain organisation, sequence alignments and protein sequence details. For each genome you can: a) examine superfamily assignments, phylogenetic trees, domain organisation lists and networks; b) check for over- and under-represented superfamilies within a genome. For each superfamily you can: a) inspect SCOP classification, functional annotation, Gene Ontology annotation, InterPro abstract and genome assignments; b) explore the taxonomic distribution of a superfamily across the tree of life. All annotation, models and the database dump are freely available for download to everyone. SUPERFAMILY is a member of the InterPro consortium of protein annotation databases, and has been integrated into the Ensembl eukaryotic genome project and The Arabidopsis Information Resource. To date, the SUPERFAMILY publications have been cited over 1,000 times. SUPERFAMILY has been used in structural, functional, evolutionary and phylogenetic research projects. ... [Information of the supplier]
M phase, also called cell division, is the most crucial and fundamental event of the eukaryotic cell cycle. After the chromosomes have been replicated during the S phase, the sister chromatids are separated and distributed equally and faithfully into two daughter cells. Each daughter cell also receives a roughly equal share of the necessary intracellular constituents and organelles from the mother cell. Generally, cell division consists of six stages: prophase, prometaphase, metaphase, anaphase, telophase and cytokinesis. The first five stages constitute mitosis. During mitosis, numerous proteins organize protein super-complexes at three distinct regions: the centrosome, the kinetochore/centromere and the cleavage furrow/midbody. Although many proteins have been identified as localized on the centrosome, kinetochore and/or midbody, an integrated resource in this area is still not available. In this work, we have collected all proteins identified as localized on the kinetochore, centrosome and/or midbody from two fungi (S. cerevisiae and S. pombe) and five animals (C. elegans, D. melanogaster, X. laevis, M. musculus and H. sapiens). From the related literature in PubMed, numerous proteins have been manually curated as localized on at least one of the sub-cellular localizations of kinetochore, centrosome and midbody. To ensure data quality, following the rationale of "Seeing is believing", these proteins have been unambiguously observed under a fluorescence microscope as direct supporting evidence. An integrated and searchable database, MiCroKit (Midbody, Centrosome and Kinetochore), has then been established. The MiCroKit database is the first integrative resource to pinpoint most of the identified components of the midbody, centrosome and kinetochore, together with related scientific information. Version 1.0 of the MiCroKit database was set up on Nov. 2nd, 2005, containing 1,065 unique proteins. MiCroKit version 2.0 was released on Jun. 5th, 2006, with 1,120 entries. The current MiCroKit 3.0 database was updated on July 9, 2009, and contains 1,489 unique protein entries. ... [Information of the supplier, modified]
SEVENS summarizes GPCR (G-protein coupled receptor) genes that are identified with high accuracy from 56 eukaryote genomes by a pipeline integrating software such as a gene finder, a sequence alignment tool, a motif and domain assignment tool, and a transmembrane helix predictor. It covers a larger data space than other currently available databases, including not only the expressed sequences but also newly identified sequences that cannot be detected by in vivo experiments, although they definitely exist on the genome sequence and are just waiting for the opportunity to express their functions. SEVENS provides an infrastructure of general information on the "GPCR universe" for comparative genomics. ... [Information of the supplier]