The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. A typical workflow includes the isolation and preparation of material for sequencing, a run of a sequencing machine in which sequencing data are produced and a subsequent bioinformatic analysis pipeline. ENA records this information in a data model that covers input information (sample, experimental setup, machine configuration), output machine data (sequence traces, reads and quality scores) and interpreted information (assembly, mapping, functional annotation). Data arrive at ENA from a variety of sources. These include submissions of raw data, assembled sequences and annotation from small-scale sequencing efforts, data provision from the major European sequencing centres and routine and comprehensive exchange with our partners in the International Nucleotide Sequence Database Collaboration (INSDC). ENA is made up of a number of distinct databases, including EMBL-Bank, the newly established Sequence Read Archive (SRA) and the Trace Archive, each with its own data formats and standards. ... [Information of the supplier]
The DNAtraffic database is intended to be a unique, comprehensive and richly annotated database of genome dynamics during the life of the cell. DNAtraffic contains extensive data on the nomenclature, ontology, structure and function of proteins involved in the control of DNA integrity mechanisms such as chromatin remodeling, DNA repair and damage response pathways, from eight model organisms commonly used in DNA-related studies: Homo sapiens (human), Mus musculus (mouse), Drosophila melanogaster (fruit fly), Caenorhabditis elegans (nematode), Saccharomyces cerevisiae (budding yeast), Schizosaccharomyces pombe (fission yeast), Escherichia coli K-12 and Arabidopsis thaliana (mouse-ear cress). DNAtraffic also contains comprehensive information on diseases related to the assembled human proteins. The database is richly annotated with systematic information on the nomenclature, chemistry and structure of DNA damage and of drugs targeting nucleic acids and/or proteins involved in the maintenance of genome stability. One aim of the DNAtraffic database is to create the first platform for analysing the combinatorial complexity of DNA metabolism pathways. The database includes illustrations of pathways, damage, proteins and drugs. Since DNAtraffic is designed to cover a broad spectrum of scientific disciplines, it has to be extensively linked to numerous external data sources. The database is the result of manual annotation work aimed at making DNAtraffic much more useful for a wide range of systems biology applications. The DNAtraffic database is freely available and can be queried by the name of a DNA network process, DNA damage, protein, disease or drug. ... [Information of the supplier]
The Multi-purpose Automated Genome Project Investigation Environment (MAGPIE, http://magpie.ucalgary.ca/) is a software package for the automated curation and presentation of DNA and protein sequences. MAGPIE distills the results of multiple database searches for each sequence (e.g. expressed sequence tags, ESTs) or subsequence (e.g. open reading frames, ORFs, in bacterial genomic DNA) into a summary page that facilitates interpretation of genes' biological roles and origins. Database search tools used with MAGPIE include the standard NCBI-BLAST set of searches, plus more sensitive Smith-Waterman searches for frameshift-prone data such as ESTs. Hidden Markov Model (HMM) searches are also performed, including searches against an in-house HMM library derived from the NCBI's Clusters of Orthologous Groups (COGs) database, which covers both eukaryotic and prokaryotic clusters. HMM searches often provide single, concise and accurate data for functional assignment. ... [Information of the supplier]
This page allows you to test an antibody sequence against the Kabat sequence database. Any unusual residues (occurring in < 1% of chains in the database) will be reported to you. This allows the identification of potential cloning artifacts and sequencing errors. The current Kabat database contains 6014 light chains and 7895 heavy chains. ... [Information of the supplier]
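The kind of check described above can be illustrated with a minimal Python sketch. It assumes that the reference chains and the query are already aligned to a common numbering, so that position i means the same thing in every sequence. The function names, the toy four-residue sequences and the 200-chain reference set are all hypothetical and are not taken from the Kabat service itself. Only the 1% threshold comes from the description.

```python
from collections import Counter


def position_frequencies(aligned_chains):
    """Return one Counter of residues per alignment column."""
    length = len(aligned_chains[0])
    if any(len(chain) != length for chain in aligned_chains):
        raise ValueError("all chains must be aligned to the same length")
    return [Counter(chain[i] for chain in aligned_chains) for i in range(length)]


def unusual_residues(query, aligned_chains, threshold=0.01):
    """List (position, residue, frequency) for residues seen in fewer than
    `threshold` of the reference chains at that position (1-based positions)."""
    if len(query) != len(aligned_chains[0]):
        raise ValueError("query must be aligned to the reference chains")
    counts = position_frequencies(aligned_chains)
    total = len(aligned_chains)
    flagged = []
    for i, residue in enumerate(query):
        frequency = counts[i][residue] / total
        if frequency < threshold:
            flagged.append((i + 1, residue, frequency))
    return flagged


# Toy reference set (hypothetical data): 200 pre-aligned four-residue fragments.
reference = ["QVQL"] * 150 + ["EVQL"] * 49 + ["QVKL"] * 1

print(unusual_residues("QVKL", reference))  # K at position 3 occurs in 1 of 200 chains
print(unusual_residues("EVQL", reference))  # E at position 1 occurs in 49 of 200 chains
```

A residue that never occurs at a position gets a frequency of zero and is therefore flagged as well. A real implementation would additionally need to assign the numbering scheme to the query before any comparison, which this sketch leaves out.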
CATH is a hierarchical classification of protein domain structures, which clusters proteins at four major levels: Class (C), Architecture (A), Topology (T) and Homologous superfamily (H). Class, derived from secondary structure content, is assigned for more than 90% of protein structures automatically. Architecture, which describes the gross orientation of secondary structures, independent of connectivities, is currently assigned manually. The topology level clusters structures into fold groups according to their topological connections and numbers of secondary structures. The homologous superfamilies cluster proteins with highly similar structures and functions. The assignments of structures to fold groups and homologous superfamilies are made by sequence and structure comparisons. The boundaries and assignments for each protein domain are determined using a combination of automated and manual procedures. These include computational techniques, empirical and statistical evidence, literature review and expert analysis. ... [Information of the supplier]
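The four-level hierarchy can be represented as a simple data structure. The Python sketch below assumes that a domain's classification is written as four dot-separated integers in the order C.A.T.H. That notation is not stated in the description above, and the type name, function names and example codes are illustrative choices only. The sketch reports the deepest level at which two classifications still agree.

```python
from typing import NamedTuple, Optional

LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")


class CathCode(NamedTuple):
    """One node number per level of the hierarchy (hypothetical container)."""
    c: int
    a: int
    t: int
    h: int


def parse_cath_code(text: str) -> CathCode:
    """Parse a dotted code such as '3.40.50.300' into its four levels."""
    parts = text.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {text!r}")
    return CathCode(*(int(part) for part in parts))


def deepest_shared_level(x: CathCode, y: CathCode) -> Optional[str]:
    """Return the name of the deepest level at which x and y agree,
    or None if they already differ at the Class level."""
    shared = None
    for level, left, right in zip(LEVELS, x, y):
        if left != right:
            break
        shared = level
    return shared


first = parse_cath_code("3.40.50.300")
second = parse_cath_code("3.40.50.720")
print(deepest_shared_level(first, second))  # same fold group, different superfamily
```

Because each level only has meaning within its parent, the comparison stops at the first mismatch rather than comparing the four numbers independently.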
MOLMOL is a molecular graphics program for displaying, analyzing, and manipulating the three-dimensional structure of biological macromolecules, with special emphasis on the study of protein or DNA structures determined by NMR. The program runs on UNIX and Windows NT/95/98/2000 and is freely available. [Information of the supplier]
The Molecular Genetics Explorer is a BioQUEST software simulation that integrates genetics, biochemistry, and molecular biology to study a biological phenomenon. It is designed to show students the connections between these three key disciplines of modern molecular genetics. It is based on "Botstein's Triangle". [Information of the supplier]