ProGlycProt (Prokaryotic Glycoproteins) is a manually curated, comprehensive repository of experimentally characterized bacterial and archaeal glycoproteins, generated from an exhaustive literature search. It is the focused beginning of an effort to provide concise, relevant information derived from the rapidly expanding literature on prokaryotic glycoproteins, their glycosylating enzyme(s), glycosylation-linked genes, and the genomic context thereof, in a cross-referenced manner. ProGlycProt is an extensive online collection of experimentally verified glycosites and glycoproteins of prokaryotes. For users’ benefit, the database under the menu ProGlycProtdb is arranged into two sections, ProCGP and ProUGP. ProCGP is the main section, containing characterized prokaryotic glycoproteins, defined as entries with at least one experimentally known glycosylated residue (glycosite). ProUGP is the supplementary section, presenting uncharacterized prokaryotic glycoproteins, defined as entries with experimentally identified glycosylation but unidentified glycosites. ProGlycProt has been developed to aid and advance the emerging scientific interest in understanding the mechanisms, implications, and novelties of protein glycosylation in prokaryotes, which include many pathogenic as well as economically important bacterial species. As a general policy, data are updated once every three months. Existing entries are updated in real time. ... [Information of the supplier]
Two types of acetylation occur widely in proteins. The first, Nα-terminal acetylation, is catalyzed by a variety of N-terminal acetyltransferases (NATs), which cotranslationally transfer acetyl moieties from acetyl-coenzyme A (Acetyl-CoA) to the α-amino (Nα) group of protein amino-terminal residues. Although Nα-terminal acetylation is rare in prokaryotes, it has been estimated that about 85% of eukaryotic proteins are Nα-terminally modified. The second type is Nε-lysine acetylation, which specifically modifies the ε-amino group of protein lysine residues. Although Nε-lysine acetylation is less common, it is one of the most important and ubiquitous post-translational modifications conserved in prokaryotes and eukaryotes. Moreover, acetylation and deacetylation are dynamically and temporally regulated by histone acetyltransferases (HATs) and histone deacetylases (HDACs), respectively. Because the number of known acetylation sites is increasing rapidly, there is an urgent need to collect the experimental data and provide an integrated resource for the community. Several public databases, such as PhosphoSitePlus, HPRD, SysPTM, and dbPTM, already contain protein acetylation information. In these databases, both Nα-terminal and Nε-lysine acetylation data were curated, but lysine acetylation sites usually make up only a limited part of the total sites. Thus, thousands of lysine acetylation sites in other species remain to be collected. The current CPLA 1.0 database was updated on March 1st, 2010, and contains 3,311 unique protein entries with 7,151 lysine acetylation sites. The online service of CPLA 1.0 was implemented in PHP + MySQL + JavaScript, and the local packages of CPLA 1.0 were developed in Java 1.5 (J2SE). The database will be updated routinely as new acetylated lysines are reported. ... [Information of the supplier, modified]
No database other than ESTHER holds all alpha/beta hydrolase fold proteins together: InterPro, PROSITE, and Pfam have multiple entries for subsets of this structural superfamily. A table, Synthese, shows the correspondence between these database entries and the subfamilies in ESTHER. The ESTHER Table is now a little too big to be useful. Each file contains one of the 31219 non-redundant proteins/genes. The tables grouped in the family table, the syntheses table or the structure table may be more useful. The Gene_locus nomenclature for these non-redundant entries is a name with 5 characters for the organism (3 for the genus and 2 for the species, except when a common 5-character name exists; e.g., ratno is for Rattus norvegicus and human for man), which keeps the nomenclature close to that of Swiss-Prot. The last characters define the protein; e.g., human-acche represents human acetylcholinesterase. ... [Information of the supplier]
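The Gene_locus naming rule described above can be illustrated with a minimal Python sketch. The function names, the lookup table of common names, and the lower-casing are assumptions made for illustration only; they are not part of ESTHER. The rat identifier in the example is constructed from the stated rule and is not taken from the database.

```python
# Illustrative sketch of the Gene_locus naming rule described in the entry.
# Function names and the COMMON_NAMES table are hypothetical, not ESTHER's API.

# Organisms that have a common 5-character name (only "human" is given in the text).
COMMON_NAMES = {"homo sapiens": "human"}


def organism_code(genus: str, species: str) -> str:
    """Return 3 letters of the genus plus 2 of the species, or the common name."""
    key = f"{genus} {species}".lower()
    if key in COMMON_NAMES:
        return COMMON_NAMES[key]
    return (genus[:3] + species[:2]).lower()


def gene_locus(genus: str, species: str, protein: str) -> str:
    """Join the organism code and the protein code with a hyphen."""
    return f"{organism_code(genus, species)}-{protein.lower()}"


print(gene_locus("Homo", "sapiens", "acche"))       # human-acche
print(gene_locus("Rattus", "norvegicus", "acche"))  # ratno-acche (constructed example)
```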
SUPERFAMILY is a database of structural and functional annotation for all proteins and genomes. The SUPERFAMILY annotation is based on a collection of hidden Markov models, which represent structural protein domains at the SCOP superfamily level. A superfamily groups together domains which have an evolutionary relationship. The annotation is produced by scanning protein sequences from over 2,414 completely sequenced genomes against the hidden Markov models. For each protein you can: a) Submit sequences for SCOP classification; b) View domain organisation, sequence alignments and protein sequence details. For each genome you can: a) Examine superfamily assignments, phylogenetic trees, domain organisation lists and networks; b) Check for over- and under-represented superfamilies within a genome. For each superfamily you can: a) Inspect SCOP classification, functional annotation, Gene Ontology annotation, InterPro abstract and genome assignments; b) Explore taxonomic distribution of a superfamily across the tree of life. All annotation, models and the database dump are freely available for download to everyone. SUPERFAMILY is a member of the InterPro consortium of protein annotation databases, and has been integrated into the Ensembl eukaryotic genome project and The Arabidopsis Information Resource. To date, the SUPERFAMILY publications have been cited over 1,000 times. SUPERFAMILY has been used in structural, functional, evolutionary and phylogenetic research projects. ... [Information of the supplier]
M phase, also called cell division, is the most crucial and fundamental event of the eukaryotic cell cycle. After the chromosomes have been replicated during S phase, the sister chromatids are separated and distributed equally and faithfully into two daughter cells. Each daughter cell also receives a roughly equal share of the necessary intracellular constituents and organelles from the mother cell. Generally, cell division consists of six stages: prophase, prometaphase, metaphase, anaphase, telophase and cytokinesis. The first five stages constitute mitosis. During mitosis, numerous proteins organize into protein super-complexes at three distinct regions: the centrosome, the kinetochore/centromere and the cleavage furrow/midbody. Although many proteins have been identified as localized to the centrosome, kinetochore and/or midbody, an integrated resource in this area has not been available. In this work, we have collected all proteins identified as localized to the kinetochore, centrosome, and/or midbody from two fungi (S. cerevisiae and S. pombe) and five animals (C. elegans, D. melanogaster, X. laevis, M. musculus and H. sapiens). From the related literature in PubMed, numerous proteins have been manually curated as localized to at least one of the kinetochore, centrosome and midbody. To ensure data quality, following the rationale of "seeing is believing", each of these proteins has been unambiguously observed under a fluorescence microscope as direct supporting evidence. An integrated and searchable database, MiCroKit (Midbody, Centrosome and Kinetochore), has then been established. The MiCroKit database is the first integrative resource to pinpoint most of the identified components and related scientific information of the midbody, centrosome and kinetochore. Version 1.0 of the MiCroKit database was set up on Nov. 2nd, 2005, containing 1,065 unique proteins. MiCroKit version 2.0 was released on Jun. 5th, 2006, with 1,120 entries. The current MiCroKit 3.0 database was updated on July 9, 2009, and contains 1,489 unique protein entries. ... [Information of the supplier, modified]
The UniPROBE (Universal PBM Resource for Oligonucleotide Binding Evaluation) database hosts data generated by universal protein binding microarray (PBM) technology on the in vitro DNA binding specificities of proteins. This initial release of the UniPROBE database provides a centralized resource for accessing comprehensive data on the preferences of proteins for all possible sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In total, the database currently hosts DNA binding data for 406 nonredundant proteins from a diverse collection of organisms, including the prokaryote Vibrio harveyi, the eukaryotic malarial parasite Plasmodium falciparum, the parasitic Apicomplexan Cryptosporidium parvum, the yeast Saccharomyces cerevisiae, the worm Caenorhabditis elegans, mouse, and human. The database's web tools include a text-based search, a function for assessing motif similarity between user-entered data and database PWMs, and a function for locating putative binding sites along user-entered nucleotide sequences. ... [Information of the supplier]
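To make the k-mer and PWM terminology above concrete, the following Python sketch enumerates all DNA words of length k and slides a position weight matrix along a sequence to find the best-scoring window. It is a generic illustration under stated assumptions, not UniPROBE's own method: the function names are hypothetical, and the toy matrix values are invented for the example rather than taken from any database entry.

```python
from itertools import product

# Generic illustration of k-mers and PWM scanning; not UniPROBE's implementation.


def all_kmers(k: int) -> list[str]:
    """Enumerate every DNA word of length k (4**k words in total)."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def score_window(pwm: list[dict[str, float]], window: str) -> float:
    """Sum the per-position weights of the bases in one window."""
    return sum(pwm[i][base] for i, base in enumerate(window))


def best_site(pwm: list[dict[str, float]], seq: str) -> tuple[float, int]:
    """Return (score, start index) of the highest-scoring window in seq."""
    k = len(pwm)
    return max((score_window(pwm, seq[i:i + k]), i)
               for i in range(len(seq) - k + 1))


# Toy 3-position matrix with made-up weights that favour the word "ACG".
TOY_PWM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]

print(len(all_kmers(3)))              # 64
print(best_site(TOY_PWM, "TTACGTT"))  # (3.0, 2)
```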
The purpose of this database is to represent the relationship between protein structural change and ligand binding. We classified protein structural changes into 7 classes, in terms of the ligand binding sites and the location where the dominant motion occurs. [Information of the supplier]
Welcome to ProRepeat - the database of protein repeats! Amino acid tandem repeats, one of the most prevalent patterns in protein sequences, have attracted the interest of researchers for many years because of their pathological, functional, and evolutionary roles. To provide the user community with a useful resource, we have constructed this online database of protein repeat sequences. The latest ProRepeat dataset is based on UniProtKB Release May 2011 and RefSeq Release 40. ProRepeat also gathers the corresponding nucleotide sequences of the repeat fragments for the purpose of codon usage analysis. ... [Information of the supplier]
The database of protein-chemical structural interactions includes all existing 3D structures of complexes of proteins with low molecular weight ligands. When one considers the proteins and chemicals as vertices of a graph, all these interactions form a network. Biological networks are powerful tools for predicting undocumented relationships between molecules. The underlying principle is that existing interactions between molecules can be used to predict new interactions. For pairs of proteins sharing a common ligand, we use protein and chemical superimpositions combined with fast structural compatibility screens to predict whether additional compounds bound by one protein would bind the other. ... [Information of the supplier]
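The network view described above can be sketched in a few lines of Python: proteins and ligands are the two kinds of vertices, and each known complex is an edge. The sketch only enumerates candidate (ligand, donor protein, target protein) triples for pairs of proteins that already share a ligand. It does not attempt the superimposition or structural compatibility screening mentioned in the entry, and the function name and the identifiers P1, L1, and so on are placeholders, not database records.

```python
from collections import defaultdict
from itertools import combinations

# Candidate enumeration only; the structural screening step is not modelled here.


def candidate_transfers(complexes):
    """List (ligand, bound_by, candidate_for) triples for proteins sharing a ligand."""
    ligands_of = defaultdict(set)
    for protein, ligand in complexes:
        ligands_of[protein].add(ligand)

    candidates = []
    for p, q in combinations(sorted(ligands_of), 2):
        if not ligands_of[p] & ligands_of[q]:
            continue  # no common ligand, so the pair is not considered
        # Ligands known for p but not q are candidates for q, and vice versa.
        for ligand in sorted(ligands_of[p] - ligands_of[q]):
            candidates.append((ligand, p, q))
        for ligand in sorted(ligands_of[q] - ligands_of[p]):
            candidates.append((ligand, q, p))
    return candidates


# Placeholder protein-ligand complexes (edges of the bipartite graph).
TOY_COMPLEXES = [
    ("P1", "L1"), ("P1", "L2"),
    ("P2", "L1"), ("P2", "L3"),
    ("P3", "L4"),
]

print(candidate_transfers(TOY_COMPLEXES))
# [('L2', 'P1', 'P2'), ('L3', 'P2', 'P1')]
```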
TopFIND, the Termini oriented protein Function Inferred Database, is an integrated knowledgebase focused on protein termini, their formation by proteases and functional implications. It contains information about the processing and the processing state of proteins and functional implications thereof derived from research literature, contributions by the scientific community and biological databases. TopFIND has been designed and developed by Philipp F. Lange in the research group of Christopher M. Overall at the Center for Blood Research, University of British Columbia. ... [Information of the supplier]