Database | Release # | Release or Upgrade date |
GenBank (gb_new division updated nightly) | Release 166.0 | 06/15/08 |
NCBI RefSeq, complete release (DNA & protein - updated nightly) | Release 27.0 | 01/06/08 |
GenPept (gp_new division updated nightly) | Release 166.0 | 06/16/08 |
PIR_Protein - no longer updated - use UniProt instead | Release 80 | 01/2005 |
SWISS-PROT - no longer updated - use UniProt instead | Release 44 | 01/2005 |
UniProt (both uniprot_swissprot and uniprot_trembl - updated weekly). UniProt is a combination of the information previously contained in the separate databases SwissProt, TrEMBL and PIR | | 05/29/07 |
H. sapiens reference genome (build 36, version 2) | NCBI constructs | 09/2006 |
M. musculus reference genome (build 36, version 1) | NCBI constructs | 05/2006 |
D. rerio reference genome (build 1, version 1) | NCBI constructs | 07/2005 |
C. elegans genome | NCBI constructs | 01/2003 |
Yeast Genome Database | original | 05/1998 |
Yeast - genbank entries (yeast_gb) | N/A | 03/2006 |
Yeast - complete genome (yeast_chr) from NCBI RefSeq constructs | N/A | 03/2006 |
Yeast Genome Peptide database | original | 05/1998 |
Yeast - non-redundant protein entries (yeastpep_nr) | N/A | 03/2006 |
Yeast ORF Database (no introns) | N/A | 03/2006 |
PROSITE | Release 18.26 | 05/11/04 |
Transcription Factor Database (no longer updated) | 3.4 | 06/01/99 |
Pfam | 21.0 | 11/2006 |
Restriction Enzyme | REBASE 612 | 11/30/2006 |
notes on referencing databases from within GCG
notes on referencing BLAST databases within GCG
GenBank
A copy of the GenBank DNA database is maintained on site. The database is updated daily.
- GenBank release notes
- Link to the GenBank master copy at NCBI
- The GenBank master copy at NCBI is also accessible through Entrez
RefSeq provides reference sequence standards for genomes, transcripts and proteins. Updated weekly.
GenPept
A protein database consisting of the translated protein sequences contained in the GenBank sequence headers. Most sequences are hypothetical and not confirmed experimentally. The database is divided into the same subdivisions as GenBank. Updated daily.
UniProt *NEW*
The Universal Protein Resource is a repository of protein sequence and function. The data in UniProt comes from combining the information contained in Swiss-Prot, TrEMBL, and PIR. It consists of two major sections: uniprot_sw (manually curated and annotated entries from SwissProt and PIR) and uniprot_tr (computer-analyzed entries from TrEMBL and PIR, awaiting full manual curation and annotation).
Since UniProt is wholly redundant with PIR, production and updates of the PIR protein sequence database will eventually cease (i.e., by the end of 2004 or early 2005). UniProt is updated bi-weekly.
- The UniProt organization home page (and the uniprot site at ExPASy)
- Link to the release notes at ExPASy.org
PIR
A protein database. Discontinued in January 2005 and replaced with UniProt. Version 80 will remain available, but users should access UniProt for the most recent and accurate protein sequences.
SWISSPROT
A well-annotated protein database.
As of January 2005, SwissProt has been replaced by UniProt. The last update of SwissProt was version 44.2 in January 2005. This last version will remain available, but users should access UniProt for the most recent and accurate protein sequences.
H. sapiens
Individual chromosome builds from a reference assembly for the whole genome.
- release notes (UVa maintains a copy of the Public Sequencing Consortium's reference build, not the Celera Genomics' alternative build)
- Link to the chromosome browser page at NCBI (the local database was built from the files available at NCBI's FTP site - see the "genomes" folder at ftp.ncbi.nih.gov)
M. musculus
Individual chromosome builds from a reference assembly for the whole genome (includes mitochondrial genes and unannotated fragments)
- release notes (UVa maintains a copy of the Public Sequencing Consortium's reference build, not the Celera Genomics' alternative build)
- Link to the chromosome browser page at NCBI (the local database was built from the files available at NCBI's FTP site - see the "genomes" folder at ftp.ncbi.nih.gov)
D. rerio
Individual chromosome builds from a reference assembly for the whole genome (includes mitochondrial genes and unannotated fragments)
- release notes
- Link to the chromosome browser page at NCBI (the local database was built from the files available at NCBI's FTP site - see the "genomes" folder at ftp.ncbi.nih.gov)
- Link to the master Zebrafish genome site at the Sanger Institute
The yeast_orf database has been updated and reinstated. This version of the Sanger ORF database contains ORF sequences with no introns.
Three new databases have been added locally:
- yeast_gb: the full contingent of GenBank entries taken from the complete yeast genome.
- yeastpep_nr: the full contingent of non-redundant S. cerevisiae protein entries from SWISSPROT, PIR, and GenPept.
- yeast_chr: the complete yeast genome (16 chromosomes plus mtDNA), annotated by chromosome number.
Because the format of these new yeast databases differs slightly from our older databases, the original databases are still available:
- The original (and incomplete!) yeast genome nucleotide database can be searched (as before) from GCG through FASTA by specifying: yeast:*
- The original (and out-of-date!) yeast genome peptide database can be searched from GCG through FASTA by specifying: yeastpro:* or yeastpep:*
- A link to the master Saccharomyces Genome Database. This site contains useful yeast-related topics.
The file containing all the restriction enzyme sites used in GCG can be downloaded with the GCG command: fetch enzyme.dat. The file is updated monthly.
Transcription factor database
The file containing all the transcription factor sites used in GCG can be downloaded with the GCG command: fetch tfsites.dat.
TRANSFAC is no longer freely available for updates or downloads, so our local copy is very out of date.
For the most current version of TRANSFAC, please go to the BioBase Corp. web site.
other Web-based TRANSFAC search engines
PFAM: Multiple alignments and profile HMMs of protein domains
Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains. Version 7.2 of Pfam (April 2002) contains alignments and models for 3735 protein families, based on the Swissprot 40 and SP-TrEMBL 18 protein sequence databases. It is intended for use with the HMMERSEARCH program in GCG.
- Pfam release notes
- The Pfam site at Washington University, St. Louis
- The Pfam site at the Sanger Centre
- The ProSite HomePage at ExPASy
Please email comments or suggestions about the ACHS MolBiol pages to mblack@virginia.edu.
Academic Computing Health Sciences
Box 800555
Charlottesville, VA 22908
(434) 982-4025
