Notes on BLAST database names used in GCG

The BLAST databases available for use in GCG are:

Nucleotide BLAST databases

genbank - the "classic", or taxonomic, divisions of GenBank (bacterial, environmental, invertebrate, other_mammalian, patent, plant, primate, rodent, STS, unannotated, viral, and other_vertebrate), plus gb_new, the nightly cumulative update division. This BLAST database is rebuilt nightly to include the NCBI daily update to GenBank.

genbank_htg - despite its name, this BLAST database includes both the High Throughput Genomic (htg) and High Throughput cDNA (htc) divisions. This BLAST database is rebuilt bi-monthly with each new full release of GenBank.

genbank_notags - a BLAST set that combines the contents of genbank and genbank_htg. This BLAST database is rebuilt nightly to include the NCBI daily update to GenBank.

genbank_est - the Expressed Sequence Tags division. This BLAST database is rebuilt bi-monthly with each new full release of GenBank.

genbank_gss - the Genome Survey Sequence division. This BLAST database is rebuilt bi-monthly with each new full release of GenBank.

genbank_tags - a BLAST set that combines the contents of genbank_est and genbank_gss. This BLAST database is rebuilt bi-monthly with each new full release of GenBank.

wgs - the NCBI Whole Genome Shotgun sequence division. This BLAST database is rebuilt weekly to include any updates to the wgs portion of the GenBank tree. The wgs division is maintained by ACHS as a BLAST database only. This division of GenBank consists primarily of unannotated, low-pass genome sequence (see the 2006 Database issue of Nucleic Acids Research for details on the structure of GenBank).

hsapiens - the NCBI human genome construct. This BLAST database is rebuilt when significant changes are announced to the genome build.

mmusculus - the NCBI mouse genome construct. This BLAST database is rebuilt when significant changes are announced to the genome build.

yeast_genome - the NCBI yeast genome data set. This BLAST database is rebuilt when significant changes are announced to the genome build.

celegans - the C. elegans genome sequence. This BLAST database is rebuilt when significant changes are announced to the genome build.

drerio - the zebrafish genome sequence. This BLAST database is rebuilt when significant changes are announced to the genome build.

ehistol - the E. histolytica genome. This BLAST database is rebuilt when significant changes are announced to the genome build.
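
Two of the nucleotide names above are combined sets rather than separate divisions. The sketch below records that composition as a small Python lookup table, which can help when deciding whether a search of one database already covers another. The table and the function names (COMBINED_SETS, expand, covering_set) are hypothetical illustrations built from the descriptions on this page; they are not part of GCG or BLAST.

```python
# Hypothetical lookup table derived from the descriptions on this page.
# It is an illustration only, not a GCG or BLAST interface.
COMBINED_SETS = {
    "genbank_notags": ("genbank", "genbank_htg"),
    "genbank_tags": ("genbank_est", "genbank_gss"),
}


def expand(name):
    """Return the member databases behind a name.

    A name that is not a combined set expands to itself.
    """
    return list(COMBINED_SETS.get(name, (name,)))


def covering_set(name):
    """Return the combined set that contains `name`, or None if there is none."""
    for combined, members in COMBINED_SETS.items():
        if name in members:
            return combined
    return None


if __name__ == "__main__":
    print(expand("genbank_tags"))
    print(covering_set("genbank_htg"))
```

Under these assumptions, a search of genbank_notags makes a separate search of genbank or genbank_htg redundant, and likewise genbank_tags for genbank_est and genbank_gss.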

Peptide BLAST databases

genpept - the translations of all ORFs in GenBank. This BLAST database is rebuilt nightly to include the NCBI daily update to GenBank.

refseqprotein - NCBI's protein reference sequence database. This BLAST database is rebuilt nightly to include the NCBI daily update to GenBank.

uniprot - the unified protein database. This BLAST database is rebuilt weekly to include the UniProt weekly update.

swissprot - a legacy copy of the final independent release of SwissProt (ver. 44.2, January 2005). No longer updated.

PIR - a legacy copy of the final release of Georgetown University's Protein Information Resource (ver. 80, January 2005). No longer updated.

hs_pep - translations of the NCBI human genome. This BLAST database is rebuilt when significant changes are announced to the genome build.

mm_pep - translations of the NCBI mouse genome. This BLAST database is rebuilt when significant changes are announced to the genome build.

dr_pep - translations of the NCBI zebrafish genome. This BLAST database is rebuilt when significant changes are announced to the genome build.
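
The rebuild schedules listed on this page can be summarized in one table. The following Python sketch is one possible way to do that, so that, for example, all databases refreshed nightly can be listed at once. The schedule labels ("nightly", "weekly", "bi-monthly", "genome build", "frozen") and the names DATABASES and databases_by_schedule are assumptions made for this illustration; the entries themselves are taken from the descriptions above.

```python
# Hypothetical summary table: name -> (sequence type, rebuild schedule).
# The schedule labels are shorthand chosen for this sketch.
DATABASES = {
    "genbank": ("nucleotide", "nightly"),
    "genbank_htg": ("nucleotide", "bi-monthly"),
    "genbank_notags": ("nucleotide", "nightly"),
    "genbank_est": ("nucleotide", "bi-monthly"),
    "genbank_gss": ("nucleotide", "bi-monthly"),
    "genbank_tags": ("nucleotide", "bi-monthly"),
    "wgs": ("nucleotide", "weekly"),
    "hsapiens": ("nucleotide", "genome build"),
    "mmusculus": ("nucleotide", "genome build"),
    "yeast_genome": ("nucleotide", "genome build"),
    "celegans": ("nucleotide", "genome build"),
    "drerio": ("nucleotide", "genome build"),
    "ehistol": ("nucleotide", "genome build"),
    "genpept": ("peptide", "nightly"),
    "refseqprotein": ("peptide", "nightly"),
    "uniprot": ("peptide", "weekly"),
    "swissprot": ("peptide", "frozen"),  # legacy copy, no longer updated
    "PIR": ("peptide", "frozen"),        # legacy copy, no longer updated
    "hs_pep": ("peptide", "genome build"),
    "mm_pep": ("peptide", "genome build"),
    "dr_pep": ("peptide", "genome build"),
}


def databases_by_schedule(schedule, kind=None):
    """Return sorted database names with the given rebuild schedule.

    If `kind` is "nucleotide" or "peptide", only that sequence type is listed.
    """
    return sorted(
        name
        for name, (seq_type, sched) in DATABASES.items()
        if sched == schedule and (kind is None or seq_type == kind)
    )


if __name__ == "__main__":
    print(databases_by_schedule("nightly"))
    print(databases_by_schedule("genome build", kind="peptide"))
```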

Please email comments or suggestions about the ACHS MolBiol pages to mblack@virginia.edu.

Academic Computing Health Sciences
Box 800555
Charlottesville, VA 22908
(434) 982-4025

© 2008 by the Rector and Visitors of the University of Virginia.

The information contained on the University of Virginia’s Department of Information Technology and Communication (ITC) website is provided as a public service with the understanding that ITC makes no representations or warranties, either expressed or implied, concerning the accuracy, completeness, reliability or suitability of the information, including warranties of title, non-infringement of copyright or patent rights of others. These pages are expected to represent the University of Virginia community and the State of Virginia in a professional manner in accordance with the University of Virginia’s Computing Policies.