GenBank

http://www.ncbi.nlm.nih.gov/

Contact: wheeler@ncbi.nlm.nih.gov

Database Description

GenBank® is a public database of all known nucleotide and protein sequences with supporting bibliographic and biological annotation. It is built and distributed by the National Center for Biotechnology Information (NCBI), a division of the National Library of Medicine (NLM), located on the campus of the US National Institutes of Health (NIH). As of Release 119 in August 2000, GenBank contained more than 9.5 billion nucleotide bases from over 8.2 million DNA sequences representing more than 70,000 different organisms. GenBank sequences are derived primarily through direct submission of sequence data from individual laboratories and large-scale sequencing projects. Most submissions are made using the BankIt (Web) or Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Data exchange with the EMBL Data Library and the DNA Data Bank of Japan helps ensure comprehensive worldwide coverage. GenBank data are accessible through NCBI's integrated retrieval system, Entrez, which combines data from the major DNA and protein sequence databases with taxonomy, genome, mapping, and protein structure information, plus the biomedical literature via PubMed. Sequence similarity searching is provided by the BLAST family of programs. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. NCBI also offers a wide range of World Wide Web retrieval and analysis services based on GenBank data. The GenBank database and related resources are freely accessible via the NCBI home page at http://www.ncbi.nlm.nih.gov.
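As an illustration of programmatic retrieval through Entrez, the following is a minimal sketch in Python. It assumes NCBI's E-utilities `efetch` endpoint and the `nuccore` database name, neither of which is mentioned in the description above. The helper names `build_efetch_url` and `fetch_genbank_record` are hypothetical, and the accession `U49845` is used only as an example identifier.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the E-utilities efetch endpoint; it is not named in the description above.
EUTILS_EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nuccore", rettype="gb", retmode="text"):
    """Build an efetch URL requesting one record in GenBank flat-file format."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": retmode}
    return f"{EUTILS_EFETCH}?{urlencode(params)}"


def fetch_genbank_record(accession, timeout=30):
    """Download the flat-file text for one accession (requires network access)."""
    with urlopen(build_efetch_url(accession), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only the URL is printed here; call fetch_genbank_record() to retrieve the record.
    print(build_efetch_url("U49845"))
```

Building the URL separately from the download keeps the request easy to inspect before any network call is made.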

Category: Major Sequence Repositories

Go to the abstract in the NAR 2002 Database Issue.