Nucl. Acids Res. OUP

HGVbase

http://hgvbase.cgb.ki.se

Fredman, D.1, Siegfried, M.1, Yuan, Y.P.2, Bork, P.2, Lehvaslaiho, H.3, Brookes, A.J.1

1Center for Genomics and Bioinformatics, Karolinska Institute, Sweden
2European Molecular Biology Laboratory, Germany
3European Bioinformatics Institute, United Kingdom

Contact   david.fredman@cgb.ki.se


Database Description

HGVbase (http://hgvbase.cgb.ki.se) is an effort to improve the analysis of common human genetic polymorphism by maintaining a high-quality, non-redundant database of all publicly available genomic variation data, primarily SNPs (single nucleotide polymorphisms). HGVbase was first released in August 1998 and is now being developed further by the European consortium listed above. Its scope encompasses all forms of human polymorphism. There is no requirement for functional neutrality, so HGVbase includes disease-causing clinical mutations as well as neutral polymorphisms. Each variant is presented in the context of its surrounding sequence and sequence features, and considerable effort is made to relate polymorphisms to human genes. Population allele frequencies are included as they become available. All markers have a given position in a reference EMBL or GenBank sequence and have been uniquely mapped to the human draft genome; these positions are verified and updated in every new build of HGVbase. The data structure has been extended to capture haplotype information. As part of the Mutation Database Initiative (MDI) [1], which is working to create a central repository for mutations, HGVbase will be extended to capture clinical relationships and phenotypic information.

Data exchange with dbSNP (the SNP repository at the NCBI) was established at the end of 2000, helping to ensure full data coverage in subsequent HGVbase releases. Automated and manual data checking ensures internal consistency and addresses errors sometimes present in the original source information. The high volume of polymorphism data has prompted us to develop tools for automated annotation, some of which are publicly available through our web pages. 'HNP Blast' blasts query sequences against GenBank entries and extracts features from them to annotate SNPs. 'Mutation Checker' has been created to help researchers and database curators verify the transcription and translation effects of DNA-level sequence variation. Work performed to increase the utility of SNP information includes the provision of genotyping assays for every SNP, and functional predictions based upon various aspects of protein structure and amino acid conservation.

Online search tools facilitate data interrogation by sequence similarity and by keyword queries. Core HGVbase data are also represented within the Ensembl project (www.ensembl.org) as sequence features, with links to the HGVbase database for full information. User requests have prompted continuous enhancements of the existing interfaces, such as the ability to retrieve flanking sequences of arbitrary length for a given set of HGVbase polymorphisms and to search by dbSNP rs IDs. Ongoing developments include a redesign of the web interface with new search tools allowing search by genome coordinates. A full database description is provided online, and accompanying web pages summarize SNP-related meetings, genotyping methods, SNP databases and analysis software. Downloads of HGVbase are freely available in a variety of formats (XML, FASTA, SRS, SQL dumps and tagged-text files).
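
As an illustration of the two operations mentioned above (retrieving flanking sequence around a polymorphism, and checking the translation effect of a DNA-level substitution), the following minimal Python sketch may be helpful. It is not the HGVbase or Mutation Checker implementation; the function names `flanking` and `coding_effect` are hypothetical, positions are assumed to be 0-based, and the coding sequence is assumed to be given on the forward strand, in frame, using the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative translation tables are needed for this illustration).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def flanking(seq, pos, length):
    """Return (left flank, reference base, right flank) around 0-based pos.

    Flanks are truncated at the ends of the sequence.
    """
    left = seq[max(0, pos - length):pos]
    right = seq[pos + 1:pos + 1 + length]
    return left, seq[pos], right


def coding_effect(cds, pos, alt):
    """Classify a single-base substitution at 0-based pos in an in-frame CDS.

    Returns (ref_codon, alt_codon, ref_aa, alt_aa, kind).
    """
    cds, alt = cds.upper(), alt.upper()
    offset = pos % 3                  # position of the base within its codon
    start = pos - offset              # first base of the affected codon
    ref_codon = cds[start:start + 3]
    if len(ref_codon) < 3:
        raise ValueError("position falls in an incomplete trailing codon")
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"
    elif ref_aa == "*":
        kind = "stop-lost"
    else:
        kind = "missense"
    return ref_codon, alt_codon, ref_aa, alt_aa, kind


if __name__ == "__main__":
    cds = "ATGGAAGTG"  # toy coding sequence: Met-Glu-Val
    print(flanking(cds, 4, 2))
    print(coding_effect(cds, 4, "T"))
```

In this toy example, an A-to-T substitution at the middle base of the GAA codon changes glutamate to valine, so the sketch reports it as a missense change. A real checker would also need to handle strand, splicing, and insertions or deletions, which are deliberately left out here.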

Recent Developments

• Database size increased by orders of magnitude.
• Database rights transferred fully to the academic domain.
• Publicly available tools for SNP annotation.
• Extension of the data structure to store haplotype information.
• PubMed links and searchable MeSH terms.
• Interrogation by chromosomal positions, synchronized to releases of the Golden Path assembled genome.
• As a part of the Mutation Database Initiative, HGVbase will be extended to capture clinical relationships and phenotypic information to serve as a central repository for mutations.

Acknowledgements

We thank Interactiva GmbH (Germany) for support during early development of the database and for transferring the project to the public domain.

REFERENCES

  1. Cotton, R.G., McKusick, V., et al. (1998) The HUGO Mutation Database Initiative. Science, 279 (5347), 10-11.

Category   Mutation Databases

Go to the abstract in the NAR 2002 Database Issue.

 
