BioWorld International Correspondent

LONDON - An international consortium has launched an ambitious project to speed up the discovery of the variations in the human genome sequence that are involved in disease. The HapMap Project aims to map the variations in the genomes of people of diverse ancestry, including Africans, East Asians and Europeans.

The human genome sequence that came out of the Human Genome Project described the 99.9 percent of the human genome that is identical in each of us. By contrast, the HapMap Project intends to study the 0.1 percent of the genome that differs from person to person.

The members of the International HapMap Consortium want to identify the segments of the genome that have remained largely unaltered by recombination events since ancient times, long before Homo sapiens emerged from Africa 100,000 years ago. They will log single-base variations, called single nucleotide polymorphisms, or SNPs, within those segments. The goal is to create a database that will allow researchers of the future to identify very quickly which SNPs belong to which blocks, or haplotypes, and, thus, which haplotypes are associated with various diseases.

An article by the consortium titled "The International HapMap Project" appears in the Dec. 18, 2003, edition of Nature.

David Bentley, head of human genetics at the Wellcome Trust Sanger Institute in Hinxton, UK, told BioWorld International: "The HapMap Project is very important as the next step to make the most use of the information from the Human Genome Project, and to apply it to common diseases. It is a first step in this direction, and it will represent a tremendous saving on time for researchers."

At the moment, it would be impossible to check whether every SNP in the genome of someone who has a particular disease is linked to the cause of that disease - that would be too expensive and time-consuming. However, once the HapMap is available, researchers will be able to use just a fraction of the SNPs to find out which disease sufferers have which haplotypes.

That is because the HapMap will tell scientists which SNPs are usually inherited together. As a result, the work involved in comparing the genomes of groups of people with a certain disease with those of unaffected controls, in order to detect associations between diseases and genetic variations, will be greatly reduced.

Initially, the groups comprising the consortium plan to put together a database of all the genomic variations that are known. That already has involved doing some additional sequencing of anonymous genomes. About 3 million such variations already were known before the start of the project, Bentley said, of which 90 percent are SNPs, while the remaining 10 percent are insertions or deletions of one or more bases. He expects the number of variants known throughout the genome to increase by a factor of about 2.5 by the time that phase of the project is complete.
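
For readers who want the numbers spelled out, Bentley's approximate figures work out as follows. This is only a back-of-the-envelope sketch: the inputs are the rounded estimates quoted above, and the variable names are chosen here for illustration.

```python
# Rounded estimates quoted by Bentley; all names are illustrative.
known_variants = 3_000_000       # variations known before the project began
snp_percent = 90                 # share of those that are SNPs
growth_numer, growth_denom = 5, 2  # expected increase by a factor of about 2.5

# Integer arithmetic keeps the counts exact.
snps = known_variants * snp_percent // 100
indels = known_variants - snps   # insertions or deletions of one or more bases
projected_variants = known_variants * growth_numer // growth_denom

print(f"SNPs known at the start:   {snps:,}")
print(f"Insertions/deletions:      {indels:,}")
print(f"Projected known variants:  {projected_variants:,}")
```

On those estimates, roughly 2.7 million of the known variations are SNPs and about 300,000 are insertions or deletions, and the total would rise to about 7.5 million by the end of that phase.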

Once the information is available, the next step will be to find out which segments of the genome are commonly inherited together. Bentley said: "Modern human populations outside Africa have relatively few common ancestors from the time that people first emerged from Africa before migrating across the globe and hugely multiplying in number. Because there simply has not been long enough for the segments of the ancestral chromosomes to be completely jumbled up by recombination, segments of the chromosomes of one or other of these relatively few ancestors reside in every one of our genomes today."

Once the consortium has identified the segments of ancestral origin, its members will pick a minimum number of SNPs that represent each of the segments. Researchers in the future will need to use only those "tag SNPs" to identify which segments their subjects have inherited in order to know which haplotypes those people have. They will be able to make the identification using genotyping assays, which the HapMap Project will develop and assess. That type of test, which allows scientists to pinpoint which base occurs at a particular point in the genome, is much cheaper than sequencing large pieces of DNA.
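
The idea of tag SNPs can be illustrated with a small sketch. It is not the consortium's method: it assumes a simple greedy strategy, made-up haplotypes written as strings with one letter per SNP position, and hypothetical function names (`pick_tag_snps`, `haplotype_from_tags`).

```python
from itertools import combinations


def pick_tag_snps(haplotypes):
    """Greedily choose SNP positions that tell all distinct haplotypes apart.

    Assumes every haplotype string has the same length (one letter per SNP).
    """
    distinct = sorted(set(haplotypes))
    n_snps = len(distinct[0])
    # Pairs of haplotypes that the chosen tags cannot yet distinguish.
    unresolved = set(combinations(range(len(distinct)), 2))
    tags = []

    def separated(pos):
        # Unresolved pairs whose members carry different bases at this SNP.
        return {(a, b) for a, b in unresolved
                if distinct[a][pos] != distinct[b][pos]}

    while unresolved:
        # Take the SNP that separates the most still-unresolved pairs.
        best = max(range(n_snps), key=lambda pos: len(separated(pos)))
        tags.append(best)
        unresolved -= separated(best)
    return sorted(tags)


def haplotype_from_tags(tag_alleles, haplotypes, tags):
    """Return the one haplotype matching the genotyped tag SNPs, or None."""
    matches = [h for h in set(haplotypes)
               if all(h[pos] == base for pos, base in zip(tags, tag_alleles))]
    return matches[0] if len(matches) == 1 else None


# Toy block of six SNPs with three made-up ancestral haplotypes.
block = ["AGCTAG", "AGCTTC", "CATTAG"]
tags = pick_tag_snps(block)
print(tags)                                     # positions to genotype
print(haplotype_from_tags("CA", block, tags))   # bases read at those positions
```

In this toy block, genotyping two of the six positions is enough to say which of the three haplotypes a person carries, which is the saving the tag SNPs are meant to deliver. A greedy choice of this kind is not guaranteed to find the smallest possible set of tags.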

To make the map representative, the consortium will study samples from people in the U.S. (of Northern and Western European ancestry), Nigeria, Japan (Tokyo) and China (Beijing). Consultations have taken place with the communities involved to allay potential concerns about what the findings might show.

The consortium plans to share freely the data it generates. It also has stipulated that the combinations of genetic markers that are generated will not be patented.