By Jim Shrine
BOSTON - The public consortium working on the Human Genome Project reached the two-thirds mark in its effort to sequence the 3 billion base pairs that make up the human genetic sequence, and said it is on target for completing a working draft in late May or early June.
The Sanger Centre in the UK last week deposited the 2 billionth base pair - a "T" - into GenBank, the public database of DNA sequences run by the National Institutes of Health. The effort has picked up considerable speed over the past year due to advances in sequencing instruments and automation techniques. The 1 billion mark was crossed Nov. 23.
"In another 15 years, we'll see the full flower of genome-based drugs," Francis Collins, director of the NIH's National Human Genome Research Institute, told reporters Wednesday at the BIO 2000 conference being held here. He predicted the major contributors to illness will be discovered in the next five to seven years, and in 10 years diagnostic applications of the research will be standard practice in medicine.
The issue of genetic discrimination must be resolved, however, before the diagnostic applications can be fully realized, he cautioned.
The 2.18 billion unique base pairs now in GenBank have been mapped to their locations on the 24 human chromosomes. The working draft of 3 billion base pairs will include 90 percent of the human DNA sequence, with an accuracy of 99.9 percent, Collins said.
The draft will have gaps, he said, but those should be filled in a "couple more years," ahead of the 2003 schedule. All goals have been reached on or ahead of schedule and, at a cost of about $250 million, under budget, Collins said.
Scientists from 16 institutions around the world deposit their work into GenBank, which is accessible via the Internet. Collins said GenBank is getting about 50,000 hits per day from public and private scientists doing basic and disease-related research.
"This is the feeder layer for the wonderful industry called biotechnology," Collins said.
The scientists are annotating their DNA sequences with information about the locations of specific genes and genetic variants, or single nucleotide polymorphisms (SNPs), to better understand specific diseases and disorders.
Collins said the public researchers continue to explore ways to cooperate with their private counterparts at Celera Genomics. That effort was set back recently by the different goals of the two groups: Celera wants to retain some control of its findings, while the Human Genome Project is designed to provide unlimited public access. However, he said, "far too much is being made" of calling this a race between private and public enterprise.
He also acknowledged that the issue of international intellectual property "is a problem of harmonization that hasn't been solved."
The consortium includes institutions in France, Germany, Japan, China, Great Britain and the U.S. Those generating the most sequences are Baylor College of Medicine in Houston; Washington University School of Medicine in St. Louis; the Whitehead Institute in Cambridge, Mass.; the Joint Genome Institute in California; and the Sanger Centre. The National Human Genome Research Institute funds the sequencing centers at Baylor, Washington University and the Whitehead Institute.