HILTON HEAD ISLAND, S.C. _ Since 1994, when Merck & Co. startled the pharmaceutical industry by giving gene sequence information free of proprietary restraints to public data banks, the company and its partner, Washington University, of St. Louis, have generated partial sequences for more than 50,000 human genes.
As of last week, the Merck-Washington University collaboration had deposited 366,592 expressed gene sequences, representing 50,605 human genes.
By the end of this year, when the project is expected to conclude, it will have contributed in two years more than 400,000 gene sequences to the public data bases for study by scientists worldwide.
The sequences represent raw data about human genes and require additional research to define their biology. Only about 4,000 human genes in the public data bases have been characterized for function.
An update of the Merck-Washington University project, which is called the Merck Gene Index, was presented in a poster exhibit at the Eighth International Genome Sequencing and Analysis Conference here.
The meeting was hosted by The Institute for Genomic Research (TIGR), of Rockville, Md., a not-for-profit organization supported financially by Human Genome Sciences Inc., of Rockville, which benefits from TIGR's work.
When Merck said it intended to give its genetic information immediately over to the public, some observers suggested it would diminish the commercial value of the proprietary gene sequence data banks created by Human Genome Sciences and Incyte Pharmaceuticals Inc., of Palo Alto, Calif. _ the two major firms engaged in sequencing expressed human genes for drug discovery.
However, Human Genome Sciences and its major collaborator, London-based SmithKline Beecham plc, recently have negotiated deals expanding access to the genetic data.
And Incyte, which sells non-exclusive access to its data bases, so far has signed up 10 pharmaceutical company subscribers.
The gene sequence information from Merck, of Whitehouse Station, N.J., and Washington University is deposited in GenBank, which is run by the National Center for Biotechnology Information within the National Institutes of Health in Bethesda, Md.
GenBank, together with the European Bioinformatics Institute in the U.K. and the DNA Data Bank in Japan, is the largest repository of publicly available genetic data.
A paper in the September issue of the journal Genome Research, published by Cold Spring Harbor Laboratory Press, in Cold Spring Harbor, N.Y., reported that the Merck-Washington University project is responsible for generating more than 75 percent of the human gene sequences accessible in the public data bases.
At the four-day TIGR conference, which concluded Tuesday, the institute's director, Craig Venter, said a total of about 500,000 expressed human gene sequences have been generated, representing 80,000 human genes.
Estimates of the number of human genes usually settle around 100,000, but some researchers, including those at Incyte, contend the genome may contain upwards of 150,000.
In the next decade, Venter said, as more human genes and those of other organisms _ from the mouse to various microbes _ are sequenced, researchers will have an estimated 400,000 to 500,000 new genes to compare and characterize in efforts to understand human evolution and the mechanisms of diseases.
Venter and his colleagues observed that the challenge of discovering the biological function of genes and proteins within the context of the whole genome has just begun.
The explosion of data and the need to characterize the genetic material quickly have forced researchers to invent their own automated tools.
For example, Ronald Davis, a professor in Stanford University's Department of Biochemistry in Palo Alto, Calif., has pioneered development of what he emphasizes is a cost-effective, high-throughput method for monitoring gene expression on a genome-wide basis.
-- Charles Craig
(c) 1997 American Health Consultants. All rights reserved.