When regular celebrities leak personal information to the media, those leaks are usually accompanied by strenuous denials and protestations about purported privacy violations. But Craig Venter is showing how it's done in science. In the scientific equivalent of a tell-all biography, his individual genome - complete with a poster - has been published in the Sept. 4, 2007, issue of PLoS Biology.
Two draft sequences of the human genome were first published in 2001, one by a public consortium led by National Human Genome Research Institute Director Francis Collins and one by biotech company Celera Genomics, where Venter was president and chief scientific officer at the time.
But those sequences were so-called consensus sequences, based on roughly 10 individuals in the case of the public consortium sequence and five individuals for the Celera version. Eric Lander, one of the leaders of the public project, said at the time that "while a number of different people were sequenced, the majority contributor from the international human project was an anonymous guy from Buffalo, N.Y." Venter's genome, meanwhile, contributed about 60 percent of Celera's sequence.
In their PLoS Biology paper, the scientists - who are from the Hospital for Sick Children, the University of California at San Diego and the Universitat de Barcelona, in addition to the J. Craig Venter Institute - used a combination of sequencing methods to sequence both copies of Venter's chromosomes. Although the method does not completely separate the maternal and paternal chromosomes, the stretches that come purely from one parent or the other are hundreds of thousands of base pairs long, which allows the maternal and paternal chromosomes to be compared.
The researchers reported that comparing the maternal and paternal chromosomes with each other, and comparing the genome with Celera's and the public consortium's consensus sequences, showed that humans are more genetically variable than either consensus sequence had suggested. From their data, the researchers estimate that one in 200 DNA bases differs between individual genomes. Previous estimates had put the difference closer to one in a thousand. Almost half of the genes in Venter's genome have at least one nucleotide that differs between the maternal and paternal copies.
The diploid sequence also showed that types of genetic variation other than single nucleotide polymorphisms, or SNPs, which have received less research attention than SNPs to date, account for more genetic variation than has been recognized. In fact, though SNPs outnumber other kinds of variants, the majority of base pair variation does not come from SNPs, because variations like insertions and deletions can be several nucleotides long.
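The arithmetic behind that last point can be sketched in a few lines of Python. The counts and lengths below are hypothetical round numbers chosen only to illustrate the effect; they are not figures from the PLoS Biology paper, and the names `VariantClass`, `share_by_count` and `share_by_bases` are invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class VariantClass:
    name: str
    count: int             # number of variant events of this kind
    mean_length_bp: float  # average number of bases each event affects


def share_by_count(classes):
    """Fraction of all variant events contributed by each class."""
    total = sum(c.count for c in classes)
    return {c.name: c.count / total for c in classes}


def share_by_bases(classes):
    """Fraction of all variant bases contributed by each class."""
    bases = {c.name: c.count * c.mean_length_bp for c in classes}
    total = sum(bases.values())
    return {name: b / total for name, b in bases.items()}


# Hypothetical illustration only: SNPs touch one base each, while
# insertions and deletions are assumed to average five bases.
classes = [
    VariantClass("SNP", count=3000, mean_length_bp=1.0),
    VariantClass("indel", count=1000, mean_length_bp=5.0),
]

by_count = share_by_count(classes)
by_bases = share_by_bases(classes)

for name in ("SNP", "indel"):
    print(f"{name}: {by_count[name]:.1%} of events, "
          f"{by_bases[name]:.1%} of variant bases")
```

With these made-up inputs, SNPs account for three-quarters of the variant events but fewer than half of the variant bases, which is the same pattern the researchers describe.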
In a video posted to the J. Craig Venter Institute web site, several reasons were given for using Venter's genome sequence, including the fact that he does not mind it being in the public domain. Venter said, "At Celera, one of the reasons I volunteered to have my genome sequenced was to show leadership in a field where everybody thought that it was a very high-risk event to have your genome sequenced. I have no difficulty having mine on the Internet, not only because it's difficult to interpret, but I think it's because there are very few incidences in human genetics where there's a direct causal event in some absolute deterministic fashion between genetics and a trait. Almost everything we have is some sort of tradeoff between genetics and the environment."
But the personal genome also may be turning into a must-have accessory in the scientific community. Double helix co-discoverer James Watson (who, like Venter, does not appear averse to the fame his discovery has brought him) was presented with his personalized genome in May. Indeed, in the video, PLoS Biology lead author Samuel Levy said that comparing the two genomes would be "a very interesting scientific endeavor."