Confirmant Ltd. developed what it says is the first comprehensive database of human proteins, the Protein Atlas of the Human Genome, for use by pharmaceutical and biotechnology companies in drug discovery efforts.
“We’re devising the protein sequence information and mapping that back to the genome,” said Jonathan Sheldon, chief technology officer for Confirmant, based in Abingdon, UK.
Confirmant, a joint venture of Oxford GlycoSciences plc (OGS) and Marconi plc, is identifying protein sequence tags, or PSTs, that can be mapped back to the human genome to unambiguously define gene structure. The venture integrates Marconi’s broadband data transmission and hosting capabilities and OGS’s proteome database.
The program will be made available in June, but at that point only about 10,000 genes will have been completed. The company expects to complete the full set of known genes, about 30,000, by June 2003.
Sheldon pointed out that each gene gives rise to many different protein forms. It is estimated that each gene has five to 10 different isoforms. With Protein Atlas, researchers can understand the molecular basis of disease, he said.
The database is being presented at the Genome Tri-Conference 2002 in Santa Clara, Calif.
The company’s current business model rests on pharmaceutical and biotechnology companies gaining access to the database on a subscription basis, said David Palmer, vice president of business development for Confirmant. The model does not include internal drug discovery efforts.
“What we plan is that we’ll make the data available in a variety of formats, and [customers] will be able to incorporate that into their own bioinformatics infrastructure,” Sheldon said.
The Protein Atlas can be used to identify novel targets and biomarkers; to improve annotation to facilitate validation of existing targets; and for compound screening, positional cloning and association studies, microarray design, and protein-centric organization of biological information.
Sheldon said he’s not aware of any other company that’s taking this approach.
“There’s obviously other proteomics companies, but none that are commercializing the output the way we are,” Palmer said.
The approach does require high-throughput proteomics expertise: the information in the Protein Atlas is experimentally derived using technologies developed by Oxford GlycoSciences, of Oxford, UK.
Most of the genomic databases currently available are based on computational algorithms that both over- and under-predict genes, Sheldon said. In some cases these programs suggest the presence of genes that are not there, and in others they fail to recognize genes that are present.