By David N. Leff

Lung cancer is the leading cause of cancer death worldwide, for both men and women. It kills more people than the next five most common malignancies combined.

Two distinct cancer-causing neoplasms assail the lungs: squamous-cell carcinoma, which burrows into the central interior of the lung, and adenocarcinoma, which attacks the pulmonary periphery. Squamous-cell tumors are associated with tobacco smoking, probably reflecting a response by the bronchial epithelium to carcinogenic insults from the environment. In contrast, lung adenocarcinomas are associated with immune-defense responses in the smallest airways of the respiratory system. But in a lung cancer patient, these twin perpetrators are difficult to distinguish and diagnose.

In a collaborative study enlisting government, academia and industry, researchers announced that they were able to distinguish the patterns of gene expression between the squamous and adeno types of lung cancer. Their report appears in the Proceedings of the National Academy of Sciences (PNAS), dated Dec. 18, 2001. The paper's title: "Molecular characteristics of non-small-cell lung cancer." Its senior author is cancer geneticist Jin Jen, a principal investigator in the National Cancer Institute's Laboratory of Population Genetics in Bethesda, Md.

Medical oncologist David Sidransky at Johns Hopkins Medical School in Baltimore is a co-author. "The important thing about this PNAS paper," he told BioWorld Today, "is that it really subdivides lung cancer, from a biological point of view, into something that we have clinically recognized for many years. Therefore, it obviously provides a biologic basis for the differences we've seen both clinically and pathologically. Its novelty, of course," Sidransky added, "is using SAGE expression analysis to look at a whole array of gene sequences out of the one or two genes that may differ between the two tumor types."

Jin Jen explained SAGE, or Serial Analysis of Gene Expression:

"SAGE is a way to identify and quantitate short segments of transcripts, the DNA-to-messenger RNA, en route to making a protein. SAGE allowed us to quantitate the message in the tumor cell's genome, in the clinical system of our interest. It did so by isolating the short fragment, about 10 to 15 base pairs, from each of the transcripts. Because some genes are very large, if we wanted to analyze each of them, it would have taken a lot of time sequencing. When we had the transcripts cut into pieces, we ligated them together to form a long chain. Then we looked for varying fingerprints that could trace back to the original genes."
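The tag-and-chain idea Jin Jen describes can be sketched in a few lines of Python. This is a simplified illustration, not the laboratory protocol: it assumes each tag is the 10 bases following the last CATG site in a transcript (CATG being the recognition site of the anchoring enzyme commonly used in SAGE), it skips the pairing of tags into ditags, and the two transcript sequences are invented.

```python
from collections import Counter

ANCHOR = "CATG"  # assumption: recognition site of the anchoring enzyme
TAG_LEN = 10     # assumption: low end of the 10-to-15 range quoted above


def extract_tag(transcript):
    """Return the TAG_LEN bases after the last anchor site, or None."""
    pos = transcript.rfind(ANCHOR)
    if pos == -1:
        return None
    start = pos + len(ANCHOR)
    tag = transcript[start:start + TAG_LEN]
    return tag if len(tag) == TAG_LEN else None


def build_chain(tags):
    """Ligate tags into one long chain, each preceded by the anchor site."""
    return "".join(ANCHOR + tag for tag in tags) + ANCHOR


def read_chain(chain):
    """Walk the chain in fixed steps and recover the individual tags."""
    step = len(ANCHOR) + TAG_LEN
    return [chain[i + len(ANCHOR):i + step]
            for i in range(0, len(chain) - len(ANCHOR), step)]


# Hypothetical transcripts; the sequences are invented for illustration.
TRANSCRIPTS = {
    "gene_A": "GGCTACATGTTACCGGATAGCAAAAAA",
    "gene_B": "ATCATGCCCGGGTTTAAACATGGACTTCAGGTAAAA",
}
# Hypothetical sample: three copies of the gene_A message, one of gene_B.
sample = ["gene_A", "gene_A", "gene_B", "gene_A"]

# Lookup table that traces each tag back to its gene of origin.
tag_to_gene = {extract_tag(seq): gene for gene, seq in TRANSCRIPTS.items()}

# One sequencing read of the chain yields many tags at once.
chain = build_chain(extract_tag(TRANSCRIPTS[g]) for g in sample)
counts = Counter(tag_to_gene[tag] for tag in read_chain(chain))
print(counts)
```

Sequencing one chain in place of many full-length transcripts is what saves the sequencing time she mentions; the number of times a tag turns up stands in for the abundance of its message.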

SAGE For Gene Discovery, Expression, Comparison

"We did SAGE," Jin Jen continued, "using different lung cancer tissues, and normal ones. All together, we analyzed nine different samples, which generated nine libraries. Each is a summary or collection of all these little SAGE tags, isolated from the transcripts, each of which had one tag.

"We did that on nine different tissues," she recounted, "four of which were from the primary tumor: two adeno, two squamous. Then we compared these lesions with four healthy tissues. From these specimens, we generated nine SAGE tissue culture libraries, and sequenced them to find out how many different tags were in each library. So gene A may have five tags in tumor A, 10 in tumor B in the squamous-cell category, 50 in adenocarcinoma tumors, and zero in the normal. This summary recovered about 20,000 different genes."
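The per-library tally in her gene A example can be laid out as a small count table. The Python sketch below reuses her figures for gene A (5, 10, 50 and zero); the library names, the second gene and the library sizes are hypothetical, and scaling counts to tags per 1,000 is an assumed convenience for comparing libraries of unequal size, not a step described in the article.

```python
# Hypothetical tag counts per library. Only the gene_A figures come from
# the example quoted above; gene_B and the totals are invented.
LIBRARIES = {
    "squamous_A": {"gene_A": 5, "gene_B": 45},
    "squamous_B": {"gene_A": 10, "gene_B": 40},
    "adeno": {"gene_A": 50, "gene_B": 50},
    "normal": {"gene_B": 50},
}


def count_matrix(libraries):
    """Return gene -> list of raw counts, one entry per library, in order."""
    genes = sorted({gene for counts in libraries.values() for gene in counts})
    return {gene: [counts.get(gene, 0) for counts in libraries.values()]
            for gene in genes}


def normalize(libraries, scale=1000):
    """Return gene -> counts rescaled to tags per `scale` tags sequenced."""
    totals = [sum(counts.values()) for counts in libraries.values()]
    return {gene: [count * scale / total for count, total in zip(row, totals)]
            for gene, row in count_matrix(libraries).items()}


matrix = count_matrix(LIBRARIES)
scaled = normalize(LIBRARIES)
for gene in matrix:
    print(gene, matrix[gene], scaled[gene])
```

A gene missing from a library simply scores zero there, which is how "zero in the normal" appears in the table.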

SAGE provider Genzyme Molecular Oncology, of Framingham, Mass., Jin Jen observed, "is one of our collaborators. They did the sequencing. We provided the tissues, made the RNA, and they prepared and sequenced the libraries, and made the tags.

"Then we systematically analyzed that whole list of 20,000 genes, using hierarchical clustering. This is a logical way of organizing the tissue samples and genes, based on their difference of expression. It looks mathematically at every gene across each of the library samples, then summarizes their expression pattern.

"When we used this clustering the first time, 4,000 genes came up. Wow! They were telling us distinctively that the normal tissues had a different expression pattern than tumor tissues. Subsequently we said: '4,000 doesn't save anybody's life.' So then we used a statistical method to discover that 115 genes could differentiate between the two tumor types and healthy tissue."
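For readers unfamiliar with hierarchical clustering, here is a minimal Python sketch of the idea. The article does not say which distance measure or linkage rule the team used, so the choices here (one minus the Pearson correlation, average linkage) are assumptions, and the six expression profiles are invented. The statistical step that narrowed the list to 115 genes is not reproduced.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def distance(x, y):
    """Assumed dissimilarity: 0 for identical patterns, up to 2 for opposite."""
    return 1.0 - pearson(x, y)


def cluster(profiles):
    """Average-linkage agglomerative clustering; returns the merge history."""
    clusters = {name: [name] for name in profiles}  # label -> member samples
    merges = []
    while len(clusters) > 1:
        best = None
        labels = list(clusters)
        for i, a in enumerate(labels):
            for b in labels[i + 1:]:
                pairs = [(p, q) for p in clusters[a] for q in clusters[b]]
                d = sum(distance(profiles[p], profiles[q])
                        for p, q in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        # Fuse the two closest clusters under a combined label.
        clusters["(" + a + "+" + b + ")"] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, round(d, 3)))
    return merges


# Hypothetical tag counts for six genes in six samples.
PROFILES = {
    "normal_1": [50, 48, 5, 4, 6, 5],
    "normal_2": [52, 45, 6, 5, 4, 6],
    "squamous_1": [5, 6, 60, 55, 4, 5],
    "squamous_2": [4, 5, 58, 60, 6, 4],
    "adeno_1": [6, 4, 5, 6, 62, 57],
    "adeno_2": [5, 6, 4, 5, 55, 60],
}

merges = cluster(PROFILES)
for a, b, d in merges:
    print(a, "joins", b, "at distance", d)
```

With these toy profiles, samples of the same tissue type pair up before anything else does, which is the kind of separation between normal and tumor tissue that the quote describes.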

"What this does," Sidransky observed, "is set the framework for identifying a pattern of genes that could eventually be used in diagnostic or therapeutic approaches. What we're talking about here is the basic discovery area, which needs to be followed up by development, and identifying a panel of markers for diagnosis, or a pathway for prognosis and therapy."

Revolutionizing Diagnosis, Prognosis, Therapy

"I think we are still at an early stage," Jin Jen resumed, "correlating the biology, which is at the level of gene expression, with clinical phenotypes. Ultimately, perhaps in the next five years, we'll be finding out that diagnosis of lung cancers can no longer be limited to 'squamous,' 'adeno' or 'poorly differentiated.' We will come up with another panel of gene expression markers, which will contribute to how well the patient will do in the clinical setting. It's still too soon right now. We need clinical data to tell us, 'This patient has this clinical outcome, based on his or her gene expression level. And that other patient has that clinical outcome based on the gene expression level.'"

"That will require much more extensive research," she pointed out, "but the study we did now provides a picture that this is possible. You need just a very few genes, which are able to tell you the tumor cell distinction. So far, we know genetically that adeno and squamous have different gene mutations. Clinically, they are different in their histology, location, presentation. Meanwhile, lung cancer unfortunately is one of the malignancies that is not quite treatable as yet. So in terms of prognosis, they are all very poor. Right now they both have the same type of survival.

"Next, we would like to conduct a systematic study using several hundred patients. That will allow us to correlate genotypes with clinical behavior. In the next five years, maximally 10 to be safe, that clinical diagnosis will be revolutionized," she predicted, "and I think that's a message to take home."