Investigators at Edinburgh University have discovered that the stem cell transcription factor SALL4 (SAL-like 4) can recognize and bind short AT-rich motifs across the genome and downregulate genes located in AT-rich regions. SALL4 is a multi-zinc-finger protein that restrains differentiation of pluripotent embryonic stem cells (ESCs) and participates in several physiological processes, including neuronal development. The team reported its results in the January 5, 2021, online issue of Molecular Cell.

Senior author Adrian Bird, professor of genetics at Edinburgh University, told BioWorld Science that "the study was born out of an idea that proteins which recognize short DNA sequences might regulate gene expression by changing the epigenome. And we had this thought that maybe such proteins could read base composition, which is known to vary in domains across the genome, and our DNA pull-down screens in mouse ESCs identified SALL4."

Bird said that although it had long been known that the genome is organized into regions of relatively homogeneous base composition, no one was sure whether this was an accidental byproduct of evolution or had some functional relevance. "Our work is the first study to suggest that base composition can actually be used as a signal to regulate gene expression."

Bird and his team showed that SALL4 targets a broad range of AT-rich motifs via the zinc-finger cluster ZFC4. Bird said that "while the C-terminal ZFC4 domain binding to AT-rich repetitive DNA has been reported previously, its biological significance was unknown. Our study demonstrates that ZFC4 is a key domain mediating SALL4 biological function in mouse ESCs. So, in the absence of that domain, but in the presence of the rest of the protein, you get precocious differentiation of stem cells toward the neuronal lineage." AT-rich genes that are repressed by SALL4 in ESCs are activated soon after loss of pluripotency. In addition to ZFC4, SALL4 contains two other C2H2 (Cys2-His2) zinc-finger clusters, ZFC1 and ZFC2, both of which were found to be dispensable for SALL4 function in mouse ESCs.

SALL4 is downregulated in most adult tissues except germ cells. However, it is known to be highly expressed in several cancers, where it contributes to advanced, higher-grade, and resistant disease. SALL4 upregulation may therefore be a link between pluripotency and cancer, and because the protein is largely absent from adult tissues, it could potentially be targeted therapeutically with limited side effects.

In a study published in Cell Reports on January 6, Daniel Tenen and Li Chai also show that SALL4 binds an AT-rich DNA sequence through ZFC4. SALL4 was seen to bind and repress members of the histone lysine demethylase (KDM) family of genes, resulting in changes in the methylation status of H3K9 and chromatin. Tenen and Chai also identified novel SALL4 targets in liver cancer that included chromatin modifiers like KDM3A and other transcription factors like Forkhead (FOXO and FOXA1) and KLF.

SALL4 knockdown increased KDM3A, demonstrating the potential for SALL4 to act as a regulator of the global chromatin landscape. Chai is a transfusion medicine specialist and associate professor of pathology at Harvard Medical School, while Tenen is a professor of medicine at Harvard Medical School and director of the Cancer Science Institute of Singapore. They have studied the SALL4 gene for some time and have identified several SALL4-interacting proteins as well as pharmacological inhibitors of SALL4.

Bird said that "the Cell Reports paper agrees very well with our finding that SALL4 acts as a transcriptional regulator via DNA binding to AT-rich motifs. Obviously, cancer cells are dependent for their survival on the fact that they can avoid terminal differentiation and SALL4 drives the pluripotent phenotype."

Commenting on Tenen's study, Bird further said that "the Cell Reports study confirmed that AT-rich motif recognition was essential for maintaining the cancerous cells. These findings prove our hypothesis that base compositional domains are not merely a biologically irrelevant byproduct of genome evolution but are biochemically relevant."

He added that there were probably several other proteins that acted in a similar manner and that their study showed that "regulation of gene expression by conventional transcription factors which bind to promoters and enhancers may actually depend on binding throughout the genome creating an environment surrounding a genetic locus that encourages or discourages gene expression."

Bird now plans to expand the scope of his work to see how SALL4 functions in cells other than ESCs, and to look for other proteins that can interpret differential DNA base composition and that "instead of stabilizing pluripotency may stabilize differentiation."