Gabriel Kreiman
Identification of sparsely distributed clusters of cis-regulatory elements in sets of co-expressed genes.
Nucleic Acids Research, 32:2889-2900 (2004)
Sequence information and high-throughput methods to measure gene expression levels
open the door to explore transcriptional regulation using computational tools.
Combinatorial regulation and sparseness of regulatory elements throughout
the genome allow organisms to control the spatial and temporal patterns of
gene expression. Here we study the organization of cis-regulatory elements
in sets of co-regulated genes. We build an algorithm to search for combinations
of transcription factor binding sites that are enriched in a set of potentially
co-regulated genes with respect to the whole genome. No knowledge is assumed
about involvement of specific sets of transcription factors. Instead, the
search is exhaustively conducted over combinations of up to four binding sites
obtained from databases or motif search algorithms. We evaluate the performance
on random sets of genes as a negative control and on three biologically validated
sets of co-regulated genes in yeasts, flies and humans. We show that we can
detect DNA regions that play a role in the control of transcription. These
results shed light on the structure of transcription regulatory regions in
eukaryotes and can be directly applied to clusters of co-expressed genes obtained
in gene expression studies. Supplementary information is available at http://www.mit.edu/kreiman/resources/cisregul/.
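
The search described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the input layout (a mapping from each gene to the set of motif names found in its regulatory region), the hypergeometric enrichment test, and the `alpha` threshold are all assumptions made for illustration. The sketch also ignores binding-site positions and spacing and applies no multiple-testing correction.

```python
from itertools import combinations
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the combination."""
    total = comb(N, n)
    upper = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enriched_combinations(gene_motifs, gene_set, max_size=4, alpha=0.01):
    """Exhaustively test motif combinations of size 1..max_size for enrichment.

    gene_motifs: dict mapping every gene in the genome to a set of motif names
                 (hypothetical input format, assumed for this sketch).
    gene_set:    iterable of genes in the candidate co-regulated set.
    Returns (combination, hits_in_set, hits_in_genome, p_value) tuples
    sorted by p-value.
    """
    genome = set(gene_motifs)
    targets = genome & set(gene_set)
    # Only motifs that occur in the gene set can form an enriched combination.
    motifs = sorted(set().union(*(gene_motifs[g] for g in targets)))
    results = []
    for size in range(1, max_size + 1):
        for combo in combinations(motifs, size):
            needed = set(combo)
            in_genome = sum(1 for g in genome if needed <= gene_motifs[g])
            in_set = sum(1 for g in targets if needed <= gene_motifs[g])
            if in_set == 0:
                continue
            p = hypergeom_tail(in_set, len(targets), in_genome, len(genome))
            if p <= alpha:
                results.append((combo, in_set, in_genome, p))
    return sorted(results, key=lambda r: r[3])
```

Calling `enriched_combinations(gene_motifs, gene_set)` on a random gene set of the same size gives a rough negative control in the spirit of the evaluation described above. The number of combinations grows quickly with the number of motifs, which is one reason to cap `max_size` at four.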