Title: Computing and applying atomic regulons to understand gene expression and regulation
Author(s): Faria, J.
Davis, James J.
Edirisinghe, Janaka N.
Taylor, Ronald C.
Weisenhorn, Pamela
Olson, Robert D.
Stevens, Rick L.
Rocha, Miguel
Rocha, Isabel
Best, Aaron A.
DeJongh, Matthew
Tintle, Nathan L.
Parrello, Bruce
Overbeek, Ross
Henry, Christopher S.
Keywords: atomic regulon
gene expression analysis
transcriptomic data
Escherichia coli
hierarchical clustering
k-means clustering
Issue date: 24-Nov-2016
Publisher: Frontiers Media
Journal: Frontiers in Microbiology
Citation: Faria, J.; Davis, James J.; Edirisinghe, Janaka N.; Taylor, Ronald C.; Weisenhorn, Pamela; Olson, Robert D.; Stevens, Rick L.; Rocha, Miguel; Rocha, Isabel; Best, Aaron A.; DeJongh, Matthew; Tintle, Nathan L.; Parrello, Bruce; Overbeek, Ross; Henry, Christopher S., Computing and applying atomic regulons to understand gene expression and regulation. Frontiers in Microbiology, 7(1819), 1-14, 2016
Abstract(s): Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. An important step toward meeting the challenge of understanding gene function and regulation is the identification of sets of genes that are always co-expressed. These gene sets, Atomic Regulons (ARs), represent fundamental units of function within a cell and could be used to associate genes of unknown function with cellular processes and to enable rational genetic engineering of cellular systems. Here, we describe an approach for inferring ARs that leverages large-scale expression data sets, gene context, and functional relationships among genes. We computed ARs for Escherichia coli based on 907 gene expression experiments and compared our results with gene clusters produced by two prevalent data-driven methods: hierarchical clustering and k-means clustering. We compared ARs and purely data-driven gene clusters to the curated set of regulatory interactions for E. coli found in RegulonDB, showing that ARs are more consistent with gold standard regulons than are data-driven gene clusters. We further examined the consistency of ARs and data-driven gene clusters in the context of gene interactions predicted by Context Likelihood of Relatedness (CLR) analysis, finding that the ARs show better agreement with CLR predicted interactions. We determined the impact of increasing amounts of expression data on AR construction and find that while more data improve ARs, it is not necessary to use the full set of gene expression experiments available for E. coli to produce high quality ARs. In order to explore the conservation of co-regulated gene sets across different organisms, we computed ARs for Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus, each of which represents increasing degrees of phylogenetic distance from E. coli. Comparison of the organism-specific ARs showed that the consistency of AR gene membership correlates with phylogenetic distance, but there is clear variability in the regulatory networks of closely related organisms. As large scale expression data sets become increasingly common for model and non-model organisms, comparative analyses of atomic regulons will provide valuable insights into fundamental regulatory modules used across the bacterial domain.
Description: The Supplementary Material for this article can be found online at:
Access: Open access
Appears in Collections: CEB - Publicações em Revistas/Séries Internacionais / Publications in International Journals/Series

Files in This Item:
File: document_46375_1.pdf
Size: 1.65 MB
Format: Adobe PDF
