Constructing Bayesian Network Models of Gene Expression Networks from Microarray Data

Abstract

Through their transcript products, genes regulate the rates at which an immense variety of transcripts and subsequent proteins occur. Understanding the mechanisms that determine which genes are expressed, and when they are expressed, is one of the keys to genetic manipulation for many purposes, including the development of new treatments for disease. If each gene in a genome is viewed as a distinct variable that is either on or off, or more realistically as a continuous variable, then the values of some of these variables influence the values of others through the regulatory proteins they express. This includes the possibility that the rate of expression of a gene at one time may, in various circumstances, influence the rate of expression of that same gene at a later time. If we imagine an arrow drawn from each gene expression variable at a given time to each gene variable whose expression it influences a short while after, the result is a network, technically a directed acyclic graph (DAG). For example, the DAG in Figure 1 represents a system in which the expression level of gene G1 at time 1 causes the expression level of G2, which in turn causes the expression level of G3. The arrows in Figure 1 that do not have a variable at their tails are “error terms,” which represent all of the causes of a variable other than the ones explicitly represented in the DAG. The DAG describes more than associations: it describes causal connections among gene expression rates. A shock to a cell (by mutation, heating, chemical treatment, etc.) may alter the DAG describing the relations among gene expressions, for example by activating a gene that was otherwise not expressed, producing a cascade of new expression effects.
Although “knockout” experiments can reveal some of the underlying causal network of gene expression levels, such experiments, unless guided by information from other sources, are limited in how much of the network structure they can reveal, because of the sheer number of combinations of experimental manipulations of genes needed to reveal the complete causal network. Recent developments have made it possible to compare quantitatively the expression of tens of thousands of genes in cells from different sources in a single experiment, and to trace gene expression over time in thousands of genes simultaneously. cDNA microarrays are already producing extensive data, much of it available on the web. Thus there are calls for analytic software that can be applied to microarray and other data to help infer regulatory networks. In this paper we review current techniques for searching for the causal relations between variables, describe algorithmic and data-gathering obstacles to applying these techniques to gene expression levels, and describe the prospects for overcoming these obstacles.
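
The three-gene chain described in the abstract (G1 causes G2, which causes G3, each with its own error term) can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the authors' method: the linear form, the Gaussian error terms, the coefficients of 0.8, and the function names `simulate_chain`, `corr`, and `partial_corr` are all hypothetical choices made for illustration. It shows the sense in which a DAG says more than associations: G1 and G3 are correlated in observational data, the correlation vanishes once G2 is controlled for, and it also vanishes when G2 is held at zero, as in an idealized knockout.

```python
import math
import random


def simulate_chain(n, a=0.8, b=0.8, knockout_g2=False, seed=0):
    """Draw n samples from a linear chain G1 -> G2 -> G3.

    The linear form, the coefficients a and b, and the unit-variance
    Gaussian error terms are illustrative assumptions.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        g1 = rng.gauss(0.0, 1.0)  # G1 has no parent: error term only
        if knockout_g2:
            g2 = 0.0  # idealized knockout: G2 no longer responds to G1
        else:
            g2 = a * g1 + rng.gauss(0.0, 1.0)
        g3 = b * g2 + rng.gauss(0.0, 1.0)
        samples.append((g1, g2, g3))
    return samples


def corr(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """First-order partial correlation of xs and ys given zs."""
    rxy, rxz, ryz = corr(xs, ys), corr(xs, zs), corr(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Observational data: G1 and G3 are associated through G2.
g1, g2, g3 = zip(*simulate_chain(5000))
print("corr(G1, G3):", round(corr(g1, g3), 3))
print("partial corr(G1, G3 | G2):", round(partial_corr(g1, g3, g2), 3))

# Idealized knockout of G2: the G1-G3 association disappears.
k1, _, k3 = zip(*simulate_chain(5000, knockout_g2=True))
print("corr(G1, G3) with G2 knocked out:", round(corr(k1, k3), 3))
```

A vanishing partial correlation is the kind of conditional independence pattern that search procedures for causal structure typically look for, although with thousands of genes and few samples such tests become far less reliable than in this three-variable toy case.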


Analytics

Added to PP
2014-04-05


Author Profiles

Clark Glymour
Carnegie Mellon University
Richard Scheines
Carnegie Mellon University
