Notes: Reconstructing True Transcription Factor Activities

 

Overview:

Transcription factors are subject to extensive post-translational modification. As a consequence, mRNA concentrations, as measured for example by conventional microarray experiments, cannot serve as reliable proxies for the transcription factor activities (TFAs). It is nevertheless possible to estimate the TFAs from microarray data (i.e. to correct the expression data) by combining them in a suitable fashion with external connectivity information, such as that obtained from ChIP experiments.

This approach was pioneered by the group of J. C. Liao at UCLA, who proposed Network Component Analysis (NCA) in a series of papers:

  1. Y.-L. Yang et al., J. C. Liao. 2005. Inferring yeast cell cycle regulators and interactions using transcription factor activities. BMC Genomics 6:90.
  2. L. M. Tran et al., J. C. Liao. 2005. gNCA: A framework for determining transcription factor activity based on transcriptome: identifiability and numerical implementation. Metab. Engin. 7:128-141.
  3. R. Boscolo, C. Sabatti, J. C. Liao, and V. P. Roychowdhury. 2005. Reconstructing hidden regulatory layers by network component analysis: theory and applications. IEEE Trans. Comput. Biol. Bioinf., in press.
  4. K. C. Kao et al., J. C. Liao. 2004. Transcriptome-based determination of multiple transcription regulator activities in Escherichia coli by using network component analysis. PNAS 101: 641-646.
  5. J. C. Liao, et al. 2003. Network component analysis: reconstruction of regulatory signals in biological systems. PNAS 100:15522-15527.

 

Statistical Approaches to NCA:

The original NCA algorithm for inferring the TFAs is rather ad hoc and does not account for stochasticity, so several groups have attempted to cast the method into a more statistical framework:

  1. F. Gao, B. C. Foat, and H. J. Bussemaker. 2004. Defining transcriptional networks through integrative modeling of mRNA expression and transcription factor binding data. BMC Bioinformatics 5:31.
  2. C. Sabatti and G. James. 2006. Bayesian sparse hidden components analysis for transcription regulation networks. Bioinformatics, in press. (UCLA Statistics Preprint #414)
  3. A.-L. Boulesteix and K. Strimmer. 2005. Predicting transcription factor activities from combined analysis of microarray and ChIP data: a partial least squares approach. Theor. Biol. Med. Model. 2: 23. (preprint)

Gao et al. (2004) suggest using linear regression with stepwise model selection. Sabatti and James (2006) offer a Bayesian approach to NCA. Our suggestion (Boulesteix and Strimmer 2005) is to use partial least squares regression to solve the NCA problem and estimate the TFAs (please refer to the paper for details and a comparison of the above approaches).
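
To make the regression idea concrete, here is a minimal sketch that assumes the simplest possible setting: log expression is modelled as connectivity times TFA, and the TFAs are estimated by ordinary least squares for each sample. This is only an illustration, not the procedure of any of the papers above (it has no stepwise model selection, no Bayesian prior and no partial least squares step). The function name, the variable names and the simulated data are all hypothetical.

```python
import numpy as np


def estimate_tfa_least_squares(expression, connectivity):
    """Ordinary least-squares estimate of TFAs (illustrative only).

    expression   : genes x samples matrix of (log) expression values
    connectivity : genes x TFs matrix of connectivity information
                   (e.g. derived from ChIP data)
    returns      : TFs x samples matrix of estimated activities
    """
    # Solves connectivity @ tfa ~= expression for all samples at once.
    tfa, _, _, _ = np.linalg.lstsq(connectivity, expression, rcond=None)
    return tfa


# Simulated toy data (an assumption, not real measurements).
rng = np.random.default_rng(0)
n_genes, n_tfs, n_samples = 50, 3, 8

# Binary connectivity: 1 if a TF is taken to regulate a gene, else 0.
connectivity = (rng.random((n_genes, n_tfs)) < 0.3).astype(float)

# Hidden activities and the noisy expression data they generate.
true_tfa = rng.normal(size=(n_tfs, n_samples))
noise = 0.05 * rng.normal(size=(n_genes, n_samples))
expression = connectivity @ true_tfa + noise

estimated_tfa = estimate_tfa_least_squares(expression, connectivity)
print(estimated_tfa.shape)
```

With many transcription factors and few samples, or with strongly correlated connectivity columns, plain least squares becomes unstable, which is one motivation for the model selection, Bayesian and partial least squares variants cited above.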

 

Further Related References:

  1. Z. Li and C. Chan. 2004. Extracting novel information from gene expression data. Trends in Biotech. 22:381-383.
  2. D. F. Simola. 2004. Prediction of transcription factor gene expression in Saccharomyces cerevisiae. Technical report, Univ. Pennsylvania.
  3. O. Alter and G. H. Golub. 2004. Integrative analysis of genome-scale data by using pseudoinverse projection predicts novel correlation between DNA replication and RNA transcription. PNAS 101:16577-16582.

 

Please drop me a line (korbinian.strimmer@lmu.de) with suggestions and comments.

 

Last modified:
July 21, 2005
