Seminars in Hematology
Volume 45, Issue 3, Pages 196-204, July 2008

Interpretation of Genomic Data: Questions and Answers

  • Richard Simon

      Affiliations

    • Address correspondence to Richard Simon, DSc, Biometric Research Branch, National Cancer Institute, 9000 Rockville Pike, MSC 7434, Bethesda, MD 20892-7434.

PII: S0037-1963(08)00068-1

doi: 10.1053/j.seminhematol.2008.04.008
