Seminars in Hematology
Volume 45, Issue 3, Pages 135-140, July 2008

A Dirty Dozen: Twelve P-Value Misconceptions

  • Steven Goodman

    • Address correspondence to Steven Goodman, MD, MHS, PhD, 550 N Broadway, Suite 1103, Baltimore, MD 21205.

References 

  1. Garcia-Berthou E, Alcaraz C. Incongruence between test statistics and P values in medical papers. BMC Med Res Methodol. 2004;4:13
  2. Andersen B. Methodological Errors in Medical Research. Oxford, UK: Blackwell Science; 1990
  3. Windish DM, Huot SJ, Green ML. Medicine residents' understanding of the biostatistics and results in the medical literature. JAMA. 2007;298:1010–1022
  4. Berkson J. Tests of significance considered as evidence. J Am Stat Assoc. 1942;37:325–335
  5. Mainland D. The significance of “nonsignificance.” Clin Pharmacol Ther. 1963;5:580–586
  6. Mainland D. Statistical ritual in clinical journals: Is there a cure? —I. Br Med J. 1984;288:841–843
  7. Edwards W, Lindman H, Savage LJ. Bayesian statistical inference for psychological research. Psych Rev. 1963;70:193–242
  8. Diamond GA, Forrester JS. Clinical trials and statistical verdicts: Probable grounds for appeal. Ann Intern Med. 1983;98:385–394
  9. Feinstein AR. P-values and confidence intervals: Two sides of the same unsatisfactory coin. J Clin Epidemiol. 1998;51:355–360
  10. Feinstein AR. Clinical biostatistics (XXXIV. The other side of ‘statistical significance’: Alpha, beta, delta, and the calculation of sample size). Clin Pharmacol Ther. 1975;18:491–505
  11. Rothman K. Significance questing. Ann Intern Med. 1986;105:445–447
  12. Pharoah P. How not to interpret a P value? J Natl Cancer Inst. 2007;99:332–333
  13. Goodman SN, Royall R. Evidence and scientific research. Am J Public Health. 1988;78:1568–1574
  14. Braitman L. Confidence intervals extract clinically useful information from data. Ann Intern Med. 1988;108:296–298
  15. Goodman SN. Towards evidence-based medical statistics, I: The P-value fallacy. Ann Intern Med. 1999;130:995–1004
  16. Goodman SN. P-values, hypothesis tests and likelihood: Implications for epidemiology of a neglected historical debate. Am J Epidemiol. 1993;137:485–496
  17. Fisher RA. Statistical Methods for Research Workers. Oxford, UK: Oxford University Press; 1958
  18. Royall R. Statistical Evidence: A Likelihood Paradigm. London, UK: Chapman & Hall; 1997
  19. Gigerenzer G, Swijtink Z, Porter T, Daston L, Beatty J, Kruger L. The Empire of Chance. Cambridge, UK: Cambridge University Press; 1989
  20. Lehmann EL. The Fisher, Neyman-Pearson theories of testing hypotheses: One theory or two? J Am Stat Assoc. 1993;88:1242–1249
  21. Lilford RJ, Braunholtz D. For debate: The statistical basis of public policy: A paradigm shift is overdue. BMJ. 1996;313:603–607
  22. Greenland S. Bayesian perspectives for epidemiological research: I (Foundations and basic methods). Int J Epidemiol. 2006;35:765–775
  23. Greenland S. Randomization, statistics, and causal inference. Epidemiology. 1990;1:421–429
  24. Goodman SN. Towards evidence-based medical statistics, II: The Bayes' factor. Ann Intern Med. 1999;130:1005–1013
  25. Rothman KJ. A show of confidence. N Engl J Med. 1978;299:1362–1363
  26. Gardner MJ, Altman DG. Confidence intervals rather than P values: Estimation rather than hypothesis testing. Br Med J. 1986;292:746–750
  27. Simon R. Confidence intervals for reporting results of clinical trials. Ann Intern Med. 1986;105:429–435
  28. Goodman SN. Introduction to Bayesian methods I: Measuring the strength of evidence. Clin Trials. 2005;2:282–290
  29. Berry DA. Interim analyses in clinical trials: Classical vs. Bayesian approaches. Stat Med. 1985;4:521–526
  30. Berger JO, Berry DA. Statistical analysis and the illusion of objectivity. Am Sci. 1988;76:159–165
  31. Ioannidis JP. Why most published research findings are false. PLoS Med. 2005;2:e124
  32. Ioannidis JP. Genetic associations: False or true? Trends Mol Med. 2003;9:135–138
  33. Goodman SN. One or two-sided P-values? Control Clin Trials. 1988;9:387–388
  34. Bland J, Altman D. One and two sided tests of significance. BMJ. 1994;309:248
  35. Boissel JP. Some thoughts on two-tailed tests (and two-sided designs). Control Clin Trials. 1988;9:385–386 (letter)
  36. Peace KE. Some thoughts on one-tailed tests. Biometrics. 1988;44:911–912 (letter)
  37. Fleiss JL. One-tailed versus two-tailed tests: Rebuttal. Control Clin Trials. 1989;10:227–228 (letter)
  38. Knottnerus JA, Bouter LM. The ethics of sample size: Two-sided testing and one-sided thinking. J Clin Epidemiol. 2001;54:109–110
  39. Kass RE, Raftery AE. Bayes' factors. J Am Stat Assoc. 1995;90:773–795
  40. Berger JO, Sellke T. Testing a point null hypothesis: The irreconcilability of P-values and evidence. J Am Stat Assoc. 1987;82:112–122
  41. Greenland S. Bayesian interpretation and analysis of research results. Semin Hematol. (this issue)
  42. Lang JM, Rothman KJ, Cann CI. That confounded P-value. Epidemiology. 1998;9:7–8

PII: S0037-1963(08)00062-0

doi: 10.1053/j.seminhematol.2008.04.003
