Comparison of Statistical Tests and Power Analysis for Phosphoproteomics Data
JOURNAL OF PROTEOME RESEARCH(2020)
摘要
Advances in protein tagging and mass spectrometry have enabled generation of large quantitative proteome and phosphoproteome data sets, for identifying differentially expressed targets in case control studies. The power study of statistical tests is critical for designing strategies for effective target identification and control of experimental cost. Here, we develop a simulation framework to generate realistic phospho-peptide data with known changes between cases and controls. Using this framework, we quantify the performance of traditional t-tests, Bayesian tests, and the ranking-by-fold-change test. Bayesian tests, which share variance information among peptides, outperform the traditional t-tests. Although ranking-by-fold-change has similar power as the Bayesian tests, its type I error rate cannot be properly controlled without proper permutation analysis; therefore, simply relying on the ranking likely brings false positives. Two-sample Bayesian tests considering dependencies between intensity and variance are superior for data sets with complex variance. While increasing the sample size enhances the statistical tests' performance, balanced controls and cases are recommended over a one-side weighted group. Further, higher peptide standard deviations require higher fold changes to achieve the same statistical power. Together, these results highlight the importance of model-informed experimental design and principled statistical analyses when working with large-scale proteomics and phosphoproteomics data.
更多查看译文
关键词
quantitative phosphorpoteomics,Bayesian statistics,empirical variance,proteomics,multiplex,two-sample,bioinformatics,neuroproteomics,hierachical simulation,sample size
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
数据免责声明
页面数据均来自互联网公开来源、合作出版商和通过AI技术自动分析结果,我们不对页面数据的有效性、准确性、正确性、可靠性、完整性和及时性做出任何承诺和保证。若有疑问,可以通过电子邮件方式联系我们:report@aminer.cn