Sample Size Determination in Genetic-Disease Association Studies when the Response Variable is Subject to Misclassification and a Surrogate Covariate is Used pp. 135-156
Authors: (Munechika Misumi, Tomomi Yamada, Tsuyoshi Nakamura, Yoshiaki Nose, Department of Statistics, Radiation Effects Research Foundation, Hiroshima, Japan, and others)
Abstract: In a genetic-disease association study, a retrospective study is first performed to find significant associations between a genetic marker (SNP, genotype, haplotype, etc.) and certain phenotypes of interest (response to treatment, occurrence of a disease, etc.). If a medically important and statistically significant association is found, a prospective study is then designed to obtain definitive evidence confirming the association. One of the critical issues in designing such a study is sample size determination. Accordingly, Yamada et al. (2009) developed a simulation program to determine the sample size required for a clinical study to confirm a genetic-disease association observed in a retrospective exploratory study. Although the program can be applied to a wide area of biomedical research, it does not cover certain practical situations. In this chapter, their technique is extended to cover a much broader range of applications. The program developed by Yamada et al. (2009) can accommodate random misclassifications in the response variable of a binary logistic regression model. They applied the Pitman asymptotic relative efficiency (ARE) to obtain an equation for the sample size at which a test that uses a surrogate response in place of the exact response retains the same power. In a related approach, Lagakos (1988) used ARE to assess the loss of power caused by random measurement errors in an explanatory variable. An advantage of using ARE is that both the computational time and the number of parameters required to start the program are drastically reduced. In this chapter, these ideas are extended to allow for random measurement errors in both the response and the explanatory variables, and an implementation is developed using the free statistical software R.
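The effect of response misclassification on power, and hence on the required sample size, can be illustrated with a small Monte Carlo sketch. This is not the R program described in the chapter and it does not use the ARE-based equation. It is written in Python, handles only misclassification of a binary response (not a surrogate covariate), and uses a pooled two-proportion z-test as a stand-in for the logistic regression test. The function name `simulate_power` and all parameter names and values (`p_carrier`, `p0`, `p1`, `sens`, `spec`) are assumptions introduced for illustration.

```python
import math
import random


def simulate_power(n, p_carrier, p0, p1, sens=1.0, spec=1.0,
                   n_sim=2000, seed=0):
    """Monte Carlo power of a two-sided pooled two-proportion z-test
    (alpha = 0.05) when the binary response is misclassified.

    n         : total number of subjects per simulated study
    p_carrier : probability that a subject carries the marker
    p0, p1    : true response probability for non-carriers / carriers
    sens/spec : sensitivity and specificity of the observed response
    """
    rng = random.Random(seed)
    z_crit = 1.959963984540054  # two-sided 5% critical value
    rejections = 0
    for _ in range(n_sim):
        # counts[g] = [number of subjects, number of observed positives]
        counts = [[0, 0], [0, 0]]
        for _ in range(n):
            g = 1 if rng.random() < p_carrier else 0
            y_true = rng.random() < (p1 if g else p0)
            # Non-differential misclassification of the response.
            if y_true:
                y_obs = rng.random() < sens    # true positive kept w.p. sens
            else:
                y_obs = rng.random() >= spec   # false positive w.p. 1 - spec
            counts[g][0] += 1
            counts[g][1] += int(y_obs)
        (n0, x0), (n1, x1) = counts
        if n0 == 0 or n1 == 0:
            continue  # test undefined; counted as non-rejection
        pooled = (x0 + x1) / (n0 + n1)
        var = pooled * (1 - pooled) * (1 / n0 + 1 / n1)
        if var == 0:
            continue
        z = (x1 / n1 - x0 / n0) / math.sqrt(var)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sim


# Hypothetical scenario: compare power with an exact and a noisy response.
exact = simulate_power(200, 0.3, 0.2, 0.4, n_sim=500, seed=1)
noisy = simulate_power(200, 0.3, 0.2, 0.4, sens=0.8, spec=0.9,
                       n_sim=500, seed=1)
print(f"power, exact response: {exact:.2f}")
print(f"power, misclassified response: {noisy:.2f}")
```

Increasing `n` until the estimated power with the misclassified response reaches the target gives a simulation-based sample size. The ARE approach described in the abstract aims to avoid this kind of repeated simulation by supplying an equation instead.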
Open Access item.