PHSL 6504 UMW Biology Sample Size and Statistical Power Questions
PHSL 6504 HOMEWORK 11
Sample Size and Statistical Power

NAME________________

Goals
Learn how to determine the sample size needed for an experiment or a study. Understand and review the concepts of type I and II errors, alpha, beta, and power. Practice using Power and Sample size (SWOG statool) software.

Reading Assignment
Read Chapters 10, 14, and 22 in your textbook.

Introduction
Whenever possible, the size of a study should be carefully planned BEFORE it gets underway. This is because the results and interpretation of a study often depend heavily on the sample size, as we have already seen many times in this course. A statistical procedure such as the chi-square test is often more likely to yield a statistically significant result when the sample size is large, but then the question becomes “how large is enough?”, especially when resources for conducting the study are finite.

To approach the problem of sample size, it is useful to review type I and II errors. Review the following table.

                                  The truth in the population:
Results of statistical testing    Association between        No association between
in the study sample:              predictor and outcome      predictor and outcome
                                  variables                  variables
Reject the null hypothesis        CORRECT                    TYPE I ERROR
Accept the null hypothesis        TYPE II ERROR              CORRECT

This table shows that there are 4 possibilities when it comes to comparing a statistical test result with reality. In two of these possibilities, the findings in the sample and in the population are concordant and the investigator’s inference is correct. A type I error (false positive result) occurs when the investigator rejects a null hypothesis that is actually true in the population. A type II error (false negative result) occurs when the investigator accepts a null hypothesis that is actually false in the population. Both of these errors lead to incorrect statistical inference.
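
The four cells of the table can also be written out as a small lookup. The following is only an illustrative Python sketch; the function name classify_outcome and its two arguments are hypothetical and do not come from the textbook or the Statool.

```python
def classify_outcome(reject_null: bool, association_exists: bool) -> str:
    """Return the table cell for one study result (hypothetical helper)."""
    if reject_null and not association_exists:
        return "TYPE I ERROR"   # false positive: rejected a true null hypothesis
    if not reject_null and association_exists:
        return "TYPE II ERROR"  # false negative: accepted a false null hypothesis
    return "CORRECT"            # sample finding and population truth agree


# Example: the test was significant, but there is no association in the population.
print(classify_outcome(reject_null=True, association_exists=False))
```
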
Ideally, the investigator should think about these issues ahead of time and establish the maximum chances he or she will tolerate of making these types of errors. The probability of making a type I error is called alpha, also known as the level of statistical significance. In many studies, alpha is set at 0.05, which means that there is at most a 5% chance of incorrectly rejecting the null hypothesis when it is true. This can be thought of as the level of reasonable doubt that the investigator is willing to accept when he or she uses statistical tests to analyze the data. Sometimes it may be desirable to be more stringent and set alpha much lower, perhaps at 0.01 or 0.001, depending on the real-life consequences of making a type I error.

The probability of making a type II error is called beta. The quantity [1 - beta] is called power, defined in the textbook as “the probability of avoiding a type II error.” If beta is set at 0.10, for example, one is willing to accept a 10% chance of missing a true association in the data; this corresponds to 90% power, i.e. a 90% chance of avoiding a type II error.

Example: An investigator plans a study of the association of dietary beta carotene and colon cancer. She sets alpha at 0.05 and beta at 0.10 for the statistical analysis. This means that, if beta carotene is in truth unrelated to colon cancer risk, there will be a 5% chance of incorrectly rejecting the null hypothesis and falsely concluding that the two are related. Now, suppose that the true effect of consuming a high level of beta carotene is a 30% reduction in colon cancer incidence, and that the study was sized to detect an effect of that magnitude. With 90% power (beta = 0.10), about 90 out of 100 such studies would yield a statistically significant result, and about 10 would miss the association.

Ideally, alpha and beta would be set at zero, to eliminate any possibility of false inference, but in practice they are set as small as possible and practical.
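
To make alpha and power more concrete, the following minimal Python sketch simulates many two-group studies and counts how often the null hypothesis is rejected. Everything in it is an assumption chosen for illustration: the function name rejection_rate, the group size of 20, the standard deviation of 1, and the true difference of 1 are hypothetical, and a two-sided z test with known standard deviation is used as a simplification in place of a t test.

```python
import random
from statistics import NormalDist


def rejection_rate(true_diff, n_per_group, sigma, alpha, n_trials=2000, seed=1):
    """Fraction of simulated studies in which a two-sided z test rejects the null."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for two-sided test
    std_err = sigma * (2 / n_per_group) ** 0.5     # SE of the difference in means
    rejections = 0
    for _ in range(n_trials):
        group1 = [rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        group2 = [rng.gauss(true_diff, sigma) for _ in range(n_per_group)]
        diff = sum(group2) / n_per_group - sum(group1) / n_per_group
        if abs(diff / std_err) > z_crit:
            rejections += 1
    return rejections / n_trials


# No true difference: every rejection is a type I error, so the rate should be near alpha.
type_one_rate = rejection_rate(true_diff=0.0, n_per_group=20, sigma=1.0, alpha=0.05)

# A true difference exists: the rejection rate estimates power (1 - beta).
power = rejection_rate(true_diff=1.0, n_per_group=20, sigma=1.0, alpha=0.05)

print(f"estimated type I error rate: {type_one_rate:.3f}")
print(f"estimated power:             {power:.3f}")
```

With these hypothetical settings, the first estimate should land close to the chosen alpha of 0.05, and the second estimates the power of the design; one minus that value estimates beta.
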
However, it should be noted that alpha and beta are inversely related: for a fixed sample size and effect size, lowering one of them will increase the other one, and vice versa. This is where the sample size comes in, and where the investigator has to carefully consider the consequences of manipulating alpha, beta, and the sample size. The effect size also enters into the picture, since it will be easier to correctly detect a large (or strong) effect in the data than to detect a very small (or weak) one. Also, if the variability is small, then it is easier to find significant effects compared to situations where the variability in the sample is large. Therefore, the following steps should be taken when determining the appropriate sample size for a study.

1. Set alpha and beta.
2. Decide on the effect size you wish to detect.
3. Obtain data on (or make an educated guess about) the standard deviation in the study sample.
4. Consult a table or software to determine the required sample size.

Example (in lecture): A researcher is studying a drug to reduce cholesterol levels. If the drug is effective, we would expect to observe a lowering of the cholesterol level when the study is finished. Two groups of adults with hypertension will be studied in a randomized controlled trial: those in Group 1 will not be taking any cholesterol-related medications and will get a placebo, and those in Group 2 will take the new drug. Prior research (a large, random survey of adults in the U.S. population) reported that the mean total cholesterol level of people who didn’t use any cholesterol-lowering drugs was 200 mg/dl, with a standard deviation of 30 mg/dl. Assume that the cholesterol readings were normally distributed; therefore the statistical inference test in our study will be the 2-sample (Student’s) T test.
We want to design an experiment with 95% power to detect a mean lowering of 20 mg/dl, since we think this will be a meaningful drop in cholesterol, and we set alpha at 0.01 because we do not want to risk marketing an ineffective new drug. How many subjects per group are needed to detect the desired effect or greater?

To find the sample size, we will use the web-based SWOG Statool again as demonstrated in class. Go to the Design section and click on Two-Arm Normal. We now have to answer a series of questions and provide the required information.
• First, select sample size
• Then choose two-sided
• The Input area has several items to fill in. Start with alpha: enter 0.01
• For power, enter 0.95
• For the difference in means (δ, or delta), enter 20 (from 200-180=20 in the example)
• For the population standard deviation (σ, or sigma), enter 30
• For ratio of sample size, enter 1 (each group has an equal number of subjects)
• Now click on Calculate, and you will see the answer appear in the sample size box (it should now show 161).

For this example, the total of 161 indicates that we need 81 subjects in the placebo group and 81 in the test group (161/2 = 80.5, rounded up), given the assumptions we have just made about alpha, power, effect size (difference in means), and standard deviation. If we change any of these parameters, then the sample size will be different. We will explore this in today’s homework problems.

The message from today’s lecture is well summarized by J. Susan Milton, author of Statistical Methods in the Biological and Health Sciences, who wrote (p. 244): “Unplanned experiments are often poorly conducted experiments. Sample sizes must be chosen carefully, and in general small samples yield small power.
From the outset, experiments based on small samples are usually doomed to failure unless the difference between μ0 and μ1 is extreme.”

Power and Sample Size Worksheet Exercise

A group of researchers in Baltimore, Maryland, is interested in conducting a study to test the hypothesis that genetic susceptibility to mutagens is related to lung cancer risk. The mutagen sensitivity assay (MSA) is useful for such a study because it provides an overall index of the genetically based ability of individuals to repair DNA damage. The MSA protocol takes lymphocytes from subjects and cultures them, then exposes them to a dose of a mutagen such as gamma radiation to induce chromosome breaks. After the cells have been allowed to repair themselves for a short period of time, the number of remaining chromosomal breaks in a random sample of 50 cells from each subject is counted under a microscope. The number of chromosome breaks per cell (b/c) is considered a biomarker of susceptibility to DNA damage: the higher the number of b/c, the higher the risk for mutagenic diseases such as tobacco-induced lung cancer.

The researchers conducted a pilot study and evaluated the MSA in 20 individuals: 10 with lung cancer and 10 without cancer. The mean b/c was 1.1 in the cases and 0.8 in the controls. The pooled standard deviation was 0.5 b/c. Now the researchers need to decide how many subjects should be recruited for the main study, for which 90% power is desired. The statistical inference test will be the Student’s T test. The sample size (i.e. the number of cases, using 1 control per case) will dictate how much grant money is needed for the study and how it will be spent, so there are many practical as well as scientific consequences to the problem of determining the best sample size. This exercise will show you how to approach this question, and you will see how alpha, power, the effect size, and measurement precision can affect the required sample size.
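
Before working through the scenarios, it may help to see roughly where such numbers come from. The sketch below uses the common normal-approximation formula for comparing two means with equal group sizes, n per group = 2 (z[1-alpha/2] + z[1-beta])^2 sigma^2 / delta^2. This is an illustration only: the function name total_sample_size is hypothetical, and whether the SWOG Statool uses exactly this formula is an assumption, not something stated in this handout.

```python
import math
from statistics import NormalDist


def total_sample_size(alpha, power, delta, sigma):
    """Approximate total N (both groups, 1:1 allocation) for a two-sided test.

    Normal-approximation sketch; a t-based calculation would give a
    slightly larger answer, especially for small samples.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    n_per_group = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(2 * n_per_group)


# Inputs from the lecture example: alpha 0.01, power 0.95, delta 20, sigma 30.
lecture_total = total_sample_size(alpha=0.01, power=0.95, delta=20, sigma=30)
print(lecture_total, "total, i.e.", math.ceil(lecture_total / 2), "per group")
```

For the lecture inputs this approximation gives a total of 161 (81 per group after rounding up), which agrees with the value quoted for the Statool in the cholesterol example. The same function can be used as a rough cross-check of the values you obtain from the software in the scenarios that follow.
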
Scenario One: Alpha

For this part of the exercise, we want to see the effect on sample size of manipulating alpha while holding other factors constant. Use the data obtained in the pilot study, and determine the sample size needed when alpha takes the following values. Start by calculating delta (the difference in means), and then go to the PS software with the appropriate numbers to enter for alpha (starting with 0.01 as listed in the table below) and also for power, delta, sigma, and m. You will keep everything else constant in subsequent iterations of the program, changing only alpha each time. Enter the computed sample sizes below. What trend do you see?

Alpha     Sample size
0.01      ________
0.02      ________
0.05      ________
0.10      ________

Trend: __________________________________________________________________________________
________________________________________________________________________________________

Scenario Two: Effect Size

For this part of the exercise, we want to see the sample size consequences of hypothesizing different effect sizes, i.e. different levels of case-control differences in the mean b/c. Assume that power is 90% and alpha is 0.05, but consider the effect sizes in the table below, while holding the other factors constant. Use the standard deviation observed in the pilot study. Start by calculating delta, then go to the PS software with the appropriate entries for all the other parameters. Enter the required sample sizes below. What trend do you see?

b/c in cases    b/c in controls    Delta       Sample size
1.1             0.8                ________    ________
1.25            0.8                ________    ________
1.30            0.8                ________    ________
1.50            0.8                ________    ________

Trend: __________________________________________________________________________________
________________________________________________________________________________________

Scenario Three: Measurement Precision

Now we will examine the effect of increasing the precision of the study on the sample size. In terms of the MSA, a more precise measurement of b/c would have a smaller standard deviation than a less precise measurement of b/c.
This is not as unreasonable as it may seem at first. Given more practice in the lab and better ongoing quality control, it might be reasonable to expect that the counting of breaks will produce fewer errors in judgment and hence a lower standard deviation. Assume that power is 90%, alpha is 0.05, and the effect size is 0.2 b/c, but then consider the following standard deviations while holding those other factors constant. Enter the required sample sizes below. What trend do you see?

Standard deviation    Sample size
0.5                   ________
0.4                   ________
0.3                   ________
0.2                   ________

Trend: __________________________________________________________________________________
________________________________________________________________________________________

Scenario Four: Power

For this part of the exercise, we want to see the effect on sample size of manipulating the power of the study while holding other factors constant. Use the data obtained in the pilot study, and determine the sample size needed when power takes the following values. Assume alpha is always 0.01. Enter the required sample sizes below. What trend do you see?

Power     Sample size
99%       ________
95%       ________
90%       ________
80%       ________

Trend: __________________________________________________________________________________
________________________________________________________________________________________

Part Five: Overall Concepts

As you have seen, power, alpha, and beta are important determinants of the sample size and need to be considered when designing a study. But not all investigators think about this aspect of study design before starting their experiments or observational studies, and they could end up with a study that has more subjects than actually needed, or too few.

What would be the negative consequences of having a sample size that is larger than required to achieve adequate power?
_______________________________________________________________________________________
_______________________________________________________________________________________
_______________________________________________________________________________________
_______________________________________________________________________________________

What would be the negative consequences of having a study where the sample size is too small?
_______________________________________________________________________________________
_______________________________________________________________________________________

var1 var2
1 2 2 3 3 3 3 4 4 5 2 4 4 5 5 5 7 7
group
0 0 1 0 0 1 1 0 1 1 0 0 1 0 0 1 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2