Chapter Eight (Salkind)
Are Your Curves Normal? Probability and Why It Counts

Warning!
▪ This PowerPoint is long
▪ I promise we will get through it together!
▪ There are also a lot of examples for you to go through
▪ Take your time going through this one! Take breaks! Don’t try to do this all at once.

An Overview of This Chapter
Are your curves normal? Thus far, we have discussed descriptive statistics and how they can help us describe a data set. But now we need to get a better picture of what the data really look like. Does the phrase “normal curve” ring a bell? This chapter is all about the normal, bell-shaped curve …

An Overview of This Chapter
In this chapter we cover the following items …
Part One: Why Probability?
Part Two: The Normal Curve (A.K.A. The Bell-Shaped Curve)
Part Three: Our Favorite Standard Score: The z Score
Part Four: Using The Computer To Compute z Scores
Part Five: An Eye Toward The Future

Part One
Why Probability?

Why Probability?
Our first task in this chapter is to define what we mean by the word “probability”. What does that word mean to you? Have you ever thought, “It will probably rain today”? When you think this, do you know for sure that it will rain, or is it just likely to rain (especially when compared to no rain)? I’m betting it is the latter. Common sense, past experience, and world knowledge (like weather reports!) all factor into probability.

Why Probability?
Life is all about probability, but you probably don’t realize just how often you engage in intuitive probability calculations to assess the likelihood of some event happening. For example: What are the odds of getting a royal flush in poker? Very low, right? What are the odds of getting one pair? Much higher. If you had to bet your next student loan payment on whether you would get a royal flush in the next hand or a single pair, which option would you choose?

The Bell Curve and Probability
1).
In statistics, the normal (bell-shaped) curve provides us with a similar basis for understanding the probability associated with any possible outcome – The odds of you getting a good score on your next quiz are high (provided you study the material, right!) Statistical Significance 2). But probability also focuses on the extent to which we can have confidence that a particular outcome is true – Or (to put it in terms of research methodology), probability gives us the ability to calculate the odds that the research outcome we find was (or was not) due to chance – This is what the idea of statistical significance is based on! p < .05 Why Probability?: Example Study Imagine we conduct a study to determine how long it takes salesclerks to approach and help hearing-disabled customers – We have a confederate enter a store who is either clearly deaf (she uses sign-language to converse with a friend as they enter the store) or she enters clearly “hearing-abled” (she is talking to her friend – this is our “control” group). – Our independent variable is the deaf group vs. control group – Our dependent variable is the amount of time it takes for a clerk to approach and help the customer – Will we get differences in clerk “approach time” depending on whether the customer is deaf or not? Is the Difference Significant? – If differences do emerge between these two IV levels, can we say confidently that the differences are due to deafness, or might there be another variable at play (like chance) that better accounts for any observed differences? The Null Hypothesis This brings us to the idea of the null hypothesis. 
– Consider the idea of “significance”, or the likelihood that a statistical outcome did or did not occur by chance
That is, if an outcome did occur by chance, we conclude that the IV is not responsible for differences in the DV
We call this the “null hypothesis”
If the outcome did not occur by chance, we accept the alternative hypothesis (something other than chance brought about the results: most probably our independent variable manipulation!)

The Null and Alternative Hypotheses
A. Null hypothesis (H0): The mean scores of the treatment and control groups are equal (µc = µt)
– If we support the null hypothesis, differences between the study groups are due to chance (and not to the IV)
B. Alternative hypothesis (H1): The mean scores are different (Treatment ≠ Control), or (µc ≠ µt)
– Our alternative hypothesis is that salesclerks will approach one group faster (or slower!) than the other
Note: That symbol µ (or “mu”) is the mean

Part Two
The Normal Curve (A.K.A. The Bell-Shaped Curve)

Describing The Normal Curve
As you can see, the normal curve is “bell-shaped”. Think about the line drawn right down the middle of the curve.
In a “perfect” normal curve, this line represents the mean, median, and mode
– That is, the most frequent number in the data set (the mode) is in the middle of the curve
– The middle value in the data set (median) is in the middle of the curve (half the scores fall above and half below)
– The average value of the data set (mean) is in the middle of the curve

Non-Normal Curves
In comparison, consider the non-normal curves below
Blue is a positive skew
Green is a negative skew

Bimodal Curves
– Of course, we could also have a bimodal (two modes) curve
– But the data that work best for statisticians follow the nice, normal bell-shaped curve. To reiterate (yep, again!) …

Properties Of The Normal Curve
1). A normal curve is not skewed. It has a nice, single hump in the middle, and it is symmetrical (if you fold it in half, the left side will mirror the right side)
2). It is also asymptotic, which means that the tails of the curve never touch the x axis. They come closer and closer, but never touch!

The Normal Curve
Let’s go back to our example. If we observe how long it takes clerks to approach deaf customers, we might get a broad range of times.
– We can plot the amount of time it takes to help each deaf customer. Let’s say the mean (average) time to help deaf customers is 4.00 minutes
This is actually true! (McClellan & Woods, 2001)

Frequencies: Let’s Look At The Data This Way
– Consider this data (I am just making this set up!)
1 minute (two clerks) OO
2 minutes (three clerks) OOO
3 minutes (five clerks) OOOOO
4 minutes (six clerks) OOOOOO
5 minutes (five clerks) OOOOO
6 minutes (three clerks) OOO
7 minutes (two clerks) OO

Does This Shape Look Familiar?
– Consider this data (I am just making this set up!)
1 minute (two clerks)
2 minutes (three clerks)
3 minutes (five clerks)
4 minutes (six clerks)
5 minutes (five clerks)
6 minutes (three clerks)
7 minutes (two clerks)

It’s A Normal Distribution!
– The scores vary (the range is 1 minute to 7 minutes), but a larger grouping falls in the middle of the normal curve (around 4 min.).
[Figure: clerk frequencies plotted over times from 1 to 7 minutes]

The Normal Curve: Another Example
We will “compare normal curves” later. For now, just recognize that normal curves appear frequently on our planet! We often see a big grouping of characteristics that fall in the middle of a data set. But at the extremes, there are fewer data points

This Is A Normal Curve

Many Data Sets Are Normal
Admittedly, there are data sets that are not normal. However, when we deal with large data sets we tend to see FEW extreme high values and extreme low values but LOTS of middle values
– You can find lots of 6.5-foot-tall men if you hang out in the FIU Basketball locker room, but such tall men aren’t really representative of the average height of FIU students.
– If you look at a few hundred or so FIU students, you’ll see the normal curve start to emerge. Yes, you might get a few 6.5-foot students, but you’ll get a lot of 5.5-foot students (and a few 4.5-foot students, too!)

Normal Curve: Samples Vs. Populations
Researchers typically assume that a study sample shares the same normal distribution characteristics as the larger population
Whether dealing with a sample or a population, most scores tend to occur in the middle of the normal curve, and they have a higher probability of occurring than do extreme scores.

The Normal Curve & SD
We know a curve can be “normal”, but what’s even better is that a normal curve has a degree of standardization as well
That word “standardization” leads us to a common descriptive statistic we learned all about in Chapter 3 (Salkind): Standard Deviations!
In a normal curve, we can actually break down segments of the curve into “standard deviations”.
Let’s say we have a raw mean score of 100 (the average score from participants is 100) with a range of scores from around 60 to 140. We have this curve …

The Normal Curve With Standard Deviations

Raw Scores
In our normal curve, look at the scores of participants in the middle of the curve (mean = 100).
Now consider a raw score (I will use this term “raw score” a lot. It is just an individual’s personal score, or X)
A raw score could fall anywhere on the curve. For instance, a raw score of 110 would be here

1 SD Above The Mean
– Raw scores from the mean of 100 up to a raw score of 110 fall within 1 standard deviation above the mean
– Because 1 SD = 10.
– The mean is at zero standard deviations because the mean is zero standard deviations away from itself…

1 SD Below the Mean
In our normal curve, look at the scores of participants in the middle of the curve (mean = 100).
– Raw scores from the mean of 100 down to a raw score of 90 fall within 1 standard deviation below the mean (–1 SD)

2 SDs Above The Mean
– Raw scores from the mean of 100 up to a raw score of 120 fall within 2 standard deviations above the mean (including 100 to 110, and 110 to 120)
– 100 + 10 = 110 and 110 + 10 = 120.

2 SDs Below The Mean
More Normal Curve 101
In our normal curve, look at the scores of participants in the middle of the curve (mean = 100).
– Raw scores from the mean of 100 down to a raw score of 80 fall within 2 standard deviations below the mean, or –2 SD (including 100 to 90, and 90 to 80)

More About SD & The Normal Curve
– I could go on (nearly 100% of data scores fall within three standard deviations of the mean), but for now I want you to forget about raw scores altogether. Focus just on the SD. SD is a standard element of the normal curve
– As long as the curve is “normal”, we know what specific percent of scores falls within each standard deviation. That is, from a standard deviation of 0 to a SD of 1, 34.13% of scores fall in that area of the normal curve.
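The 34.13% figure can be checked numerically. The sketch below is only an illustration, assuming Python's standard-library `statistics.NormalDist`; the helper name `percent_between_sds` is made up for this example and is not from Salkind. Printed values can differ from a textbook table in the last digit because of rounding.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, SD 1.
# The percentages do not depend on the raw mean or SD of a data set.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def percent_between_sds(lower_sd: float, upper_sd: float) -> float:
    """Percent of scores between two positions, each measured in SDs from the mean."""
    return 100 * (STANDARD_NORMAL.cdf(upper_sd) - STANDARD_NORMAL.cdf(lower_sd))


# Walk outward from the mean one standard deviation at a time.
for lower, upper in [(0, 1), (1, 2), (2, 3)]:
    print(f"{lower} to {upper} SD above the mean: {percent_between_sds(lower, upper):.2f}%")
```

Because the curve is symmetrical, calling `percent_between_sds(-1, 0)` gives the same percentage as `percent_between_sds(0, 1)`.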
In other words …
– 1 standard deviation above the mean accounts for 34.13% of the scores in a distribution
– 1 standard deviation below the mean accounts for 34.13% of the scores in a distribution
– The second standard deviation below the mean accounts for another 13.59% of the scores in a distribution
– 1 standard deviation above AND 1 below the mean account for 68.26% of all scores
– 2 standard deviations above AND 2 below the mean account for 95.44% of all scores
– 3 standard deviations above AND 3 below the mean account for 99.74% of all scores

More About The Normal Curve
– If our mean is 100 and our standard deviation is 10, then 34.13% of scores will fall between 100 and 110
– If our mean is 100 and our standard deviation is 10, then an additional 13.59% of scores will fall between 110 and 120
So if I asked you for the total percentage of scores falling within two standard deviations above the mean, you just add 34.13% (for the first SD) + 13.59% (for the second SD) = 47.72%
Finally, if you add all of the values on one side of the curve together, you will get 50%
34.13% + 13.59% + 2.15% + .13% = 50%

What If The Mean is Different?
What happens if our mean is 50? Well, it depends on what the standard deviation is. Let’s say the SD = 5.
– 34.13% of the scores will fall between 50 and 55 (1 SD)
– 34.13% of scores will fall between 45 and 50 (-1 SD)
– 13.59% of scores will fall between 55 and 60 (2 SD) etc.
What if the mean is 1000 and the SD is 75?
– 34.13% of the scores will fall between 1000 and 1075 (1 SD)
– 34.13% of scores will fall between 925 and 1000 (-1 SD)
– 13.59% of scores will fall between 1075 and 1150 (2 SD)

Percentages On The Curve
The percentages in the normal curve are thus independent of the mean and standard deviation (normal curve percentages are the same no matter the specific SD or mean)
Yet knowing the mean and standard deviation is important in understanding where specific scores fall on our normal curve
But the mean needs to be dead center in the normal curve. If the curve is not bell-shaped (that is, if it is skewed, maybe because of extreme outliers that are high or low), we might have trouble adequately interpreting our statistics

Example: Height
In the United States, the average height of men is 70 inches (or five feet, 10 inches). The standard deviation is 3 inches
Thus, for one standard deviation from the mean …
– 34.1% of men range from 70 to 73 inches (5.83 ft – 6.08 ft)
– 34.1% of men range from 67 to 70 inches (5.58 ft – 5.83 ft)
For two standard deviations from the mean …
– 13.6% of men range from 73 to 76 inches (6.08 ft – 6.33 ft)
– 13.6% of men range from 64 to 67 inches (5.33 ft – 5.58 ft)
95.44% of all men are within + or – 2 SD of the mean, or 64 to 76 inches

Example: Height Part II
In the United States, the average height of men is 70 inches (or five feet, 10 inches).
The standard deviation is 3 inches
Thus, for three standard deviations from the mean …
– 2.1% of men range from 76 to 79 inches (6.33 ft – 6.58 ft)
– 2.1% of men range from 61 to 64 inches (5.08 ft – 5.33 ft)
Beyond three standard deviations from the mean …
– 0.13% of men are above 79 inches (6.58 ft +)
– 0.13% of men are below 61 inches (5.08 ft –)
99.74% of all men are within + or – 3 SD of the mean, or 61 to 79 inches

Part Three
Our Favorite Standard Score: The z Score

The z Score
We know distributions can have different measures of central tendency and different amounts of variability, so how can you compare two data sets with potentially different distributions?
Welcome to z Scores: standard scores that are comparable because they are standardized in units of standard deviations
That’s a mouthful, so let me explain …

Why Do We Need A Standard Score?
If we have a standard score, we can compare that score across two very different distributions, even if one set has a mean of 50 (and a SD of 10) and the other has a mean of 100 (and a SD of 5).
There are lots of standard scores we can use, but we will stick with the z Score, which is the amount that a raw score differs from the mean of the distribution, divided by the standard deviation. Let’s see our formula!

The z Score Formula
$z = \frac{x - \bar{x}}{s}$
z is the z Score
x is the individual score (the raw score)
$\bar{x}$ is the mean of the distribution
s is the standard deviation of the distribution

Step 1: Calculate the SD
The formula isn’t all that scary for you, I hope. Just subtract the mean score from the individual raw score and divide by the standard deviation.
– The difficult part, of course, is calculating s (the standard deviation). Make sure to review Chapter 3 (Salkind) if you need a refresher on the standard deviation formula
Plug in the “time to help” score for each of our clerks and find the standard deviation.
I’ll wait … Still waiting … Okay, time to move forward …

Step 2
– Let’s think about our clerks in the deaf condition. Imagine we have a mean time of 4.00 minutes and we now know the standard deviation is 2.50 minutes.
Step 2: subtract the mean (4.00) from each clerk’s original score. If the clerk’s raw score was 4, then 4 – 4 = 0. If the clerk’s raw score was 8, then 8 – 4 = 4, etc. …

Step 2: Deaf Group (M = 4.00, SD = 2.50)
Raw Score (x) | $x - \bar{x}$
4 | 4 – 4 = 0
8 | 8 – 4 = 4
2 | 2 – 4 = -2
4 | 4 – 4 = 0
3 | 3 – 4 = -1
5 | 5 – 4 = 1
1 | 1 – 4 = -3
3 | 3 – 4 = -1

Step 3
Third, divide each difference ($x - \bar{x}$) by the SD (2.50)
If ($x - \bar{x}$) = 0, then 0 / 2.50 = 0. If ($x - \bar{x}$) = 4, then 4 / 2.50 = 1.60, etc.
This gives us the z Score for EACH original score

Step 3: Deaf Group (M = 4.00, SD = 2.50)
Raw Score (x) | $x - \bar{x}$ | ($x - \bar{x}$) / SD | z Score
4 | 4 – 4 = 0 | 0 / 2.5 | 0
8 | 8 – 4 = 4 | 4 / 2.5 | 1.60
2 | 2 – 4 = -2 | -2 / 2.5 | -0.80
4 | 4 – 4 = 0 | 0 / 2.5 | 0
3 | 3 – 4 = -1 | -1 / 2.5 | -0.40
5 | 5 – 4 = 1 | 1 / 2.5 | 0.40
1 | 1 – 4 = -3 | -3 / 2.5 | -1.20
3 | 3 – 4 = -1 | -1 / 2.5 | -0.40

More Z-Score Practice!
– So now we have z Scores for each clerk (based on a mean time of 4 minutes and a standard deviation of 2.5 minutes) in our deaf customer condition
– Let’s calculate the z Scores for the clerks in our hearing customer condition now
For hearing participants, our mean = 1.9 minutes and our SD = 1.2 minutes

The z Score – Hearing Group
Raw Score (x) | $x - \bar{x}$ | ($x - \bar{x}$) / SD | z Score
.5 | .5 – 1.9 = -1.4 | -1.4 / 1.2 | -1.17
2.5 | 2.5 – 1.9 = 0.6 | 0.6 / 1.2 | 0.50
1.8 | 1.8 – 1.9 = -0.1 | -0.1 / 1.2 | -0.08
1 | 1 – 1.9 = -0.9 | -0.9 / 1.2 | -0.75
2.38 | 2.38 – 1.9 = 0.48 | 0.48 / 1.2 | 0.40
1.5 | 1.5 – 1.9 = -0.4 | -0.4 / 1.2 | -0.33
2.2 | 2.2 – 1.9 = 0.3 | 0.3 / 1.2 | 0.25
.75 | .75 – 1.9 = -1.15 | -1.15 / 1.2 | -0.96

Z Scores Are Awesome!
Z-scores are great because they allow us to do two things:
1) Compare scores from different distributions (we can compare scores from the hearing group and the deaf group, even though they come from different distributions).
2) See how each score compares to the mean (and thus the rest of the people in a distribution)
– The z-score is in “standard deviation” units, so a z-score of 1 means you are 1 standard deviation away from the mean. This means it’s pretty close to the average.
– If your z score is 5, this means you are 5 standard deviations away from the mean! Wow! That’s really, really far from the mean. This score is VERY different!
Let’s explore how this plays out in another example….

SAT Vs. ACT?
Let’s say you are a junior in high school. You just took the SAT and got a 1010. Your friend just took the ACT and got a 30 (the ACT is a similar, but different, standardized test that some colleges require).
You couldn’t compare your raw SAT score to your friend’s raw ACT score because they come from different distributions! They have different means, SDs, and total point values.
How does a 1010 compare to a 30? How can you know who did better on their test?
If you said “look at the z-scores” you’d be right! (nice job!). We have to convert these raw scores to z-scores to compare them! Let’s try it….

The z Score Allows Us to Compare
$z = \frac{x - \bar{x}}{s}$
Your raw score on the SAT: 1010
– The national average is 1051 with a SD of 211
z = (1010 – 1051)/211 = -0.19
In other words, your SAT score was .19 standard deviations BELOW the mean for the SAT
Your friend’s raw score on the ACT: 30
– The national average is 20.8 with a SD of 5.8
z = (30 – 20.8)/5.8 = 1.59
In other words, your friend’s score was 1.59 standard deviations ABOVE the mean for the ACT.
So, who did better on their test, you or your friend?

The z Score Example Comparisons
So, your friend did better on the ACT than you did on the SAT because their score was 1.59 standard deviations above the mean. Your score was .19 standard deviations below the mean.
Does this mean you should be sad about your score? No! Not necessarily! As you’ll see soon, z = -0.19 is actually very close to the average (even though it’s just slightly below the average).
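The SAT vs. ACT comparison can be reproduced in a few lines. This is a minimal sketch, not part of the Salkind chapter; the function name `z_score` is a hypothetical helper, and the means and SDs are the ones quoted in the example above.

```python
def z_score(raw_score: float, mean: float, sd: float) -> float:
    """How many standard deviations a raw score sits above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw_score - mean) / sd


# Raw scores, means, and SDs taken from the SAT vs. ACT example.
sat_z = z_score(1010, mean=1051, sd=211)
act_z = z_score(30, mean=20.8, sd=5.8)

print(f"SAT z = {sat_z:.2f}")  # -0.19 (below the mean)
print(f"ACT z = {act_z:.2f}")  # 1.59 (above the mean)
print("Friend did better" if act_z > sat_z else "You did better")
```

Note that the raw score always comes first in the subtraction; swapping the order flips the sign and would wrongly put a below-average score above the mean.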
Thus, the z-score allows us to compare scores from different distributions (by looking at how many standard deviations each score is from the mean of that distribution). Wait, before we move on, how did we know that your score was below the mean, but your friend’s score was above the mean? Think about it for a second before you go to the next slide… The Sign (+ or -) Matters! Negative z-scores indicate that particular score is below the mean (average). Positive z-scores indicate that particular score is above the mean (average). So, if you have a z-score of +1, then 84.13% of scores fall at or below that level. Huh? How did I get that 84.13%? Well … How Did We Get That Answer? We can use what we know about the z-scores and the properties of the normal curve to figure out that 84.13% of scores fall below a z Score of +1. – 50% of scores fall below the mean, right? – Another 34.13% of scores fall between the mean and +1 z – Thus 50% + 34.13% = 84.13%, so 84.13% of all scores in a normal curve fall below a z score of +1 Conversely, 84.13% of all scores in a normal curve fall above a z Score of -1 What % of scores are above +1 z Score? • How many scores are above +1 z Score? Easy! It’s 15.87% That is, 100% – 84.13% = 15.87% This also means that the probability of a z Score being above +1 is thus 15.87%, which is pretty rare, right? (It only happens 15.87% of the time after all!) How Likely This Z-Score? A score falling within this 15.87% is unlikely. In fact, it’s a lot more likely that a raw score will be below a z Score of +1 – Of course it is even harder for a score to fall within the 5% range (the z Score is actually 1.65, which is hard to reach!) – Yet in research, we use a p value of .05 (or 5%) to say that something is significant. If we get a z Score above 1.65, it’s highly unlikely that it occurred by chance (there is only a 5% likelihood that it is due to chance alone). 
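The 84.13% and 15.87% figures, and the roughly 5% cutoff at z = 1.65, can be checked without a table. This is a hedged sketch that assumes Python's standard-library `statistics.NormalDist`; `percent_below` and `percent_above` are illustrative names, not functions from the textbook.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, SD 1, so positions are already z scores


def percent_below(z: float) -> float:
    """Percent of scores in a normal curve that fall at or below a given z score."""
    return 100 * STANDARD_NORMAL.cdf(z)


def percent_above(z: float) -> float:
    """Percent of scores that fall above a given z score."""
    return 100 - percent_below(z)


for z in (1.0, 1.65):
    print(f"z = {z}: {percent_below(z):.2f}% below, {percent_above(z):.2f}% above")
```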
We’ll discuss this idea of significance testing a lot more in Chapter 9 (Salkind)

z Scores Are Equal to SDs
Whole number z Scores are easy to think about in terms of a normal curve. A +1 z Score is equivalent to 1 SD (both represent 34.13% of the normal curve), just as a +2 z Score is the same as +2 SD (both represent another 13.59% of the normal curve)
Consider a graph with both z Scores and SDs …

The z Score
Both 1 SD and 1 z represent 34.13% of all scores

What If The z Score Has A Decimal?
What happens when our z Score is not a whole number? That is, what if we have a z Score of something like 1.22?
– As Salkind notes, you can learn calculus to figure out the exact area in the normal curve that corresponds to each possible decimal z Score, or you can just look in Appendix B, Table B1 (Salkind)!

z Table (Salkind, Appendix B1)
– 1. In Appendix B, first look for the z Score column
– 2. Next, look for the column “Area Between the Mean and the z Score”
– 3. Find your obtained z Score in the first column and then see how much of the area under the normal curve is between the mean and that z score.

Reading The Z Table
It tells you the area (% of people in the population) between the mean (in the middle) and any z score!

z Table Quick Example
– Let’s start (somewhat) easy. Look for the area between the mean and a z of 0.99 and the area between the mean and a z of 1.01 in Appendix B1. (No need to worry about the + or – just yet, as the normal curve is symmetrical and the left side scores mirror the right side scores)
The area for 0.99 is 33.89
The area for 1.01 is 34.38
This makes sense, because we already know that the percentage of scores between the mean and a z of 1 is 34.13%!

Area Between the Mean and Z = -2.33
– Let’s try a harder one. What if we have a z Score of -2.33? Then the area between the mean and z is 49.01!
This purple area represents the area between z = -2.33 and the mean. The table tells us that this area is 49.01%

The z Score
Wait, wait!
Hold on! How did we know what the area was for z = -2.33 when all the values in the table are positive?

– Vs + z Scores In The Table
Remember, the table tells us the area between the mean and z… Recall also that the mean is in the exact middle of the curve and that the curve is perfectly symmetrical on both sides.
So the area between any positive z value and the mean is the same as the area between the matching negative z value and the mean!
For example, the area (percentage) between the mean and z = 1 is 34.13% and the area between the mean and z = -1 is 34.13%

Let’s Try Another Example…
Okay, so what if I wanted to know the percentage of people (the percentage under the curve) that is above (i.e. higher than) a z-score of -2.33?

Finding The Area Above -2.33
– The table tells us that the area between z = -2.33 and the mean is 49.01% (the purple area)
– And we know that the entire half of the curve above the mean is 50% (the red area)
– So, 49.01 + 50 = 99.01% (purple area + red area)
– In other words, 99.01% of the population is higher than (above) a z-score of -2.33

This Can Tell Us About Probability!
– We could also say that there is a 99.01% chance that any given person’s score is higher than z = -2.33.
– Or, if we randomly selected a person, 99.01% of the time they would have a z-score higher than -2.33.

Let’s Do Another One….
What percent of scores fall BELOW a z score of 1.96?
This whole purple area is what we’re looking for.
The area under the curve (or the percentage of the whole population) that is below a z = 1.96
– The table tells us that the area between the mean and z is 47.5%
– We know that everything below the mean is 50% (the red area = 50%)
– 47.5 + 50 = 97.5%
ANSWER: 97.5% of the population falls below a z-score of 1.96 (red + purple)

Pop-Quiz 1: Quiz Yourself
Using Appendix B, what percent of scores fall ABOVE a z score of 0.98?
A). 34.14%
B). 83.65%
C). 16.35%
D). 33.65%

Answer 1: C
Let’s see how we got that answer….
– This is what we’re trying to find: the percent ABOVE a z score of .98
– The table in the appendix tells us that the area between the mean and a z score of .98 is 33.65% (the blue area)
– And we know that this entire half (the red) is 50%
– The whole red area minus the blue area will give us the purple: 50 – 33.65 = 16.35%
In other words, the area above the z of .98 (in purple) is 16.35%

I know these drawings seem a bit messy, but I want you to practice doing this too! Whenever you see a problem like this, draw it out! Your drawings don’t have to be perfect, but practice drawing a normal curve (with the mean in the center) and coloring in the area that you are trying to find!

When in doubt…. Draw it out!
Draw the area you are looking for (what do you need to know?)
Draw the areas you know (50% on each half, or areas that you can look up in the table in the book).

Take A Break!
Are you still with me? If not, here are some cute puppies for you to look at for a moment, so relax! Take a break for a minute! No, I mean it: go get a cup of coffee or a glass of water and then come back! Don’t worry… I’ll wait….
Ready? Okay? Feeling better? Good! Let’s keep going!

Finding The Percent Between Two Zs
– Now let’s say we have two z Scores and we want to figure out the probability that a third score will fall between these two scores. What do we do?
In other words, we want to know the area of the curve between those two scores.

Percent Between: Example 1
– If you already know the z Scores, it’s pretty simple. Find the area each z Score represents in Appendix B and subtract the smaller area from the larger area
If one z Score is 1.5 (area is 43.32) and the other is 0.24 (area is 9.48), then 43.32 – 9.48 = 33.84
Thus the chance of a score falling between these two scores is 33.84%
If this isn’t clear, don’t worry, we’ll do a lot of examples to help practice this!

Percent Between: Example 2
– Example: let’s say we have two very competitive high schoolers and they’re comparing SAT scores. Sam got a 1200 and Beth got a 1350. They want to know what percentage of people got a score between their two scores.
– A 1200 corresponds to a z score of .71
– A 1350 corresponds to a z score of 1.48
If you’re having trouble figuring out how I got these z-scores, scroll back up through the PowerPoint to the section on calculating z scores ☺
Let’s break this problem down into steps!

Step 1: Draw a normal curve
Step 2: Figure out the area on the curve that we want to know.
Area (% of people) between a z = .71 (Sam) and a z = 1.48 (Beth)
Step 3: Look up the areas between the mean and each of those z scores in the z table.
Area between the mean (m) and Sam’s z (.71) = 26.11%
Area between the mean (m) and Beth’s z (1.48) = 43.06%
We want to know the purple area (the area between Sam’s score and Beth’s score)

Step 4: Find the area between the two scores by subtracting the smaller area from the larger one.
Beth’s area – Sam’s area = area in between.
43.06 – 26.11 = 16.95%
16.95% of people got SAT scores between Beth and Sam’s scores (between 1200 and 1350)

Rule of Thumb (1 of 2):
If the two z-scores are on the same side of the mean (both positive or both negative) you subtract the areas between the mean and each z to get the area between.
These two z scores are on the same side of the curve (both negative), so you would subtract the areas between the mean and z to get the area between

Rule of Thumb (2 of 2):
If the two z scores are on opposite sides of the mean (one is positive and the other is negative) you add the areas from the table to get the area between.
These two z scores are on opposite sides of the curve (one is positive and the other is negative), so you would add the areas between the mean and z to get the area between

Let’s Practice:
What percentage of people fall between a z score of -1.3 and -2.1?
Answer on the next slide…..

Let’s Practice: Answer
ANSWER: The z scores are both negative so you subtract the areas.
Z = -1.3, area between mean and z: 40.32%
Z = -2.1, area between mean and z: 48.21%
48.21 – 40.32 = 7.89%

What If You Only Have The Raw Score?
– If you only know the raw scores, go back to the formula to find their corresponding z Scores: $z = \frac{x - \bar{x}}{s}$
– Let’s say we have scores 234 and 122
– SD is 111 and the mean is 186
z for “234” is (234 – 186) / 111 = .432 … area = 16.64
z for “122” is (122 – 186) / 111 = -.577 … area = 21.57
– One z Score is positive and the other is negative, so the two scores sit on opposite sides of the mean and we add the areas: 16.64 + 21.57 = 38.21%
A score will fall between these two scores about 38.21% of the time

More Examples With Raw Scores
– Let’s say we have scores 110 and 125
– SD is 10 and the mean is 100
z for “110” is (110 – 100) / 10 = 1.00 … area = 34.13
z for “125” is (125 – 100) / 10 = 2.50 … area = 49.38
– Both z Scores are positive (same side of the mean), so we subtract: 49.38 – 34.13 = 15.25%
There is a 15.25% chance a score falls between these two scores.

A Z Score of Zero
Finally, what does a z Score of zero (0) imply?
– Simple: a zero z Score is the mean (thus if the raw score is the mean, the z Score is automatically a zero).
– The further a z Score gets from the mean, the more extreme the value!
– A z Score of –2 is more extreme than a z Score of 1

Why Is The Z Score Important?
What z Scores Really Represent
Okay, so in statistics our goal is to use some criterion to judge whether we think an event is as likely, more likely, or less likely than what we expect by chance
The probability of an event occurring is the important aspect. That is, we predict whether an event will occur, and we assess whether this likelihood is greater than chance

The z Score & Probability
Consider the coin toss example in Chapter 8 (Salkind). If you flip a coin 10 times, how often should it come up heads?
If the coin is “fair”, we’d expect it to come up heads 5 times and tails 5 times (provided one coin flip is independent from the others)
So what would be unfair? 6 heads? 7? … 8? … 9? … 10?
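One way to put numbers on that question is to compute the chance of every possible number of heads. The sketch below is an illustration only: it assumes a fair coin (probability of heads = 0.5) and independent flips, and `prob_heads` is a hypothetical helper name.

```python
from math import comb


def prob_heads(k: int, flips: int = 10, p_heads: float = 0.5) -> float:
    """Binomial probability of getting exactly k heads in `flips` independent tosses."""
    return comb(flips, k) * p_heads**k * (1 - p_heads) ** (flips - k)


# Probability of each outcome from 0 heads to 10 heads.
for k in range(11):
    print(f"{k:2d} heads: {prob_heads(k):.2f}")

# All 1024 equally likely sequences are accounted for, so the probabilities sum to 1.
print(f"total: {sum(prob_heads(k) for k in range(11)):.2f}")
```

Each probability is just the number of flip sequences with k heads divided by the 1024 possible sequences, so 5 heads comes out at 252/1024.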
In probability testing, we must set the standard for what we consider “fair”, and in psychology we generally rely on 5%.

Statistical Significance
Psychologists use a p value set to less than .05 for statistical significance
– That is, any event that occurs by chance alone 5 times or less in 100 occasions is “rare”. In essence, we “allow” up to 5% error in our study and still conclude that it is significant. Psychologists deem this “rare” 5% or less occurrence as acceptable for significance
In our coin example, we look at all of the possible coin flips and assess how fair the result is. 10 heads (0 tails). 9 heads (1 tail). 8 heads (2 tails), etc. Each has a probability associated with it.

Number of Heads In 10 Flips    Probability
 0                             0.00
 1                             0.01
 2                             0.04
 3                             0.12
 4                             0.21
 5                             0.25
 6                             0.21
 7                             0.12
 8                             0.04
 9                             0.01
10                             0.00

How Rare Is That Outcome?
The probability here is based on 10 coin flips and all possible combinations (1024 of them!)
Psychologists tend to be conservative. We’ll say an outcome is unlikely if it occurs less than 5% of the time. That is, we’ll call the coin rigged if an unexpected outcome (like almost all heads on ten flips) is one that would occur by chance fewer than 5 times out of 100
– In other words, if the outcome is so rare that it occurs less than 5% of the time, we can conclude that it is so unlikely that something other than mere chance is responsible for the outcome. Like what? Well, maybe a bogus coin!

How Likely Is That Outcome (2)?
If we look at our table, the MOST likely outcome for 10 coin flips is 5 heads / 5 tails
– This occurs 25% of the time by chance
The next most likely outcomes are 4 heads / 6 tails and 6 heads / 4 tails, each of which occurs 21% of the time
– Add them all up and you get about 100% (.00 + .01 + .04 + .12 + .21 + .25 + .21 + .12 + .04 + .01 + .00 = 1.01; the extra .01 comes from rounding)
While 7 heads (3 tails) or 3 heads (7 tails) are less likely (each occurs 12% of the time), they are still possible, right?

Rare Outcomes
But what about 8 heads or 2 heads? Those occur only 4% of the time.
Pretty rare, so something might be wrong with that coin!
Same with 9 heads or 1 head, which occur only 1% of the time.
And 10 heads or 0 heads? Virtually impossible (I rounded to 0!)
If any of these occur (each has a probability of less than 5%), something fishy must be going on!
Thus 8, 9, or 10 heads each occur less often than our 5% standard allows by chance (and so do 0, 1, or 2 heads). But anywhere from 3 to 7 heads occurs within chance.

The z Score Tells Us About Probability!
Just how extreme would a score need to be to qualify as unlikely to occur by chance alone?
– The same logic applies to z scores: the probability of randomly selecting someone from the population who has a z score bigger than 2 is very small, because we know those are rare scores based on the shape of the normal distribution.
– In fact, according to the table in Appendix B, any z score of 1.65 or higher is pretty rare and happens roughly 5% of the time just by chance.
– More on this later, but this is where we get our 5% from when we talk about statistical significance: the probability that an effect would occur just by chance! If it’s less than 5%, then it’s significant!

Brief Summary
– Z scores are a standard way of representing each score (how many standard deviations is each person from the mean?).
– We can use the properties of the normal curve to understand how each score in the distribution relates to other scores in the distribution. Use the z score table to figure these out!
– This can give us insight into the probability of selecting any one of those scores (given how “rare” or “common” the score might be).
– We use the standard of p < .05 in psychology as our baseline for “acceptable” risk when hypothesis testing. When both tails of the curve are counted, that 5% lies beyond a z score of ±1.96 (2.5% in each tail).

One Last Practice Example!
Amelia needs to score in the top 10% of this methods class in order to get on the Dean’s list. The class mean is 85% and the standard deviation is 3.5. What raw score does Amelia need in order to get on the Dean’s list?
NOTE: This question is similar to one in your example problems at the end of the Salkind chapter. Try to do that one on your own if you need to puzzle out another example!
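If you want to check your table work from this part, here is a minimal Python sketch. It is not part of the Salkind chapter. It uses the standard library’s `statistics.NormalDist` in place of the Appendix B table, and the helper names `area_between` and `raw_cutoff_for_top` are made up for this example. It assumes the scores are normally distributed. Because it works from exact z values instead of a table rounded to four decimals, its answers can differ from the table answers in the last decimal place.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, SD 1: the z score distribution


def area_between(z_low, z_high):
    """Proportion of scores falling between two z scores.

    Subtracting cumulative areas covers both rules of thumb: it gives the
    same answer as subtracting table areas (same side of the mean) or
    adding them (opposite sides of the mean).
    """
    return STANDARD_NORMAL.cdf(z_high) - STANDARD_NORMAL.cdf(z_low)


def raw_cutoff_for_top(fraction, mean, sd):
    """Raw score separating the top `fraction` of scores from the rest."""
    # Find the z score with (1 - fraction) of the curve below it,
    # then turn the z formula around: x = mean + z * sd
    z_cutoff = STANDARD_NORMAL.inv_cdf(1 - fraction)
    return mean + z_cutoff * sd


sam_beth = area_between(0.71, 1.48)   # Sam's z and Beth's z
practice = area_between(-2.1, -1.3)   # the "Let's Practice" question
amelia = raw_cutoff_for_top(0.10, mean=85, sd=3.5)

print(f"Between Sam and Beth: {sam_beth:.2%}")
print(f"Between z = -2.1 and z = -1.3: {practice:.2%}")
print(f"Score Amelia needs: {amelia:.2f}")
```

Working Amelia’s problem by hand follows the same two steps: find the z score that leaves 10% of the curve above it (about 1.28 in the table), then convert it back to a raw score with mean + z × SD, which comes out to roughly 89.5.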