Psychology Computing Correlation Coefficients Questions

Description

Two Miami Heat basketball fans are arguing about whether older or younger players are more skilled in basketball. One friend thinks that older players are actually more efficient and more likely to make 3-pointers. They decide to test this with a correlation, looking specifically at the relationship between players’ age and their 3-pointer percentage (calculated as the number of 3-point shots made divided by the number of 3-point shots attempted, with a higher score indicating more shots made).

HEAT PLAYERS         Age (x)   3-Pointer Percentage (y)
Bam Adebayo          25        8.3
Caleb Martin         27        35.6
Cody Zeller          30        0
Dewayne Dedmon       33        29.7
Dru Smith            25        16.7
Duncan Robinson      22        32.8
Gabe Vincent         26        33.4
Haywood Highsmith    26        33.9
Jamal Cain           24        35
Jamaree Bouyea       23        40
Jimmy Butler         33        35
Kevin Love           34        29.7
Kyle Lowry           37        34.5
Max Strus            27        35
Nikola Jovic         20        22.9
Omer Yurtseven       24        42.9
Orlando Robinson     22        0
Tyler Herro          23        37.8
Udonis Haslem        43        33.3
Victor Oladipo       31        33
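
For readers who want to check a hand calculation, here is a minimal Python sketch that applies the computational (raw-score) Pearson formula from the attached chapter notes to the table above. The function name `pearson_r` and the variable names are illustrative choices, not part of the assignment, and the data are simply the 20 rows as posted.

```python
import math

# Ages (x) and 3-pointer percentages (y), in the same row order as the table.
ages = [25, 27, 30, 33, 25, 22, 26, 26, 24, 23,
        33, 34, 37, 27, 20, 24, 22, 23, 43, 31]
three_pt_pct = [8.3, 35.6, 0, 29.7, 16.7, 32.8, 33.4, 33.9, 35, 40,
                35, 29.7, 34.5, 35, 22.9, 42.9, 0, 37.8, 33.3, 33]


def pearson_r(xs, ys):
    """Pearson r via the raw-score formula:
    (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))."""
    if len(xs) != len(ys):
        raise ValueError("x and y must have the same number of scores")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(ages, three_pt_pct)
print(f"r between age and 3-pointer percentage: {r:.3f}")
```

The sign of the printed value gives the direction of the relationship and its absolute value gives the strength; neither says anything about whether age causes the difference.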

 

Unformatted Attachment Preview

Chapter Five (Salkind)
Ice Cream and Crime: Computing Correlation Coefficients

Ice Cream and Crime

During our last lecture, I asked you to ponder this odd finding: as the amount of ice cream people eat increases, the crime rate also increases! I then posed this question:
– Does eating ice cream cause crime?

An Overview of This Chapter

I am hoping you can already guess the answer: NO. However, there is a strong relationship between the two. That is, they are co-related, which brings us to our current chapter on co-relations (or correlations)!
– In essence, a correlation means that as the value of one thing changes, the value of another thing also changes.
– We express this relationship through a correlation coefficient.

We Will Cover The Following Items

Part One: What Are Correlations All About?
Part Two: Computing A Simple Correlation Coefficient
Part Three: Understanding What The Corr. Coefficient Means
Part Four: A Determined Effort: Squaring the Corr. Coefficient
Part Five: Other Cool Correlations
Part Six: Using The Computer To Compute A Corr. Coefficient
Part Seven: An Eye Toward The Future

Part One: What Are Correlations All About?

Our first task in this chapter is defining and understanding the nature of a correlation coefficient. This is simply a numerical index that reflects the relationship between two variables.
– This descriptive statistic (remember what those are?) has a range from -1 to +1.
– -1 indicates a perfect negative correlation (as one variable increases, the other decreases).
– +1 indicates a perfect positive correlation (as one variable increases, the other increases; also, as one decreases, the other decreases).

The Pearson Correlation

– We’re going to talk a lot about one specific correlation in this chapter: the Pearson Product Moment Correlation.
– NOTE: You might also see this called a bivariate correlation (a correlation between two variables, hence the “bi” prefix).
– Pearson is used for continuous variables (interval or ratio scales), but other correlational tests are available for nominal and ordinal variables.

Types of Correlation Coefficients: Flavor 1 and Flavor 2

We noted that correlations can be positive or negative, but set those terms aside for a moment in favor of the terms that your textbook author Salkind prefers: “direct” and “indirect”.
– A “direct” correlation (the positive one) means that the variables change in the same direction (both go up or both go down).
– An “indirect” correlation (the negative one) means that the variables change in opposite directions.
I use these names interchangeably, so just be aware that the terms are synonymous (positive = direct; negative = indirect).

What can correlations do?

It is important to note that when we look at correlations, we are looking at variables, not at specific individuals.
– We can (and do) make predictions about how people on average may respond, but we do not predict how a specific, unique individual will respond.
– The same goes for correlations. We can predict how variables correlate among groups of people, but we should not focus on individuals’ scores.

Properties Of A Correlation: Things To Keep In Mind

1. A correlation ranges from -1 to +1.
2. The closer a correlation is to -1 or +1, the stronger it is.
Thus a .70 is stronger than a .45, but a -.77 is stronger than a .70. The sign itself is irrelevant in terms of strength!
3. A correlation represents the relationship between AT LEAST TWO variables.
4. A negative correlation is neither good nor bad on its own, just as a positive correlation is neither good nor bad. The sign only tells you the direction of the correlation (direct or indirect).
5. Pearson is represented by the lowercase letter r.

What is a correlation of zero?

So what does a correlation of zero tell you?
– It’s actually pretty easy: if r = 0, then there is no (linear) relationship between the variables! As one variable increases … well, nothing systematic happens to the other variable.

Things To Keep In Mind: A correlation also needs room to vary. If one variable is static (the same across all participants), then changes in the other variable cannot affect the correlation.
– If you all get 100% on the next exam and I correlate that with the hours you spent studying, we won’t get a correlation. Study time can’t correlate with exam scores if you all get 100%.

Pop-Quiz 1: Quiz Yourself

This is another word for a negative correlation:
A). Direct correlation
B). Weak correlation
C). Strong correlation
D). Indirect correlation

Answer 1: D

Part Two: Computing A Simple Correlation Coefficient

Now for the fun part. Are you ready for our next stomach-ache-inducing formula? Here it is:

$$r_{xy} = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

Don’t worry!
Like our prior chapters, let’s walk our way through this formula, once again plugging in numbers.

Correlation Coefficient Formula

$$r_{xy} = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

– r_xy is the correlation coefficient of X and Y (the outcome, our goal)
– n is the sample size
– Σ is the “sum of” sign
– X is the individual’s score on the X variable
– Y is the individual’s score on the Y variable
– XY is the product of each X score times its corresponding Y score
– X² is the individual’s X score, squared
– Y² is the individual’s Y score, squared

Example Study Set Up

Imagine we conduct a study looking at the correlation between teen relationship “satisfaction” and features of that relationship.
– We could look at several different features, like “Personal Growth”, “Appreciation”, “Exhilaration and Happiness”, “Painfulness and/or Emotional Turmoil”, “Passion and Romance”, “Emotional Support”, “Good Communication”, and “Togetherness”.
– For now, let’s focus on satisfaction and happiness.
– Let’s say satisfaction (X) ranges from 1 (low) to 9 (high).
– Let’s say happiness (Y) also ranges from 1 (low) to 9 (high).

Data Set Up

One participant has an X score of 4 and a Y score of 3. Another participant has an X score of 6 and a Y score of 7. And so forth for seven total participants:

Participant   X (Satisfaction)   Y (Happiness)
1             4                  3
2             6                  7
3             3                  3
4             4                  5
5             7                  6
6             3                  2
7             5                  7

Data Set Up: The Handy Table

Whenever you have to calculate a correlation coefficient, I suggest you make a table that looks like this. It will help to organize your data a little. (The X² and Y² columns get filled in as we go.)

Subject    X    Y    X²   Y²   XY
1          4    3              12
2          6    7              42
3          3    3              9
4          4    5              20
5          7    6              42
6          3    2              6
7          5    7              35
Σ (Sum)    32   33             166

Plugging in n (the sample size)

n is 7 (seven total participants, each providing X and Y scores). We’ll eventually plug “7” into every n in the formula.

The Sum of X: ΣX

ΣX is the sum of all X scores from all seven participants.
That is, 4 + 6 + 3 + 4 + 7 + 3 + 5 = 32.

The Sum of Y: ΣY

ΣY is the sum of all Y scores from all seven participants. That is, 3 + 7 + 3 + 5 + 6 + 2 + 7 = 33.

The Sum of XY: ΣXY

XY is each subject’s X score times their own Y score. For example, XY for Subject 1 is 4 × 3 = 12; XY for Subject 2 is 6 × 7 = 42; etc.
– ΣXY is 166!

The Sum of X²: ΣX²

X² is each X value squared. Subject 1 is 4 × 4 = 16; Subject 2 is 6 × 6 = 36; etc.
– ΣX² is 160.

The Sum of Y²: ΣY²

Y² is each Y value squared. Subject 1 is 3 × 3 = 9; Subject 2 is 7 × 7 = 49; etc.
– ΣY² is 181.

The Completed Table

Subject    X    Y    X²    Y²    XY
1          4    3    16    9     12
2          6    7    36    49    42
3          3    3    9     9     9
4          4    5    16    25    20
5          7    6    49    36    42
6          3    2    9     4     6
7          5    7    25    49    35
Σ (Sum)    32   33   160   181   166

Plugging in Values

n is 7, ΣX is 32, ΣY is 33, ΣXY is 166, ΣX² is 160, and ΣY² is 181.

$$r_{xy} = \frac{7(166) - (32)(33)}{\sqrt{\left[7(160) - (32)^2\right]\left[7(181) - (33)^2\right]}} = \frac{1162 - 1056}{\sqrt{[1120 - 1024][1267 - 1089]}}$$

$$r_{xy} = \frac{106}{\sqrt{[96][178]}} = \frac{106}{\sqrt{17088}} = \frac{106}{130.72} \approx 0.8108$$

Phew! We’re done!

What does it mean?

Simple, right? Our correlation between relationship satisfaction and teenagers’ happiness in that relationship is .811 (rounded from .8108). But is that a strong relationship?
– We know that the closer you are to -1 or +1, the stronger the relationship, so where does .811 fall?
– Very strong relationship, right! Here’s a chart with a good rule of thumb for the strength of a correlation:

Interpreting a Correlation Coefficient

Size of the Correlation Coefficient   General Interpretation
.8 to 1.0 (+ or –)                    Very Strong Relationship
.6 to .8 (+ or –)                     Strong Relationship
.4 to .6 (+ or –)                     Moderate Relationship
.2 to .4 (+ or –)                     Weak Relationship
.0 to .2 (+ or –)                     Weak or No Relationship

So our correlation of .811 is “very strong” here, and it is a direct (positive!) correlation.

Pop-Quiz 2: Quiz Yourself

Which of the following correlations would be interpreted as a weak relationship?
A). -.26
B). -.46
C). -.66
D). -.86

Answer 2: A

The Scatterplot

One way to present a correlation is through the correlation coefficient, as we just saw. An alternative is a scatterplot (scattergram), which plots each pair of scores against two axes:
1. Draw the x-axis (horizontal) and the y-axis (vertical).
2. Mark each axis with the range of values in the data set.
– In our relationship satisfaction example, our range is 1 to 9.
3. Pair up the two scores from each participant and find where they intersect on the scatterplot.

Data for a Scatterplot

Once again, we have our Satisfaction and Happiness scores:

Participant   Satisfaction Score (X)   Happiness Score (Y)
1             4                        3
2             6                        7
3             3                        3
4             4                        5
5             7                        6
6             3                        2
7             5                        7

[Figure: example scatterplot; x-axis: Satisfaction, y-axis: Happiness]

Points On The Scatterplot

Let’s find a data point on our plot. Each point on the plot corresponds to one participant’s X and Y scores.
A Visual Picture Of A Correlation: The Scatterplot

[Figure: scatterplot of Happiness (y-axis) against Satisfaction (x-axis), with Participant 4 marked: over 4, up 5]

A Perfect Correlation

Of course, it would be nice to get a perfect correlation (as one variable changes, the other changes at an equal rate), but this rarely happens. Still, imagine what a perfect correlation would look like: as one variable increased, the other would increase at the same rate (a direct, or positive, correlation).
– First, consider the table we would get if we used our relationship satisfaction study:

Data Representing Perfect Correlation

Participant   Satisfaction Score (X)   Happiness Score (Y)
1             4                        4
2             6                        6
3             3                        3
4             2                        2
5             7                        7
6             8                        8
7             5                        5

[Figure: scatterplot of a perfect correlation; axes X and Y]

The Positive Correlation

This is a perfect positive correlation, which means that as satisfaction increases, so does happiness.
– It also implies that as happiness decreases, relationship satisfaction also decreases (if happiness is a 2, satisfaction is a 2; if happiness is an 8, satisfaction is an 8).
– This is why both variables increasing or both variables decreasing are considered direct (i.e., positive) correlations.

Now, let’s consider a perfect negative correlation. As one variable increases, the other decreases:

A Negative Correlation

Participant   Satisfaction Score (X)   Happiness Score (Y)
1             4                        6
2             6                        4
3             3                        7
4             2                        8
5             7                        3
6             8                        2
7             5                        5

[Figure: scatterplot of a negative correlation; axes X and Y]

Curvilinear Relationships

Perfect correlations (which can be represented by a straight line on which all data points align) are very rare, especially in psychology. Our original relationship satisfaction scatterplot is much more typical.

Yet some correlations are curvilinear in nature. For example, memory scores might be really low at both young and old ages but high in middle adulthood.
Consider this memory scatterplot (X is age and Y is memory score).

Curvilinear Relationship Example Data

Consider Age (10 to 70) and Memory (0 to 100) scores:

Participant   Age (X)   Memory (Y), 0-100
1             10        18
2             20        35
3             30        47
4             40        52
5             50        47
6             60        35
7             70        18

[Figure: curvilinear scatterplot; X is age, Y is memory score]

Showing Many Correlations at Once

In our relationship satisfaction example, we looked at two variables: satisfaction and happiness. But as you saw, there are a lot of variables we could correlate with relationship satisfaction and with each other.
– “Personal Growth”, “Appreciation”, “Painfulness and/or Emotional Turmoil”, “Passion and Romance”, “Emotional Support”, “Good Communication”, and “Togetherness”
When there are multiple variables you want to correlate, you can use a correlation matrix.

The Correlation Matrix

               Satisfaction   Appreciation   Pain     Romance   Growth
Satisfaction   1.00
Appreciation   .543           1.00
Pain           -.342          -.122          1.00
Romance        .232           .564           -.234    1.00
Growth         .342           .654           -.178    .325      1.00

Understanding the Correlation Matrix

A few things to note about the matrix:
1. You might note that there are lots of “perfect” direct correlations (that is, lots of 1.00 correlations). Look closer: you’ll see that those relationship features correlate with themselves! Of course “Satisfaction” correlates perfectly with “Satisfaction”! It’s the same variable.
2. You also see a lot of positive correlations. For example, as satisfaction increases, so do appreciation (.543), romance (.232), and growth (.342).
3. But you’ll also see some negative correlations, all of which involve pain (-.342 with satisfaction; -.122 with appreciation; -.234 with romance; and -.178 with growth).
– This makes sense, right?
As pain increases, the happier relationship features tend to decrease.

Part Three: Understanding What The Correlation Coefficient Means

As we saw, interpreting a correlation coefficient has a degree of flexibility. We know that the closer to -1 or +1, the stronger the correlation. We know that the – or + sign tells us the direction of the correlation. Now what?

The easiest way to interpret the correlation is by eyeballing it. Your Salkind book gives us this table, which we can use in our subjective “eyeballing” judgments:

Interpreting A Correlation Coefficient

Size of the Correlation Coefficient   General Interpretation
.8 to 1.0 (+ or –)                    Very Strong Relationship
.6 to .8 (+ or –)                     Strong Relationship
.4 to .6 (+ or –)                     Moderate Relationship
.2 to .4 (+ or –)                     Weak Relationship
.0 to .2 (+ or –)                     Weak or No Relationship

Pop-Quiz 3: Quiz Yourself

What is stronger, a +.52 correlation or a -.58 correlation?
A). +.52
B). -.58
C). Neither is stronger than the other
D). Both are equally strong

Answer 3: B

Part Four: A Determined Effort: Squaring The Correlation Coefficient

More Ways To Interpret A Correlation

Eyeballing the correlation coefficient is nice and easy, but to get a more precise measurement, psychologists look to the coefficient of determination.
– Here, we want to look at the percentage of variance in one variable that is accounted for by the variance in another.
– In other words, what percentage of the two variables overlap?
– The more two variables share in common, the more related they are.

GPA Example

For example, think about your current college GPA. What factors do you think contribute to your GPA?
– Your IQ
– Your study skills
– The complexity of your classes (easy A’s vs. tough B’s)
– The quality of your instructor (this better be a huge factor!)
– Your wakefulness (are you usually wide awake or fatigued?)
– Your drunkenness (hopefully low on class days!)

How Much Does Each Contribute?

– Each of those factors can contribute to your GPA, but even combined, they are unlikely to account for all of your GPA. For example:
  Study skills might contribute 34%
  Ease of class another 10%
  Instructor 12%
– There are tons of additional factors that you might not even be aware of that influence your GPA.
– So can we find the percentage that each factor contributes to another variable? Sure: take the correlation coefficient and square it!

Squaring The Correlation Coefficient

Recall the correlation between satisfaction and happiness, which was .8108 (our Pearson score).
– We simply square .8108: .8108 × .8108 = .657.
– We can make this into a percentage by multiplying by 100: 100 × .657 = 65.7%.
– Here, you see that 65.7% of the relationship satisfaction variance can be explained by happiness. That is, the two variables share 65.7% of their variance.

Coefficient of Determination

What we are looking at here is the coefficient of determination, which is a fancy way of saying we are looking at the amount of variance in one variable that is accounted for by the variance in another variable (the 65.7% overlap we just calculated).
– The more two variables share in common (the overlapping portion of the diagram), the more they are related.
– In our example, the overlapping portion represents the 65.7% overlap between the two variables.
– This is just a way of translating our correlation coefficient into a percentage (which might be a little easier to interpret than the correlation).
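
As a quick check on the arithmetic, the sketch below recomputes r from the lecture’s column sums and then squares it. The variable names are illustrative assumptions; only the six sums come from the satisfaction/happiness table.

```python
import math

# Column sums from the seven-participant satisfaction/happiness table.
n = 7
sum_x, sum_y = 32, 33
sum_xy, sum_x2, sum_y2 = 166, 160, 181

# Raw-score Pearson formula, one piece at a time.
numerator = n * sum_xy - sum_x * sum_y          # 1162 - 1056 = 106
x_part = n * sum_x2 - sum_x ** 2                # 1120 - 1024 = 96
y_part = n * sum_y2 - sum_y ** 2                # 1267 - 1089 = 178
denominator = math.sqrt(x_part * y_part)        # sqrt(17088)

r = numerator / denominator
r_squared = r ** 2                              # coefficient of determination

print(f"r = {r:.3f}")
print(f"shared variance = {r_squared:.1%}")
```

Because the code squares the unrounded r, the shared variance comes out at roughly 65.8%, a hair above the 65.7% obtained by squaring the rounded value .8108. The difference is only rounding.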
Coefficient of Non-Determination

– So what about the parts of each variable that don’t overlap?
– We call the non-shared variance (the non-overlapping portion of the diagram) the coefficient of non-determination (or alienation).

Calculating Non-Determination

– You simply take 1 – the coefficient of determination (or 100 – the percentage of overlap, if you have already turned the coefficient of determination into a percentage).
– 1 – .657 = .343, so .343 (about 34%) of our happiness is NOT accounted for by relationship satisfaction (or about 34% of our relationship satisfaction is NOT accounted for by our happiness).

How much is left over?

– We can also think of the coefficient of non-determination as what’s left over after we account for the relationship between two variables.
– After we take away the parts that overlap between happiness and satisfaction, what’s left over in each of the variables?

Example: Height and Income

– Let’s say there is a correlation of r_xy = .45 between height and income (it is actually true that height and income are correlated, but the number that I have here for this Pearson correlation coefficient is made up ☺).

Overlap Between Height and Income

– The coefficient of determination would be .45² = .2025, or roughly 20%. This means that roughly 20% of the variance in people’s income is associated with how tall they are. Height and income overlap by about 20%.
– The coefficient of non-determination would be 100 – 20 = 80%. This means that roughly 80% of the variance in people’s income is NOT associated with how tall they are. We took 100% (all the variance), subtracted the overlap (20%), and are left with 80%.

Pop-Quiz 4: Quiz Yourself

If the correlation between variables is .70, what percent of the variance is shared (i.e., the coefficient of determination)?
A). 70%
B). 51%
C). 49%
D). 30%

Answer 4: C (simply .70 × .70 = .49, or 49%)

Pop-Quiz 5: Quiz Yourself

Here’s an easy one! If the correlation between variables is .70, what percent of the variance is NOT shared variance (the coefficient of non-determination / alienation)?
A). 70%
B). 51%
C). 49%
D). 30%

Answer 5: B (this is 100% – our 49% coefficient of determination)

Pop-Quiz 6: Quiz Yourself

Okay, here’s a tougher one. If the coefficient of determination between two variables is .81, what is the Pearson correlation coefficient?
A). .19
B). .34
C). .66
D). .90

Answer 6: D (the square root of .81 is .90)

Going Back To Ice Cream and Crime

As More Ice Cream Is Eaten … The Crime Rate Goes Up

Okay, we are back to the question we asked at the start of this chapter:
– As ice cream consumption increases, so does the crime rate. So does one cause changes in the other?
No, of course not, but there is a relationship. What else might underlie that correlation? Can you think of anything?

Does Ice Cream Cause Crime?

When are people more likely to eat ice cream? When it is hot outside! And when is crime more likely to occur? When it is hot outside! Temperature might be a factor that impacts BOTH ice cream consumption AND robberies. In other words, there is merely an “associated” relationship between ice cream and crime rather than a causal relationship.

Correlation Does Not Equal Causation

Remember how we found that teenagers’ happiness and relationship satisfaction were correlated? Does that mean that happiness causes relationship satisfaction? No!
– Participants’ personalities (outgoing, bubbly people) may make them both more satisfied and happier.
– This is a “third variable” problem (something else, like the teen’s personality, might be responsible).

CORRELATION DOES NOT EQUAL CAUSATION

Part Five: Other Cool Correlations

There are different ways of looking at different kinds of variables, though they go beyond what we will look at in this chapter:
– Phi coefficient (for categorical data)
– Spearman rank coefficient (for ranked data)
– Pearson correlation coefficient (for scaled data)

For our purposes here, just recognize that no matter the test you use, you are looking at the associative relationship among variables, not a causal relationship. We’ll get to such causal relationships later this semester!

Post a comment in which you explain: What is another possible explanation for the correlation that you calculated in your Individual HW Assignment 7? Since this is just a correlation, we can’t be sure whether age actually causes changes in the percentage of 3-pointers players make, so what is one other possible explanation for the correlation that you calculated? (Hint: think about possible “third variable” explanations for this correlation. What other variables or factors might be causing changes in both x and y that could explain this correlation?)
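
To make the “third variable” idea concrete, here is a small Python sketch of the ice cream and crime example. Every number in it is made up for illustration: `temperature`, the two `wiggle` lists, and the multipliers are hypothetical, chosen only so that temperature drives both outcomes while neither outcome influences the other.

```python
import math


def pearson_r(xs, ys):
    """Pearson r from deviation scores (equivalent to the raw-score formula)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cross = sum(a * b for a, b in zip(dx, dy))
    return cross / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Hypothetical third variable: daily temperature on ten days.
temperature = list(range(20, 30))

# Hypothetical day-to-day variation that has nothing to do with temperature.
wiggle_ice_cream = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]
wiggle_crime = [1, 1, -1, -1, 1, 1, -1, -1, 1, 1]

# Each outcome depends on temperature plus its own wiggle.
# Ice cream never appears in the crime formula, and vice versa.
ice_cream = [2 * t + w for t, w in zip(temperature, wiggle_ice_cream)]
crime = [3 * t + w for t, w in zip(temperature, wiggle_crime)]

r_ice_cream_crime = pearson_r(ice_cream, crime)
r_without_temperature = pearson_r(wiggle_ice_cream, wiggle_crime)

print(f"ice cream vs. crime: r = {r_ice_cream_crime:.2f}")
print(f"same data with temperature's contribution removed: r = {r_without_temperature:.2f}")
```

In this toy setup ice cream sales and crime correlate very strongly even though neither causes the other, and the relationship disappears once temperature’s contribution is taken out. The same logic applies when looking for a third variable behind the age and 3-pointer correlation.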