## MATH 225N Week 8 Discussion: Correlation and Regression

For grading purposes, this particular discussion posting area runs from Sunday Feb 21 through **SATURDAY** Feb 27, inclusive.

This Week we explore two-variable statistics: linear correlation, simple linear regression, the coefficient of determination, “correlation versus causation,” scatter plots, and more!

Please don’t forget to use an “**outside**” resource as part of the content and documentation for your first Post. That is the Post due on or before Wednesday of the Week, where you make your main contribution to the Weekly discussion posting area and address the discussion prompts for the Week. The resource could be a web site that you discovered on the internet, so long as it is relevant and substantial and does not violate the Chamberlain University policy on prohibited web sites. It could also be a reference that you find through the online Chamberlain University Library (click Resources along the left, then click Library to find the link).

Please check out the link below for some information about simple linear regression, the coefficient of determination, and the concept of standard error.

Link

This is one example of using an “outside” source to add to what is covered in our Weekly Lesson in Modules and in our Weekly text book reading.

Please don’t forget to look over the Graded Discussion Posting Rubric each Week to be certain that you are meeting all of the Frequency requirements as well as all of the Quality requirements for graded discussion posting each Week.

If you have any questions about anything, please do not hesitate to post in the Q & A Forum discussion posting area or to send me a direct e-mail message at CSmith10@chamberlain.edu.

Thanks Friends and Good Luck ! Work hard and learn a lot !!

Sincerely, Mr. Smith Chamberlain University Math, Statistics, and Quantitative Research

To get our Week 8 graded Posting area started, I made up some data about predicting Total Cholesterol from BMI, as an example of simple linear regression, linear correlation, and the coefficient of determination.

Please see the attached Excel spreadsheet with the data and the results from the simple linear regression procedure.

PLEASE do not attach any real world significance to the regression equation or the values for r, r², etc.

Remember please, I MADE THE DATA UP !! ( LOL )

I used BMI as the independent variable and Total Cholesterol as the dependent variable.

**PLEASE** note that in my example in this Post I am treating BMI as the *independent* variable. However, if you **PLEASE** carefully read your Posting assignment for Week 8 near the very top of this web page, the way the assignment prompts are written strongly suggests that you all treat BMI as the *dependent* variable in your Posts!! I won’t absolutely require you to do that, but I wanted to point out this difference and make a big deal out of it, to hopefully avoid a lot of confusion in the Week 8 class member Posts as a whole.
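
To see why the choice of independent versus dependent variable matters, here is a minimal Python sketch. The six data pairs are invented purely for illustration (they are not the values in the attached spreadsheet), and `fit_line` and `correlation` are hypothetical helper names. The sketch fits the least-squares line both ways: swapping the roles of the two variables gives a different prediction equation, even though r stays the same.

```python
from math import sqrt

# Hypothetical data, invented only for this sketch (NOT the spreadsheet data).
bmi = [21, 23, 25, 27, 29, 31]
chol = [170, 175, 200, 190, 215, 230]

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def correlation(x, y):
    """Return the linear correlation coefficient r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# BMI as the independent variable (as in this Post's example).
slope_a, intercept_a = fit_line(bmi, chol)
# BMI as the dependent variable (as the assignment prompts suggest).
slope_b, intercept_b = fit_line(chol, bmi)

print(f"Cholesterol = {slope_a:.3f} * BMI + {intercept_a:.3f}")
print(f"BMI = {slope_b:.4f} * Cholesterol + {intercept_b:.3f}")
print(f"r either way = {correlation(bmi, chol):.3f}")
```

The product of the two slopes equals r², so the two fitted lines describe the same line only when the correlation is perfect.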

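The **prediction equation** was Total Cholesterol = 6.25 × BMI + 33

In other words, multiply BMI by 6.25 and then add 33. The slope of 6.25 means that each 1-unit increase in BMI raises the predicted Total Cholesterol by 6.25.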

So, everyone, please find **predicted y-values** for the following BMI values using the **prediction equation**.

( OK LOL *not everyone* but hopefully at least 2 or 3 class members )

Use BMI values of 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, and 31.

What you will be calculating there are predicted values for Total Cholesterol.
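
As a rough sketch of that calculation, the short Python snippet below plugs each of the listed BMI values into the prediction equation from this Post. The function name `predict_cholesterol` is just an illustrative choice.

```python
def predict_cholesterol(bmi):
    """Predicted Total Cholesterol from the made-up prediction equation."""
    return 6.25 * bmi + 33

bmi_values = range(21, 32)  # 21 through 31 inclusive
predictions = {bmi: predict_cholesterol(bmi) for bmi in bmi_values}

for bmi, predicted in predictions.items():
    print(f"BMI {bmi}: predicted Total Cholesterol = {predicted:.2f}")
```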

Using the prediction equation for x-values that lie inside the range of the original data is called interpolation.

Notice that we did not plug in a BMI value such as 15 or 37, because those values would be outside the range of the BMI values in the original ordered pairs.

So in other words, we are avoiding the practice of “extrapolation” here.

Many text books and authors caution against extrapolation, because the linear pattern seen in the data may not continue outside the range that was actually observed.

You will see Folks saying that when you use a regression equation (prediction equation) to calculate a predicted value, you should stay between the lowest and highest values of the independent variable that were used to calculate the equation in the first place.
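
One way to build that caution into a calculation is sketched below in Python. The bounds 21 and 31 are an assumption: they are simply the smallest and largest BMI values listed in this Post, and the true minimum and maximum are in the attached spreadsheet. `predict_within_range` is a hypothetical helper name.

```python
def predict_within_range(bmi, lowest_bmi=21, highest_bmi=31):
    """Predict Total Cholesterol, refusing to extrapolate.

    lowest_bmi and highest_bmi are ASSUMED bounds; replace them with the
    actual smallest and largest BMI values in the original data.
    """
    if not lowest_bmi <= bmi <= highest_bmi:
        raise ValueError(
            f"BMI {bmi} is outside {lowest_bmi} to {highest_bmi}; "
            "predicting there would be extrapolation."
        )
    return 6.25 * bmi + 33

print(predict_within_range(25))  # inside the assumed range: 189.25

try:
    predict_within_range(37)     # outside the assumed range
except ValueError as err:
    print(err)
```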

The value for the linear correlation coefficient r was approximately 0.77

The value for the coefficient of determination r² was approximately 0.59

That means that ( for these particular sample data ) approximately 59% of the variation in Total Cholesterol was explained by the variation in BMI.
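
The link between r and r² can be checked with a couple of lines of Python. This is only a sketch: it starts from the rounded value r ≈ 0.77 reported above, so the result is approximate too.

```python
r = 0.77            # rounded value of r from this Post
r_squared = r ** 2  # coefficient of determination

print(f"r^2 = {r_squared:.4f}")                                 # r^2 = 0.5929
print(f"About {r_squared:.0%} of the variation is explained.")  # About 59% ...
```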

Question for everyone:

( only 3 or 4 class members literally need to Post an answer for this – **NOT literally everyone** ! LOL )

So for these data, approximately what percent of the variation in Total Cholesterol was NOT explained by the variation in BMI??

The linear correlation coefficient was positive and was moderately strong.

The positive linear association is not too surprising. A link between BMI and Total Cholesterol (both increasing together or decreasing together) seems fairly reasonable.

It is important in Week 8 to learn that “correlation does not imply causation.”

So based on these sample data alone we cannot say that a higher BMI “causes” a higher Total Cholesterol.

But we can note the positive linear association ( “link” or “connection” ) in these particular sample data.

Thanks Friends and Best Wishes !

Enjoy, Friends, and please don’t forget that Week 8 ends on a SATURDAY and NOT on a Sunday!!

Thanks and take good care !!

Please see attached for the data and simple linear regression results and linear correlation results.

Regression Correlation_MATH 225 N BMI and Total Cholesterol Example Smith.xlsx