Comparison of Multiple Groups via ANOVA
1) Download the Nutrition study data and read it into RStudio. We will work with the entire data set for this assignment. Use the ifelse( ) function to create 2 new categorical variables. The variables should be defined as:
Age_Cat =
1 if Age <=19
2 if 20<=Age<=29
3 if 30<=Age<=39
4 if 40<=Age<=49
5 if 50<=Age<=59
6 if 60<=Age<=69
7 if Age>=70
and,
Alcohol_Cat =
0 if Alcohol=0
1 if 0<Alcohol<=3
2 if 3<Alcohol<10
3 if Alcohol>=10
If you have trouble using the ifelse( ) function in R, you could create these new categorical variables in Excel, and then just read them into R with the dataset. It works either way.
Report the counts for each value of these 2 new categorical variables.
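The recoding and counting step could be sketched as follows. This is only an illustration under stated assumptions: the data frame name `nutrition` and the toy values are made up (the real file would be read in with something like read.csv), and Age is assumed to be recorded in whole years.

```r
# Toy stand-in for the Nutrition study data (values are invented for illustration).
# In practice the data frame would come from the downloaded file instead.
nutrition <- data.frame(
  Age     = c(18, 25, 34, 47, 52, 66, 73, 29),
  Alcohol = c(0, 1.5, 2.5, 5, 10, 0, 22, 9.9)
)

# Nested ifelse() calls: each test is only reached when the earlier ones are FALSE,
# so "Age <= 29" here effectively means 20 <= Age <= 29 for whole-year ages.
nutrition$Age_Cat <- ifelse(nutrition$Age <= 19, 1,
                     ifelse(nutrition$Age <= 29, 2,
                     ifelse(nutrition$Age <= 39, 3,
                     ifelse(nutrition$Age <= 49, 4,
                     ifelse(nutrition$Age <= 59, 5,
                     ifelse(nutrition$Age <= 69, 6, 7))))))

# Alcohol <= 3 places a value of exactly 3 in category 1;
# adjust the comparison if the definition is read differently.
nutrition$Alcohol_Cat <- ifelse(nutrition$Alcohol == 0, 0,
                         ifelse(nutrition$Alcohol <= 3, 1,
                         ifelse(nutrition$Alcohol < 10, 2, 3)))

# Counts for each value of the two new variables
age_counts     <- table(nutrition$Age_Cat)
alcohol_counts <- table(nutrition$Alcohol_Cat)
print(age_counts)
print(alcohol_counts)
```

Missing values in Age or Alcohol pass through ifelse( ) as NA, so table( ) leaves them out of the counts unless useNA = "ifany" is supplied.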
2) Using the variable Quetelet as the dependent response variable (Y), specify the null and alternative hypotheses and conduct a one-way ANOVA F-test to check for mean differences across the levels of the Age_Cat variable, and a separate ANOVA for the Alcohol_Cat variable. Interpret the two hypothesis tests. What do you conclude? If you have a statistically significant result at the alpha=0.05 level, then you must follow up the significant ANOVA with a post hoc analysis. At this point, use 95% confidence intervals for each group to determine whether there are group mean differences and where they occur. Discuss your findings.
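One possible way to run the F-test and then build the per-group intervals is sketched below. The data are simulated stand-ins, not the study values, and `group_ci` is a hypothetical helper written for this illustration. It uses each group's own standard deviation with a t multiplier, which is one reasonable choice but not the only one (a pooled ANOVA error term is another).

```r
set.seed(1)
# Simulated stand-in data (not the real study values).
toy <- data.frame(
  Age_Cat  = factor(rep(1:3, each = 10)),   # must be a factor, not a number
  Quetelet = rnorm(30, mean = 26, sd = 4)
)

# H0: all group means are equal.  Ha: at least one group mean differs.
fit <- aov(Quetelet ~ Age_Cat, data = toy)
print(summary(fit))   # F statistic and p-value for the one-way ANOVA

# Hypothetical helper: 95% t-based confidence interval for each group's mean.
group_ci <- function(y, g, level = 0.95) {
  res <- t(sapply(split(y, g), function(v) {
    n    <- length(v)
    se   <- sd(v) / sqrt(n)
    half <- qt(1 - (1 - level) / 2, df = n - 1) * se
    c(n = n, mean = mean(v), lower = mean(v) - half, upper = mean(v) + half)
  }))
  as.data.frame(res)
}

ci <- group_ci(toy$Quetelet, toy$Age_Cat)
print(ci)
```

If the grouping variable is left as a number, aov( ) fits a straight-line trend with 1 degree of freedom instead of comparing group means, so the conversion with factor( ) matters. When reading the intervals, two groups whose intervals do not overlap have clearly different means; overlapping intervals are weaker evidence and do not by themselves show that the means are equal.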
3) Now, using the Calories variable as the dependent response variable (Y), conduct similar ANOVA hypothesis tests and obtain confidence intervals for each group to determine if there are group mean differences relative to Age_Cat and Alcohol_Cat. You will need to clearly set up the null and alternative hypotheses, conduct the test with appropriate statistics, and interpret the individual group confidence intervals.
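When the same test is run against more than one grouping variable, it can help to pull the F statistic and p-value out of the ANOVA table programmatically. The sketch below does that for Calories against both factors. The data are simulated and `anova_summary` is a hypothetical helper name, not part of any package.

```r
set.seed(2)
# Simulated stand-in data (not the real study values).
toy <- data.frame(
  Calories    = rnorm(40, mean = 1800, sd = 500),
  Age_Cat     = factor(sample(1:4, 40, replace = TRUE)),
  Alcohol_Cat = factor(sample(0:3, 40, replace = TRUE))
)

# Hypothetical helper: fit response ~ factor and return the F statistic and p-value.
anova_summary <- function(factor_name, response, data) {
  fit <- aov(reformulate(factor_name, response = response), data = data)
  tab <- summary(fit)[[1]]                 # the ANOVA table as a data frame
  c(F_value = tab[["F value"]][1], p_value = tab[["Pr(>F)"]][1])
}

# One column per grouping variable, rows F_value and p_value
results <- sapply(c("Age_Cat", "Alcohol_Cat"), anova_summary,
                  response = "Calories", data = toy)
print(results)

# Flag which tests would be followed up at alpha = 0.05
print(results["p_value", ] < 0.05)
```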
4) For the FAT, FIBER, and CHOLESTEROL variables, use a 95% confidence interval approach to compare groups, on average, for Age_Cat and Alcohol_Cat. Interpret the confidence intervals. Use whatever outside information you can obtain to help interpret the results.
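Since this step repeats the same interval calculation for three responses and two grouping variables, a small loop keeps the work manageable. The following is a sketch on simulated data. The column names `Fat`, `Fiber`, and `Cholesterol` are assumptions about how the variables are spelled in the file, and `ci_table` is a hypothetical helper that takes its intervals from t.test( ).

```r
set.seed(3)
# Simulated stand-in data (not the real study values); column names are assumed.
toy <- data.frame(
  Fat         = rnorm(36, mean = 75,  sd = 30),
  Fiber       = rnorm(36, mean = 13,  sd = 5),
  Cholesterol = rnorm(36, mean = 240, sd = 120),
  Age_Cat     = factor(rep(1:3, times = 12)),
  Alcohol_Cat = factor(rep(0:3, each = 9))
)

# Hypothetical helper: one row per group level with its mean and 95% CI.
ci_table <- function(data, response, group) {
  pieces <- lapply(split(data[[response]], data[[group]]), function(v) {
    ci <- t.test(v, conf.level = 0.95)$conf.int
    data.frame(n = length(v), mean = mean(v), lower = ci[1], upper = ci[2])
  })
  out <- do.call(rbind, pieces)
  data.frame(response = response, group = group, level = names(pieces),
             out, row.names = NULL)
}

# Stack the tables for every response and grouping variable
responses <- c("Fat", "Fiber", "Cholesterol")
groups    <- c("Age_Cat", "Alcohol_Cat")
all_ci <- do.call(rbind, lapply(responses, function(r) {
  do.call(rbind, lapply(groups, function(g) ci_table(toy, r, g)))
}))
print(all_ci)
```

Each call to t.test( ) needs at least two observations in the group, so very sparse categories may have to be checked or combined before running the loop on the real data.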
5) With the results from this additional analysis, how has the story description from Modeling Assignment #3 changed? You are welcome to bring in information from your prior knowledge and experience to embellish this story. Is the analysis sufficient so far for your story, or is something missing? What should be done next? Write up your synthesis description of what this data set seems to be saying (up to this point) and where we should go from here.