Lab 4: Inferential Stats Basics
Instructions
Here are the things that you will need for this lab:
When you are finished, click the Knit button to turn your work into an HTML document. You will submit both this .Rmd file and the 🧶 knitted .html file.
Scenario and Data
A researcher at the university’s sleep center is interested in the factors that affect daytime sleepiness and attention in college students. They collected survey data from a sample of students.
The data highlighted above contains the following variables (and more):
ESS1 - ESS8: (Continuous) The student’s responses to each item on the Epworth Sleepiness Scale, a measure of general daytime sleepiness. Higher scores indicate greater sleepiness. NOTE: You will need to calculate a total score which is a sum of all 8 items.
ashs1 - ashs33: (Continuous) The student’s responses to each item on the Adolescent Sleep Hygiene Scale. Higher scores indicate better sleep habits (e.g., consistent bedtime, quiet environment). The original data collection did not have an item 25 (so you won’t see one in your data).
- NOTE: You will need to reverse score all items except #27. You will also need to calculate a total mean score with these items.
attention1r - attention5r: (Categorical) An indication of whether they passed (1) or failed (0) the attention check item. A proxy for their attention during the survey. NOTE: Calculate a total attention score as a sum of all 5 items.
age: (Continuous) The student’s age in years.
roommate: (Categorical) Whether the student has a roommate or not (“Yes”, “No”).
- 1 = “Yes” and 2 = “No”
Instructions
Create a new R Markdown file for your submission. Make sure it is well-organized and includes your code, output, and written answers.
Load the appropriate libraries and the student_sleep_data.csv dataset.
Complete the tasks below.
Task 1: Compute scores for scales
Be sure to compute the appropriate total scores for the ESS, ASHS and Attention variables.
A couple of the attention check items sit inside the ASHS section of the survey, which can be confusing when you reverse code items and compute total scores.
Once completed, only include students who have “passed” the attention check (i.e., having a score of 4 or higher on the attention sum total).
Hint: use filter() from dplyr.
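The scoring and filtering steps above can be sketched on a small toy data frame. This is only an illustration under stated assumptions: the values are made up, only two ESS items and two ASHS items are included, and the reverse scoring assumes a 1 to 6 response scale (check the actual range of the ashs items in your data before using `7 - x`).

```r
library(dplyr)

# Toy data with made-up values; the real file has 8 ESS items and 32 ASHS items.
toy <- data.frame(
  ESS1 = c(1, 3, 0), ESS2 = c(2, 3, 1),
  ashs1 = c(2, 6, 4), ashs27 = c(5, 1, 3),
  attention1r = c(1, 1, 0), attention2r = c(1, 1, 1),
  attention3r = c(1, 1, 0), attention4r = c(1, 0, 1),
  attention5r = c(1, 1, 0)
)

scored <- toy %>%
  # Reverse score every ASHS item except ashs27.
  # ASSUMPTION: items are on a 1-6 scale, so reversed = 7 - original.
  mutate(across(matches("^ashs\\d+$") & !ashs27, ~ 7 - .x)) %>%
  mutate(
    ess_score       = rowSums(across(matches("^ESS\\d+$"))),        # sum of ESS items
    ashs_score      = rowMeans(across(matches("^ashs\\d+$"))),      # mean of ASHS items
    attention_score = rowSums(across(matches("^attention\\d+r$")))  # sum of attention checks
  )

# Keep only students who passed at least 4 of the 5 attention checks
passed <- scored %>% filter(attention_score >= 4)
print(passed)
```

Selecting columns by pattern keeps the code short when there are many items, and the digit anchor in each pattern avoids accidentally picking up the new score columns if the code is rerun.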
Task 2: Roommates and Daytime Sleepiness (A Group Comparison)
The Research Question: “Do students who have a roommate report different levels of daytime sleepiness compared to students who do not have a roommate?”
Your Steps:
Identify & Model:
What is the predictor (IV) and what is the outcome (DV)?
Are they categorical or continuous?
Based on this, what is the appropriate model family? Write out the model in R formula syntax (outcome ~ predictor).
Describe & Visualize:
Calculate the mean and standard deviation of ess_score for both groups (those with and without a roommate).
Create a boxplot to visualize the distribution of ess_score for each group. Make sure your plot is clearly labeled.
Analyze:
- Run the appropriate statistical test in R to determine if there is a significant difference between the groups.
Interpret & Conclude:
What is the p-value from your test?
Based on the test and your descriptive statistics, write a one-sentence conclusion that directly answers the research question.
Critical Thinking: This is an observational study. Can you conclude that having a roommate causes a change in sleepiness? Why or why not?
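As a rough sketch of the describe, visualize, and analyze steps, the example below uses made-up toy data and base R only. It assumes roommate is coded 1 = "Yes" and 2 = "No" as described earlier, and it uses a two-sample t-test as one common choice for comparing a continuous outcome across two groups; confirm for yourself that this matches the model family you identified.

```r
# Toy data with made-up values; in the lab, use your filtered data frame instead.
toy <- data.frame(
  roommate  = factor(c(1, 1, 1, 1, 2, 2, 2, 2),
                     levels = c(1, 2), labels = c("Yes", "No")),
  ess_score = c(10, 12, 9, 11, 7, 8, 10, 6)
)

# Mean and standard deviation of the outcome within each group
desc <- aggregate(ess_score ~ roommate, data = toy,
                  FUN = function(x) c(mean = mean(x), sd = sd(x)))
print(desc)

# Labeled boxplot of the outcome by group
boxplot(ess_score ~ roommate, data = toy,
        xlab = "Has a roommate", ylab = "ESS total score",
        main = "Daytime sleepiness by roommate status")

# Two-sample t-test; R's default is Welch's version,
# which does not assume equal variances in the two groups.
fit <- t.test(ess_score ~ roommate, data = toy)
print(fit$p.value)
```

Converting roommate to a labeled factor first means the group names, not the numeric codes, appear in the table and on the plot axis.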
Bonus: Sleep Habits and Age (An Association)
This question is not required. But you will get some bonus points if you decide to do it!
The Research Question: “Is there an association between a student’s sleep habits and their age?”
Your Steps:
Identify & Model:
What is the predictor (IV) and what is the outcome (DV)?
Are they categorical or continuous?
Based on this, what is the appropriate model family? Write out the model in R formula syntax.
Visualize:
Create a scatterplot to visualize the relationship between ashs_score (sleep habits) and age.
Add a line of best fit to the plot. Make sure your plot is clearly labeled.
Analyze:
- Run the appropriate statistical test in R to determine if there is a significant association between the two variables.
Interpret & Conclude:
What is the correlation coefficient (r) and the p-value from your test?
Based on these results, write a one-sentence conclusion that describes the nature and significance of the relationship.
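A minimal sketch of the bonus steps, again on made-up toy data with base R. Placing age on the x-axis and using a Pearson correlation (the default in `cor.test()`) are assumptions made for illustration, so adjust them to match the roles and model you chose.

```r
# Toy data with made-up values; in the lab, use your filtered data frame instead.
toy <- data.frame(
  age        = c(18, 19, 20, 21, 22, 23),
  ashs_score = c(3.1, 3.4, 3.0, 3.8, 3.6, 4.1)
)

# Labeled scatterplot with a least-squares line of best fit
plot(ashs_score ~ age, data = toy,
     xlab = "Age (years)", ylab = "ASHS mean score",
     main = "Sleep habits by age")
abline(lm(ashs_score ~ age, data = toy))

# Pearson correlation test (cor.test's default method)
res <- cor.test(toy$age, toy$ashs_score)
print(res$estimate)  # correlation coefficient r
print(res$p.value)   # p-value for the test that the true correlation is 0
```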
End of Lab. Don’t forget to Knit! 🧶