Date: November 24, 2025
Starting from scratch & building your results
Final Project Check-in
First, we need to familiarize ourselves with the variables in the data.
The Dataset:
N = 400 children followed annually for 3 years.
3 Timepoints: Baseline (ages 7-9), T2 (8-10), and T3 (9-11)
Key Measures:
Psychopathology: Depression (Dep), Anxiety (Anx)
Sleep: Sleep Duration (SleepDur), Sleep Quality (SleepQual)
Stress: Peer Stress (PeerStress), Academic Stress (AcadStress)
Development: Puberty (Puberty on 0-8 scale), Age (Age)
Rows: 400
Columns: 27
$ ID <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 1…
$ Sex <int> 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1,…
$ SES <dbl> -0.20, -2.85, -0.71, -0.09, -0.17, -1.02, -1.82, 1.23, -…
$ T1_Age <dbl> 7.57, 8.77, 7.19, 9.67, 9.46, 9.68, 9.87, 9.05, 9.21, 9.…
$ T1_Puberty <dbl> 3.00, 3.79, 4.04, 4.86, 5.66, 4.51, 4.69, 4.59, 4.97, 4.…
$ T1_SleepDur <dbl> 9.85, 9.57, 8.61, 9.26, 8.63, 9.02, 10.05, 8.84, 10.48, …
$ T1_SleepQual <dbl> 6.57, 5.61, 4.22, 4.58, 4.81, 5.25, 4.43, 4.38, 4.46, 5.…
$ T1_PeerStress <dbl> 12.20, 9.85, 5.03, 8.02, 4.00, 7.68, 5.11, 11.67, 8.07, …
$ T1_AcadStress <dbl> 4.19, 8.18, 7.44, 6.87, 5.92, 6.43, 5.21, 4.93, 6.14, 2.…
$ T1_Dep <dbl> 10.43, 8.89, 5.49, 8.49, 10.98, 8.58, 12.09, 11.89, 12.1…
$ T1_Anx <dbl> 4.07, 4.65, 4.17, 3.73, 6.77, 5.12, 4.52, 5.90, 1.88, 8.…
$ T2_Age <dbl> 8.57, 9.77, 8.19, NA, 10.46, 10.68, 10.87, 10.05, 10.21,…
$ T2_Puberty <dbl> 4.17, 4.94, 4.94, NA, 6.66, 5.39, 5.92, 5.34, 6.17, 5.96…
$ T2_SleepDur <dbl> 8.54, 9.38, 8.11, NA, 8.77, 7.22, 9.32, 7.55, 9.94, 8.10…
$ T2_SleepQual <dbl> 6.03, 5.14, 2.64, NA, 4.64, 7.15, 5.03, 6.07, 5.45, 3.89…
$ T2_PeerStress <dbl> 10.82, 11.77, 10.59, NA, 8.62, 13.33, 8.31, 13.33, 11.22…
$ T2_AcadStress <dbl> 3.82, 5.71, 8.23, NA, 6.26, 4.10, 7.05, 6.32, 5.81, 6.22…
$ T2_Dep <dbl> 12.14, 9.35, 8.29, NA, 10.66, 10.04, 13.96, 11.72, 11.55…
$ T2_Anx <dbl> 3.84, 3.61, 1.61, NA, 10.24, 8.02, 3.30, 7.33, 5.52, 11.…
$ T3_Age <dbl> 9.57, 10.77, NA, NA, 11.46, 11.68, NA, 11.05, 11.21, NA,…
$ T3_Puberty <dbl> 5.05, 6.37, NA, NA, 7.42, 6.69, NA, 6.48, 7.49, NA, NA, …
$ T3_SleepDur <dbl> 7.52, 7.44, NA, NA, 8.15, 6.96, NA, 7.52, 6.66, NA, NA, …
$ T3_SleepQual <dbl> 5.60, 3.75, NA, NA, 4.88, 4.58, NA, 5.30, 2.74, NA, NA, …
$ T3_PeerStress <dbl> 8.62, 9.49, NA, NA, 10.00, 10.14, NA, 10.43, 7.29, NA, N…
$ T3_AcadStress <dbl> 4.17, 4.41, NA, NA, 6.15, 5.97, NA, 6.59, 6.20, NA, NA, …
$ T3_Dep <dbl> 13.53, 14.60, NA, NA, 9.52, 20.08, NA, 22.65, 17.87, NA,…
$ T3_Anx <dbl> 3.09, 4.94, NA, NA, 10.72, 12.14, NA, 4.03, 2.82, NA, NA…
These variables are fairly intuitive, so you don’t have to be an expert in this area to think about a research question. Take a moment and review the data dictionary (Week 14 page) and think about what questions seem interesting to you.
If you have any empirical questions about the data or this area, please ask Dr. Haraden (he is an “expert”) for clarification.
Record your research question on the handout that has been provided. Include the variables that you think you can use to answer your question.
Share your research question with your partner. Identify your variables of interest. Work together to identify a potential analytic model that you can use.
As the consultant, ask how the proposed model will answer your partner's question, and look for alternative explanations for their “narrative”.
Next, identify the specific statistical model that you plan to use to address your question.
Start by drawing your model (remember the boxes and arrows) to get an idea of how you are thinking about it. Ask yourself “Which variable influences this one?”
Specify your variables from the dataset and align them with the model you have.
Write out some pseudo-code to help you when we move to R.
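As a rough illustration of the pseudo-code step, here is one way a plan could be translated into R. The research question (does baseline sleep duration predict depression at T2, over and above baseline depression?) is only a hypothetical example, and the data frame `toy` is simulated so that the code runs on its own. Only the column names are taken from the class dataset.

```r
# Pseudo-code (written before touching R):
#   outcome   : depression at T2            -> T2_Dep
#   predictor : sleep duration at baseline  -> T1_SleepDur
#   covariate : depression at baseline      -> T1_Dep
#   model     : linear regression of outcome on predictor + covariate

# Simulated stand-in data (NOT the class dataset); column names match it.
set.seed(1)
n <- 50
toy <- data.frame(
  T1_SleepDur = rnorm(n, mean = 9, sd = 0.7),
  T1_Dep      = rnorm(n, mean = 10, sd = 2)
)
toy$T2_Dep <- 5 + 0.6 * toy$T1_Dep - 0.3 * toy$T1_SleepDur + rnorm(n)

# Translate the boxes-and-arrows drawing into a model formula.
plan_formula <- T2_Dep ~ T1_SleepDur + T1_Dep

# Fit the planned model and look at the coefficients.
fit <- lm(plan_formula, data = toy)
coef(fit)
```

Writing the formula as its own object keeps the plan visible: each arrow in your drawing should correspond to a term on the right-hand side. Your own question may call for a different model entirely.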
You just made your data analytic plan!
Every paper should have a section that details the analytic steps the authors will take. This gives readers a framework for what to expect.
Now that we have clearly defined our research question, we can begin working with the data!
Here are the steps that I tend to follow:
From our data analytic plan, we have a general idea of how to build our model 👷‍♀️
But that is all the way at step #4.
First, we need to construct the foundation and give the reader the information needed to properly interpret our results.
Data Analytic Plan = Recipe
Descriptives & “Table 1” = Your specific ingredients
Building the Models = Putting everything together and baking
Discussion = Icing and finishing touches
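For the “ingredients” step, a descriptives table can start out very simple. The sketch below is one possible approach in base R, not a required format: it reports the number of observed values, the mean, and the SD for each variable. The helper `describe_var()` and the data frame `toy` are made up for illustration, and the values are simulated rather than taken from the class dataset.

```r
# Simulated stand-in data (NOT the class dataset); column names match it.
set.seed(2)
n <- 40
toy <- data.frame(
  T1_Dep      = rnorm(n, mean = 10, sd = 2),
  T1_Anx      = rnorm(n, mean = 5, sd = 2),
  T1_SleepDur = rnorm(n, mean = 9, sd = 0.7)
)
toy$T1_Anx[c(3, 7)] <- NA  # a couple of missing values, for illustration

# Hypothetical helper: observed n, mean, and SD for one variable.
describe_var <- function(x) {
  c(n    = sum(!is.na(x)),
    mean = mean(x, na.rm = TRUE),
    sd   = sd(x, na.rm = TRUE))
}

# Apply the helper to every column, then flip so each variable is a row.
table1 <- t(sapply(toy, describe_var))
round(table1, 2)
```

Reporting `n` alongside the mean and SD matters here because later waves have missing values, so the sample behind each statistic is not always the full 400.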
Moving away from baking, let’s say you are building something (LEGO?) or remodeling.
Results Part 1 (Descriptives) is the “Inventory & Site Survey”
Note
We cannot build a stable house if we don’t admit that 10% of our wood is rotten. Descriptives are just us checking the pile of lumber to make sure it’s safe to build with.
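Checking the lumber pile can be done in a couple of lines. The sketch below shows one way to count missing values per wave in base R. The data frame `toy` and its pattern of missingness are invented for illustration; only the column names mirror the class dataset.

```r
# Simulated stand-in data (NOT the class dataset); column names match it.
set.seed(3)
n <- 20
toy <- data.frame(
  T1_Dep = rnorm(n, mean = 10, sd = 2),
  T2_Dep = rnorm(n, mean = 11, sd = 2),
  T3_Dep = rnorm(n, mean = 13, sd = 3)
)
toy$T2_Dep[1:2] <- NA  # pretend 2 children missed T2
toy$T3_Dep[1:6] <- NA  # pretend 6 children missed T3

# Percent missing in each column.
pct_missing <- colMeans(is.na(toy)) * 100
pct_missing

# Number of children with complete data on all three waves.
n_complete <- sum(complete.cases(toy))
n_complete
```

Running the same two checks on the real data tells you how much attrition there is before you decide how to handle it in your models.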
Results Part 2 (Inference) is the “Construction & Stress Test”
Now let’s use R to finish the steps and put together a results section!