Here, we’ll look at some pilot data from an experiment! In this experiment, the researchers manipulated the type of music that participants were exposed to: a familiar condition where songs were by artists that participants had previously indicated they knew well, an unfamiliar condition where songs were by artists that participants had reported not knowing, and a control condition where the clips were not songs at all but instead weather and traffic clips. For today’s purposes, we’ll just focus on the familiar and unfamiliar conditions.
In particular, we’ll use multilevel models here because the data are nested! This is a within-participants study where each participant completed multiple listening trials in each condition. Because of this, a multilevel model will help us take into account that each person might respond differently to the manipulation (while still estimating the group-wide response)!
Here we’ll ask the question: did clips in the familiar condition evoke more positive emotions than clips in the unfamiliar condition?
Participants rated how each clip made them feel immediately after listening, on a Likert-type scale from 1 (extremely negative) to 7 (extremely positive); these ratings are in the `musicValence` column.
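To make the nesting concrete before we fit anything, here’s a sketch of how this design maps onto lme4-style formula syntax (which rstanarm’s multilevel functions also use). The object name `f` is just for illustration; the column names match the data we load below, and nothing here is the one “right” specification:

```r
# A sketch of the kind of formula a multilevel model of this design might use
f = musicValence ~ condition + (condition | participantID)
# condition                   -> the group-wide effect of the manipulation
# (condition | participantID) -> each participant gets their own intercept and
#                                their own condition effect (random slope)
```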
```r
library(tidyverse)
library(rstanarm)
library(tidybayes)

# Load the pilot data, drop the control condition and skipped trials,
# keep only the columns we need, and give the conditions readable labels
df = read_csv('https://raw.githubusercontent.com/pab2163/amfm_public/master/pilot_data/pilot_responses.csv') %>%
  dplyr::filter(condition != 'C', is.na(Skipped)) %>%
  dplyr::select(Year, Title, Artist, Genre, condition, musicValence, familiarityRating, participantID) %>%
  dplyr::mutate(condition = dplyr::recode(condition, 'F' = 'Familiar', 'U' = 'Unfamiliar'))

head(df, 3)
```
Now, using `musicValence` as the outcome, work through the following (example code sketches for each step follow the list):

1. Make a multilevel model using `rstanarm::stan_glmer()` with random slopes and intercepts for each participant, estimating the effect of `condition` on `musicValence`.
2. Use `spread_draws()` (see `?spread_draws` for documentation) to get the full posterior distribution for the `conditionUnfamiliar` term, and plot the posterior distribution for this effect. Hint: `stat_halfeye()` is a good way to go here.
3. Plot model predictions for `musicValence` for each condition. First plot the predictions for `musicValence` in each condition for the group average, but then also overlay predictions for each participant. Bonus: connect each participant’s mean (or median) prediction for each condition to make ‘slopes’. Which participant has the steepest slope?
4. Make 1 panel for each participant with `condition` on the x-axis and `musicValence` on the y-axis. Plot both the model predictions and the raw data (or participant-level means and standard errors for `musicValence`). Is the model making good predictions for each participant? Are there any participants where the model is predicting particularly well or poorly?
5. Run `pp_check()` on the model. Does the distribution of the model’s predictions for `musicValence` seem to match the actual distribution of `musicValence`?
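For step 1, one possible specification is sketched below. Note that it’s `stan_glmer()` (rather than `stan_glm()`) that handles random intercepts and slopes; the object name `valence_model` and the sampler settings are illustrative choices, not requirements.

```r
# Multilevel model: fixed effect of condition, plus a random intercept and a
# random slope for condition within each participant
valence_model = rstanarm::stan_glmer(
  musicValence ~ condition + (condition | participantID),
  data = df,
  chains = 4,
  iter = 2000,
  cores = 4
)

# Quick look at the parameter estimates
summary(valence_model)
```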
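For step 2, here’s a sketch using `tidybayes::spread_draws()` and `stat_halfeye()`. The term is named `conditionUnfamiliar` because Familiar is the reference level; the object name `condition_draws` and the axis label are just suggestions.

```r
# Posterior draws for the fixed effect of the Unfamiliar (vs. Familiar) condition
condition_draws = valence_model %>%
  spread_draws(conditionUnfamiliar)

# Plot the full posterior distribution for this effect
ggplot(condition_draws, aes(x = conditionUnfamiliar)) +
  stat_halfeye() +
  geom_vline(xintercept = 0, lty = 2) +
  labs(x = 'Estimated difference in musicValence (Unfamiliar - Familiar)',
       y = '')
```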
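For step 3, one approach (assuming a recent tidybayes, where the prediction helpers are called `add_epred_draws()` and `median_qi()`) is to get expected values for each condition twice: once ignoring the participant-level effects (`re_formula = NA`) for the group average, and once including them for each participant, then connect each participant’s medians to show their ‘slopes’.

```r
# Group-average predictions: ignore participant-level (random) effects
group_preds = df %>%
  dplyr::distinct(participantID, condition) %>%
  tidybayes::add_epred_draws(valence_model, re_formula = NA) %>%
  dplyr::group_by(condition) %>%
  tidybayes::median_qi(.epred)

# Participant-level predictions: include each participant's random intercept & slope
participant_preds = df %>%
  dplyr::distinct(participantID, condition) %>%
  tidybayes::add_epred_draws(valence_model) %>%
  dplyr::group_by(participantID, condition) %>%
  tidybayes::median_qi(.epred)

# Overlay each participant's 'slope' on the group-average estimates
ggplot(group_preds, aes(x = condition, y = .epred)) +
  geom_line(data = participant_preds,
            aes(group = participantID), alpha = 0.5, color = 'grey50') +
  geom_point(data = participant_preds, alpha = 0.5, color = 'grey50') +
  geom_pointrange(aes(ymin = .lower, ymax = .upper), size = 1) +
  labs(y = 'Predicted musicValence')
```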
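For step 4, a sketch that reuses `participant_preds` from the previous sketch, facets by participant, and overlays the raw trial-level ratings (jittered slightly so overlapping points are visible). Plotting participant-level means and standard errors instead of raw points would work just as well.

```r
# One panel per participant: raw ratings plus model-predicted medians and intervals
ggplot(participant_preds, aes(x = condition, y = .epred)) +
  geom_jitter(data = df, aes(y = musicValence),
              width = 0.1, height = 0.05, alpha = 0.3) +
  geom_pointrange(aes(ymin = .lower, ymax = .upper), color = 'blue') +
  facet_wrap(~participantID) +
  labs(y = 'musicValence')
```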
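For step 5, `pp_check()` on an rstanarm model defaults to a density overlay of the observed outcome against datasets simulated from the posterior; the histogram version is another option.

```r
# Observed musicValence vs. distributions simulated from the model
pp_check(valence_model)

# Alternative view: histograms of the observed data and a few simulated datasets
pp_check(valence_model, plotfun = 'hist', nreps = 5)
```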