- Convince you that Bayesian modeling is helpful
- Convince you that, whatever you have heard, Bayesian modeling isn’t any harder than Frequentist modeling
- Help you feel comfortable and confident doing Bayesian modeling with regression
We’ll use exactly the same data as last week!
?function_name
in the console (or using Google!). Note: sometimes these packages can take a little time and effort to install. If you are having trouble, please reach out to your instructors or mentor right away for help getting them set up. Installation is often the hardest part of using R packages!
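For example, to pull up the documentation for a function like stan_glm (a quick sketch, once the packages below are loaded):
# open the help page for stan_glm
?stan_glm
# or, equivalently, without loading rstanarm first
help('stan_glm', package = 'rstanarm')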
# load some packages
library(tidyverse) # contains a bunch of packages for working with data, plotting, etc.
library(rstanarm)
library(tidybayes)
library(bayesplot)
library(car)
# if you need to install these packages, run the following lines of code:
# install.packages('tidyverse')
# install.packages('car')
# install.packages('rstanarm')
# install.packages('tidybayes')
# install.packages('bayesplot')
But please reach out to your instructors — or check out materials associated with the basic track coding workshops — if you need a refresher! We’re happy to help.
# first, let's load in some data
# we're going to use a potentially-familiar dataset from within the `car` package called Prestige
# this contains information about various types of jobs (e.g., average income, education, etc.)
df <- Prestige
# for simplicity, we just want to look at blue collar and white collar jobs
# we'll also get rid of 'census' since we're not using it
df <- df %>%
dplyr::filter(type != 'prof') %>%
dplyr::select(-census)
# the head() function shows the first few rows of the dataframe
head(df)
## education income women prestige type
## nursing.aides 9.45 3485 76.14 34.9 bc
## medical.technicians 12.79 5180 76.04 67.5 wc
## radio.tv.announcers 12.71 7562 11.15 57.6 wc
## secretaries 11.59 4036 97.51 46.0 wc
## typists 11.49 3148 95.97 41.9 wc
## bookkeepers 11.32 4348 68.24 49.4 wc
Notice that we have all our variables of interest in separate columns here! In order to use the lm() (and related) functions, it’s important that all of your variables of interest exist in separate columns. In our case, the Prestige dataset is already set up like this, so there’s no need to make any adjustments. But it’s not uncommon for datasets to require some restructuring in advance.
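If your data did need restructuring, here is a hypothetical sketch (not needed for Prestige; the long_df object is made up purely for illustration) of how tidyr::pivot_wider() from the tidyverse could spread a 'long' dataset into one column per variable:
# hypothetical 'long' data: one row per job-variable pair
long_df <- data.frame(job = rep(c('secretaries', 'typists'), each = 2),
                      variable = rep(c('income', 'education'), times = 2),
                      value = c(4036, 11.59, 3148, 11.49))
# spread each variable into its own column, one row per job
wide_df <- long_df %>%
  tidyr::pivot_wider(names_from = variable, values_from = value)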
Coding of Xs (variables), including continuous and categorical Xs, matters for interpreting our results, so we probably want to check the coding and centering of our variables in advance. This is especially important for when we have interaction effects in our model. But even in models without interaction effects, how we code variables will affect our interpretation of the intercept.
Typically, continuous Xs will be mean-centered (subtract the sample mean from each observation); here we will also divide by the sample standard deviation (i.e., standardize), which is what scale() does below.
# scale education, women, and prestige
df = df %>%
  dplyr::mutate(education = as.vector(scale(education, center = TRUE, scale = TRUE)),
                women = as.vector(scale(women, center = TRUE, scale = TRUE)),
                prestige = as.vector(scale(prestige, center = TRUE, scale = TRUE)))
head(df)
## education income women prestige type
## nursing.aides 0.1057896 3485 0.1057896 0.1057896 bc
## medical.technicians 2.1052494 5180 2.1052494 2.1052494 wc
## radio.tv.announcers 2.0573582 7562 2.0573582 2.0573582 wc
## secretaries 1.3868806 4036 1.3868806 1.3868806 wc
## typists 1.3270166 3148 1.3270166 1.3270166 wc
## bookkeepers 1.2252477 4348 1.2252477 1.2252477 wc
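As a quick sanity check (a sketch), each scaled column should now have a mean of approximately 0 and a standard deviation of 1:
# means should be ~0 and SDs should be ~1 after scaling
round(colMeans(df[, c('education', 'women', 'prestige')]), 3)
sapply(df[, c('education', 'women', 'prestige')], sd)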
- There are ways of turning categorical variables into numbers (which you need to do in order to run a regression model), and many statistical packages will do this for you “under the hood.” However, unless you recode your variables yourself, you might not know what different R packages/functions are actually doing, which may make it tricky to accurately interpret your results.
- For categorical variables, it’s particularly important that zero is meaningful, as this will allow us to interpret the intercept of our model (since the intercept = the value of our outcome variable when all predictors/moderators are set to 0). For simplicity, we will work with examples of categorical Xs involving two levels only (e.g., blue collar and white collar).
- In effect coding, all of your levels/categories of X should sum up to 0.
- In a case where your categorical variable has two levels, a common choice is to pick the values -0.5 and 0.5, or -1 and 1. Neither of these options will change the overall results (i.e., the significance) of your model, but they will impact your interpretation of the slope.
# let's also effect code the job `type` variable
# blue collar (bc) = -0.5, and white collar (wc) = 0.5
df$type.e <- dplyr::recode(df$type, "bc" = -0.5, "wc" = 0.5)
# note: the 'dplyr::' tells R which package to look in for the recode function
# while this isn't always necessary to do before calling a function, it's good practice
# it IS necessary to do, however, when multiple packages you've loaded have a function with the same name
head(df)
## education income women prestige type type.e
## nursing.aides 0.1057896 3485 0.1057896 0.1057896 bc -0.5
## medical.technicians 2.1052494 5180 2.1052494 2.1052494 wc 0.5
## radio.tv.announcers 2.0573582 7562 2.0573582 2.0573582 wc 0.5
## secretaries 1.3868806 4036 1.3868806 1.3868806 wc 0.5
## typists 1.3270166 3148 1.3270166 1.3270166 wc 0.5
## bookkeepers 1.2252477 4348 1.2252477 1.2252477 wc 0.5
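It's always worth double-checking a recode like this. A quick cross-tab (a sketch) confirms that every bc row got -0.5 and every wc row got 0.5:
# cross-tabulate the original and effect-coded versions of job type
table(df$type, df$type.e)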
Regression with lm() and stan_glm()
- lm() code is equation based, and so is the stan_glm() command, which uses Bayesian inference ‘under the hood’
- Both follow the basic equation y ~ x, where y is the outcome variable and x is the predictor variable
- For example, let’s say we want to predict income as a function of prestige:
- The model syntax to run the Bayesian model using rstanarm::stan_glm() is almost exactly the same: everything inside the model call is the same
- Note: the Bayesian model will produce a lot more output as it is sampling – this is a good way to check how far along it is towards being finished running. By default, the model will run 4 ‘Chains’ with 2000 iterations each. So, if your model says Chain 2: Iteration 1800 / 2000, for example, you know you’re a little less than halfway through.
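If you ever want to change these sampling settings (or set a seed so your results are exactly reproducible), you can pass the relevant arguments straight to stan_glm(). A sketch, not run here:
# same model as below, but with the sampling settings spelled out explicitly
# m1_bayes <- rstanarm::stan_glm(income ~ prestige, data = df,
#                                chains = 4, iter = 2000, seed = 123)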
# frequentist model
m1_freq <- lm(income ~ prestige, data = df)
# bayesian model
m1_bayes = rstanarm::stan_glm(income ~ prestige, data = df)
Comparing the lm() and rstanarm::stan_glm() model outputs
What is consistent across both models?
- The intercept (i.e., (Intercept)) Estimate describes the estimated value of the outcome variable when all predictors are set to 0. In this case, this means the estimated income when prestige is set to 0. Since prestige has been mean-centered, this would represent the estimated income for a job of mean prestige.
- The prestige Estimate, or the ‘beta estimate’ for prestige, represents the estimated average difference in the outcome variable (income) associated with a 1-unit increase in prestige. So, in this case, if we increase prestige by 1 (1 standard deviation, since we’ve scaled it), income is expected to increase on average by this amount.
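To make those interpretations concrete, here's a quick sketch using the frequentist coefficients (the numbers will line up with the summary output below):
# predicted income for a job of average prestige (prestige = 0): just the intercept
coef(m1_freq)['(Intercept)']
# predicted income for a job 1 SD above average prestige: intercept + slope * 1
coef(m1_freq)['(Intercept)'] + coef(m1_freq)['prestige']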
summary(m1_freq)
##
## Call:
## lm(formula = income ~ prestige, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3449.9 -1449.9 -215.4 1518.2 3858.4
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5263.7 238.3 22.09 <2e-16 ***
## prestige 391.4 240.1 1.63 0.108
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1951 on 65 degrees of freedom
## Multiple R-squared: 0.03926, Adjusted R-squared: 0.02448
## F-statistic: 2.656 on 1 and 65 DF, p-value: 0.108
If we inspect the output from the Bayesian model, we can see that the first column of output for the Estimates, mean, gives almost exactly the same values. That is because the Estimate from the frequentist model is, in many cases, roughly equivalent to the mean posterior estimate in the Bayesian model – both try to provide a single ‘best guess’ for the value of each parameter.
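As a quick side-by-side check (a sketch), you can compare the point estimates directly. Note that coef() on a stanreg object returns posterior medians by default, so it matches the 50% column of the Bayesian summary rather than the mean exactly:
# frequentist estimates vs. Bayesian posterior medians
coef(m1_freq)
coef(m1_bayes)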
Now, with the Bayesian model, you might notice a few things:
- There are no t-values or p-values! Depending on your point of view, this could either be a great thing or an awful thing. Either way though, this will mean that we have to think about our model in other ways.
- Instead of t-values and p-values, the summary command by default gives us an 80% posterior interval – that is, the model’s best guess, with 80% confidence, about the true value of each parameter. We can see this 80% interval has a minimum value labeled 10% and a maximum value labeled 90%. So, the middle 80% of the posterior values for each parameter lie in this interval.
summary(m1_bayes)
##
## Model Info:
## function: stan_glm
## family: gaussian [identity]
## formula: income ~ prestige
## algorithm: sampling
## sample: 4000 (posterior sample size)
## priors: see help('prior_summary')
## observations: 67
## predictors: 2
##
## Estimates:
## mean sd 10% 50% 90%
## (Intercept) 5266.2 245.7 4961.4 5262.2 5572.9
## prestige 390.1 238.8 83.8 391.3 697.1
## sigma 1968.6 176.8 1753.5 1953.0 2203.4
##
## Fit Diagnostics:
## mean sd 10% 50% 90%
## mean_PPD 5267.1 343.1 4826.9 5266.2 5702.5
##
## The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
##
## MCMC diagnostics
## mcse Rhat n_eff
## (Intercept) 3.9 1.0 4034
## prestige 3.8 1.0 4035
## sigma 2.9 1.0 3847
## mean_PPD 5.4 1.0 4085
## log-posterior 0.0 1.0 1589
##
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
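As an aside (a sketch using the bayesplot package we loaded earlier): since the summary reports MCMC diagnostics like Rhat and n_eff, a quick way to eyeball convergence is a trace plot, where well-mixed, overlapping chains are what you want to see:
# trace plots for the intercept and slope -- the 4 chains should overlap like fuzzy caterpillars
bayesplot::mcmc_trace(as.array(m1_bayes), pars = c('(Intercept)', 'prestige'))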
What if we want a 95% posterior interval though? We can set exactly what interval we want in our summary with the probs argument:
# we use .025 and .975 because the MIDDLE 95% falls between these two values
summary(m1_bayes, probs = c(.025, .975))
##
## Model Info:
## function: stan_glm
## family: gaussian [identity]
## formula: income ~ prestige
## algorithm: sampling
## sample: 4000 (posterior sample size)
## priors: see help('prior_summary')
## observations: 67
## predictors: 2
##
## Estimates:
## mean sd 2.5% 97.5%
## (Intercept) 5266.2 245.7 4782.4 5761.4
## prestige 390.1 238.8 -80.6 853.5
## sigma 1968.6 176.8 1660.3 2339.8
##
## Fit Diagnostics:
## mean sd 2.5% 97.5%
## mean_PPD 5267.1 343.1 4613.6 5952.1
##
## The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
##
## MCMC diagnostics
## mcse Rhat n_eff
## (Intercept) 3.9 1.0 4034
## prestige 3.8 1.0 4035
## sigma 2.9 1.0 3847
## mean_PPD 5.4 1.0 4085
## log-posterior 0.0 1.0 1589
##
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
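An alternative (a sketch) is rstanarm's posterior_interval() function, which returns just the credible intervals without the rest of the summary:
# 95% posterior intervals for each parameter
rstanarm::posterior_interval(m1_bayes, prob = 0.95)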
Regression with a categorical predictor (type)
Questions for model interpretation:
- For the frequentist model: what does the (Intercept) tell us? What does type.e tell us?
- For the Bayesian model: what does the (Intercept) tell us? What does type.e tell us?
m2_freq <- lm(income ~ type.e, data = df)
m2_bayes <- rstanarm::stan_glm(income ~ type.e, data = df)
As with the previous model, the summary outputs are very similar, except that we get a posterior interval from the Bayesian model.
summary(m2_freq)
##
## Call:
## lm(formula = income ~ type.e, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3718.1 -1610.6 -240.1 1398.9 3727.7
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5213.2 255.3 20.42 <2e-16 ***
## type.e -321.8 510.5 -0.63 0.531
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1984 on 65 degrees of freedom
## Multiple R-squared: 0.006076, Adjusted R-squared: -0.009215
## F-statistic: 0.3974 on 1 and 65 DF, p-value: 0.5307
summary(m2_bayes)
##
## Model Info:
## function: stan_glm
## family: gaussian [identity]
## formula: income ~ type.e
## algorithm: sampling
## sample: 4000 (posterior sample size)
## priors: see help('prior_summary')
## observations: 67
## predictors: 2
##
## Estimates:
## mean sd 10% 50% 90%
## (Intercept) 5205.4 254.6 4882.7 5205.2 5531.3
## type.e -319.8 528.7 -992.9 -329.2 358.8
## sigma 2007.8 179.1 1783.9 1994.3 2239.1
##
## Fit Diagnostics:
## mean sd 10% 50% 90%
## mean_PPD 5256.9 346.7 4814.0 5259.3 5707.2
##
## The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
##
## MCMC diagnostics
## mcse Rhat n_eff
## (Intercept) 4.2 1.0 3640
## type.e 8.4 1.0 3982
## sigma 2.9 1.0 3819
## mean_PPD 5.6 1.0 3835
## log-posterior 0.0 1.0 1755
##
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
- Often, it is hard to understand what our model is telling us until we visualize our model’s predictions, especially with the raw data
- Bayesian models give us special power and flexibility for visualizing our model predictions. For this, we’ll use the tidybayes package!
With this step, we now ask our models to predict what the expected mean value of income should be for each level of prestige, as well as an uncertainty interval. This is super helpful for visualizing what our models are telling us.
# create a dataframe with a range of prestige values to make predictions for -- in this case going from the minimum to the maximum value, with 20 total values
continuous_predict_df = data.frame(prestige = seq(min(df$prestige), max(df$prestige), length.out = 20))
# now, ask the models to PREDICT the estimated income value for each level of prestige, with a confidence interval
m1_freq_predictions = data.frame(predict(m1_freq, newdata = continuous_predict_df, interval = 'confidence')) %>%
cbind(continuous_predict_df, .)
# this looks similar, but a *little* different using the tidybayes package
m1_bayes_predictions = tidybayes::add_fitted_draws(m1_bayes, newdata = continuous_predict_df)
- For the frequentist model, as we saw last week, we can get an estimate and an upr and lwr bound of a 95% confidence interval for each level of prestige
head(m1_freq_predictions)
## prestige fit lwr upr
## 1 -1.7320372 4585.775 3628.431 5543.118
## 2 -1.5300747 4664.819 3790.184 5539.453
## 3 -1.3281123 4743.862 3948.736 5538.989
## 4 -1.1261498 4822.906 4103.025 5542.787
## 5 -0.9241874 4901.950 4251.571 5552.328
## 6 -0.7222249 4980.994 4392.335 5569.652
- What we get for the Bayesian model is a little more detailed. We actually have 4000 different values for each level of prestige, representing the 4000 posterior ‘draws’ (they are differentiated in the .draw column) for estimating mean income (the outcome variable, which is stored in the .value column) at each level of prestige. So, when we plot these, we’ll summarize the posterior predictions in different ways (one simple way is sketched after the output below)
- While this output is slightly more complicated, it allows us to more easily view the posterior distribution in lots of different ways
head(m1_bayes_predictions)
## # A tibble: 6 x 6
## # Groups: prestige, .row [1]
## prestige .row .chain .iteration .draw .value
## <dbl> <int> <int> <int> <int> <dbl>
## 1 -1.73 1 NA NA 1 4534.
## 2 -1.73 1 NA NA 2 4324.
## 3 -1.73 1 NA NA 3 5315.
## 4 -1.73 1 NA NA 4 4396.
## 5 -1.73 1 NA NA 5 4934.
## 6 -1.73 1 NA NA 6 4682.
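One simple way to summarize these draws (a sketch) is tidybayes::median_qi(), which collapses the 4000 draws at each level of prestige into a posterior median plus an interval:
# posterior median and 95% interval of the fitted value at each level of prestige
m1_bayes_predictions %>%
  tidybayes::median_qi(.value, .width = .95)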
By using the stat_lineribbon function from tidybayes, we can generate a graph very similar to the one we made last week
ggplot(m1_bayes_predictions, aes(x = prestige, y = .value)) +
tidybayes::stat_lineribbon(.width = .95)
ggplot(m1_freq_predictions, aes(x = prestige, y = fit)) +
geom_line() +
geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = .5)
Or with a little more style - we can plot both the 80% and 95% posterior intervals! Note: a lot of researchers stick to 95% intervals because these have, somewhat arbitrarily, become traditional, but if you want to break out of the mold of the 95% this is a great way to do it!
ggplot(m1_bayes_predictions, aes(x = prestige, y = .value)) +
tidybayes::stat_lineribbon(.width = c(.80, .95)) +
theme_bw() +
labs(y = 'Model-estimated income', x = 'Prestige (Mean-Centered)') +
scale_fill_brewer(palette = "Blues")
Or with the raw data:
ggplot(m1_bayes_predictions, aes(x = prestige, y = .value)) +
tidybayes::stat_lineribbon(.width = c(.80, .95)) +
theme_bw() +
labs(y = 'Model-estimated income', x = 'Prestige (Mean-Centered)') +
scale_fill_brewer(palette = "Blues") +
geom_point(data = df, aes(x = prestige, y = income))
Now let’s plot model predictions for the model with one categorical variable – income as a function of type.e – whether jobs were white-collar or blue-collar
# these are the only values we need since there were just two categories
categorical_predict_df = data.frame(type.e = c(-.5, .5))
m2_freq_predictions = data.frame(predict(m2_freq, newdata = categorical_predict_df, interval = 'confidence')) %>%
cbind(categorical_predict_df, .)
# this looks similar, but a *little* different using the tidybayes package
m2_bayes_predictions = tidybayes::add_fitted_draws(m2_bayes, newdata = categorical_predict_df)
# recode back to informative labels
m2_freq_predictions$type = dplyr::recode(m2_freq_predictions$type.e, '-0.5' ='Blue Collar', '.5'= 'White Collar')
m2_bayes_predictions$type = dplyr::recode(m2_bayes_predictions$type.e, '-0.5' ='Blue Collar', '.5'= 'White Collar')
Now that we have the predictions from each model, we can plot them:
ggplot(m2_freq_predictions, aes(x = type, y = fit)) +
geom_point() +
geom_errorbar(aes(ymin = lwr, ymax = upr), width = 0)
Again, we have more options with plotting the Bayesian model fits, namely:
- We can plot the entire posterior distributions here using tidybayes::stat_halfeye(), not just the 95% intervals
- In this case, these distributions are fairly normal-looking and don’t have much skew. BUT, if they were skewed, we would know, and we could adjust our interpretations accordingly
- We can again use the .width option to get intervals of multiple widths – below we’ll try 80% and 95%
ggplot(m2_bayes_predictions, aes(y = .value, x = type)) +
tidybayes::stat_halfeye(.width = c(.80, .95)) +
labs(y = 'Model-estimated income')
- Just like last time, so far we have explored the relationship between income and prestige, and between income and type, separately. In addition to being interested in main effects, we sometimes want to look at how two variables may interact with one another in their associations with an outcome.
- For example, we might hypothesize that the association between prestige and income is stronger for white collar jobs and weaker for blue collar jobs. This is what we call a moderation, or an interaction (although some researchers prefer to stick with the term ‘interaction’ because it does not imply a specific causal mechanism, whereas ‘moderation’ does)
- For this next regression, we want to test both the main effects of type and prestige on income (y ~ X1 + X2) and their interaction (y ~ X1 * X2)
- Note that running an interaction automatically gives us the main effects as well
- What do the intercept, prestige, type.e, and prestige:type.e now tell us? We will go through each term one by one.
Run interaction model syntax
# frequentist
m3_freq <- lm(income ~ prestige * type.e, data = df)
# bayesian
m3_bayes <- rstanarm::stan_glm(income ~ prestige * type.e, data = df)
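As an aside, the * in the formula is shorthand: prestige * type.e expands to prestige + type.e + prestige:type.e, so the sketch below (with a made-up object name) fits exactly the same model as m3_freq:
# equivalent model: '*' expands to both main effects plus the interaction
m3_freq_expanded <- lm(income ~ prestige + type.e + prestige:type.e, data = df)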
Inspect model summaries
summary(m3_freq)
##
## Call:
## lm(formula = income ~ prestige * type.e, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4364.3 -1151.1 -32.4 1120.6 3695.6
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5349.5 451.0 11.861 <2e-16 ***
## prestige 971.2 405.3 2.396 0.0195 *
## type.e -1632.2 902.1 -1.809 0.0752 .
## prestige:type.e -950.9 810.5 -1.173 0.2451
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1828 on 63 degrees of freedom
## Multiple R-squared: 0.1824, Adjusted R-squared: 0.1435
## F-statistic: 4.686 on 3 and 63 DF, p-value: 0.005119
summary(m3_bayes)
##
## Model Info:
## function: stan_glm
## family: gaussian [identity]
## formula: income ~ prestige * type.e
## algorithm: sampling
## sample: 4000 (posterior sample size)
## priors: see help('prior_summary')
## observations: 67
## predictors: 4
##
## Estimates:
## mean sd 10% 50% 90%
## (Intercept) 5385.4 453.6 4812.0 5381.4 5970.5
## prestige 940.8 400.4 448.1 942.6 1436.0
## type.e -1561.6 898.2 -2689.6 -1574.0 -417.8
## prestige:type.e -997.9 830.1 -2023.4 -1001.1 53.6
## sigma 1848.1 163.5 1649.2 1835.7 2063.0
##
## Fit Diagnostics:
## mean sd 10% 50% 90%
## mean_PPD 5277.9 317.0 4876.3 5280.3 5671.6
##
## The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
##
## MCMC diagnostics
## mcse Rhat n_eff
## (Intercept) 9.7 1.0 2165
## prestige 9.0 1.0 1991
## type.e 20.8 1.0 1869
## prestige:type.e 17.4 1.0 2288
## sigma 3.1 1.0 2849
## mean_PPD 5.4 1.0 3451
## log-posterior 0.0 1.0 1615
##
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
Questions for model interpretation:
- For the frequentist model: what does the (Intercept) tell us? What does prestige tell us? What does type.e tell us? What does prestige:type.e tell us?
- For the Bayesian model: what does the (Intercept) tell us? What does prestige tell us? What does type.e tell us? What does prestige:type.e tell us?
- Here we can use the expand.grid command to set up a grid of all combinations of prestige and type.e
interaction_prediction_df = expand.grid(prestige = seq(min(df$prestige), max(df$prestige), length.out = 20),
type.e = c(-.5, .5))
# frequentist predictions
m3_freq_predictions = data.frame(predict(m3_freq, newdata = interaction_prediction_df, interval = 'confidence')) %>%
cbind(interaction_prediction_df, .)
# bayesian predictions
m3_bayes_predictions = tidybayes::add_fitted_draws(m3_bayes, newdata = interaction_prediction_df)
# recode back to informative labels
m3_freq_predictions$type = dplyr::recode(m3_freq_predictions$type.e, '-0.5' ='Blue Collar', '.5'= 'White Collar')
m3_bayes_predictions$type = dplyr::recode(m3_bayes_predictions$type.e, '-0.5' ='Blue Collar', '.5'= 'White Collar')
ggplot(m3_freq_predictions, aes(x = prestige, y = fit)) +
geom_line(aes(color = type)) +
geom_ribbon(aes(ymin = lwr, ymax = upr, fill = type), alpha = .5)
ggplot(m3_bayes_predictions, aes(x = prestige, y = .value)) +
tidybayes::stat_lineribbon(.width = .95, aes(fill = type))
Or with a little more glamour
ggplot(m3_bayes_predictions, aes(x = prestige, y = .value)) +
tidybayes::stat_lineribbon(.width = c(.95, .80), aes(fill = type), alpha = .5) +
theme_bw() +
labs(x = 'Prestige (Mean-Centered)', y = 'Estimated Income')
- What are your questions?