Chapter 21 Example 3.8: Bayesian Analysis of Resin Lifetimes

Bayesian analysis allows us to make probability statements regarding parameters and models, but it also requires us to select prior distributions and do some tricky computations. Furthermore, when using vague, diffuse prior distributions (as default priors tend to be), the results of the Bayesian analysis of the simple models we are considering now is usually very similar to what we obtain from the usual frequentist approaches. So at this stage of the game, there is little to be gained from a Bayesian approach; that will change later.

We will use functions from package BayesFactor and package bcfcdae; cfcdae is the “companion” to the text A First Course in Design and Analysis of Experiments, and bcfcdae is the “Bayesian companion.” We also need rstan for some plots.
> library(BayesFactor)
Loading required package: coda

Attaching package: 'coda'
The following object is masked from 'package:rstan':

    traceplot
Loading required package: Matrix
************
Welcome to BayesFactor 0.9.12-4.4. If you have questions, please contact Richard Morey (richarddmorey@gmail.com).

Type BFManual() to open the manual.
************
> library(cfcdae)
> library(bcfcdae)
> library(rstan)
rstan suggests the following command:
> rstan_options(auto_write = TRUE)

This means that each Bayesian model type will only need to be compiled once; the complied version will be saved so that after that the saved version can be used without compiling.

For either we will need to load the data and set up some polynomial terms.
> data(ResinLifetimes)
> myRL <- ResinLifetimes
> myRL$temp.z2 <- myRL$temp.z^2
> myRL$temp.z3 <- myRL$temp.z^3
> myRL$temp.z4 <- myRL$temp.z^4

21.1 Using bcfcdae::bglmm()

21.1.1 Basic Treatment Effects Model

Let’s fit both the single mean model and the separate means model using Bayesian inference. The function that fits models in bcfcdae is bglmm(). It has one required argument (the model statement), and many optional arguments. We use quiet=TRUE to suppress the voluminous progress information that would otherwise be printed during the MCMC.

> bfit.single <- bglmm(logTime~1,data=myRL,quiet=TRUE,seed=10015)
> bfit.separate <- bglmm(logTime~temp,data=myRL,quiet=TRUE,seed=10016)
Before doing any inference, we do some informal diagnostics to see if we can detect any problems with the MCMC. By default, bglmm() runs four chains of length 2000 and discards the first 1000 from each chain. If we plot the outcomes of the four chains for each variable, the chains should pile on top of each other. If we see separation between the chains (by color), then we have a problem. Neither model fit shows any sign of separation between the chains for any variable:
> plot(bfit.single,"trace")

> plot(bfit.separate,"trace")

In addition to the manifest parameters \(\mu\), \(\alpha_i\), and \(\sigma_0\) (the standard deviation of the statistical errors), we also get traces for the prior standard deviations of the “fixed” terms (Intercept) and temp.

A second diagnostic is to look at the autocorrelation of the different chains. Autocorrelation is the correlation between values at some lag k. Lag 0 autocorrelation is always 1 (because you’re correlating something with itself if the lag is 0), but we want the autocorrelation to drop to 0 as quickly as possible. If it doesn’t drop to 0 quickly, that means that you are not getting a lot of information out of the chains. Sometimes a reformulation of the model can help (for example, using orthogonal polynomials instead of standard monomials).

For our two models, the autocorrelations drop to 0 very quickly.
> plot(bfit.single,"autocor")

> plot(bfit.separate,"autocor")
We should note that for the plotting functions you can also select the parameters to plot:
> plot(bfit.separate,"trace",pars=c("temp1","temp2"))
A final quick check comes from some of the summaries of the output. Our actual sample size is 4000 (by default, anyway), and we want the effective sample size to be as large as possible. We also want the R statistic to be close to 1. If R is much above 1, say above 1.1, then we know there is a problem; unfortunately, R being near 1 does not mean a lack of problem.
> summary(bfit.single)[,9:10]
                n_eff Rhat
(Intercept)      3080    1
sigma0           3380    1
sigma.Intercept  3340    1
> summary(bfit.separate)[,9:10]
                n_eff  Rhat
(Intercept)      7380 1.000
temp1            6030 1.000
temp2            7410 0.999
temp3            6720 0.999
temp4            6400 1.000
sigma0           4610 1.000
sigma.Intercept  4830 1.000
sigma.temp       3840 1.000

The effective sample sizes in the separate means model are too big to be believable (effective sample size should not be greater than actual sample size), but otherwise everything seems fine.

Now that we have decided that the Bayesian fits have worked well we can proceed to deciding which model is better. Our first approach is via leave-one-out information criterion:
> loo(bfit.single)
[1] 24.68105
> loo(bfit.separate)
Warning: Some Pareto k diagnostic values are slightly high. See help('pareto-k-diagnostic') for details.
[1] -60.12085
The separate means model has a much lower LOOIC value than the single mean model (as would be expected from our frequentist analysis), but there is an ominous warning. loo() uses something called Pareto tail smoothing in its computations, and the warning is saying that something might not be working as well as we would wish. Recompute the LOOIC, but ask for more information:
> loo(bfit.separate,verbose=TRUE)
Warning: Some Pareto k diagnostic values are slightly high. See help('pareto-k-diagnostic') for details.

Computed from 4000 by 37 log-likelihood matrix

         Estimate   SE
elpd_loo     30.1  5.5
p_loo         6.2  1.8
looic       -60.1 10.9
------
Monte Carlo SE of elpd_loo is 0.1.

Pareto k diagnostic values:
                         Count Pct.    Min. n_eff
(-Inf, 0.5]   (good)     36    97.3%   782       
 (0.5, 0.7]   (ok)        1     2.7%   352       
   (0.7, 1]   (bad)       0     0.0%   <NA>      
   (1, Inf)   (very bad)  0     0.0%   <NA>      

All Pareto k estimates are ok (k < 0.7).
See help('pareto-k-diagnostic') for details.

We would really like all of the diagnostic values to be in the “good” range, but one of them is in the “OK” range. That’s good enough for us, so the LOOIC should be trustworthy.

We can also compare models using the Bayes factor, but approach this with some caution as the Bayes factor is strongly influenced by the prior distribution. Our Bayesian fits are based on default priors, so we do not have a lot of faith in them. That said, the Bayes factor is absurdly high in favor of the separate means model.
> bayes_factor(bfit.separate,bfit.single)
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Estimated Bayes factor in favor of bridge1 over bridge2: 658316555077067.75000
To illustrate this sensitivity, suppose that we instead had some prior belief about the standard deviations: residual standard deviation could be around .15, but might be off by a factor of three in either direction; standard deviation of the treatment effects could be around .5, but might be off by a factor of three in either direction; and standard deviation of the mean might be 2, but again could be off by a factor of three. Refit the model with this prior:
> bfit.separate.alt <- bglmm(logTime~temp,data=myRL,quiet=TRUE,                         sigma0.mean=.15,sigma0.shape=5,sigma.mean.int=2,
+                  sigma.shape.int=5,sigma.mean=.5,sigma.shape=5,
+                  seed=10017)
Now compute the Bayes factor between the fit with our somewhat informative prior and the fit with the default prior.
> bayes_factor(bfit.separate.alt,bfit.separate)
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Estimated Bayes factor in favor of bridge1 over bridge2: 1.94951

The Bayes factor favors the informative prior.

At this stage we are comfortable with using our Bayesian fits, and we can look more closely at the results. The summary() function prints a matrix with a row for each parameter in the model and ten columns:

  • Column 1: the mean of the posterior distribution.
  • Column 2: the numerical standard error of that mean. This will go to zero as the sample size in the Markov chains increases and is usually of minor concern.
  • Column 3: the standard deviation of the posterior distribution. This reflects the true posterior variability in the parameter.
  • Columns 4-8: percentiles of the posterior distribution. Note that you can change which percentiles to use via the probs argument to summary(). These columns can be used to form posterior credible intervals for the various parameters.
  • Columns 9-10: the effective sample size and R diagnostic.
> options(width=100)
> summary(bfit.separate)
                   mean  se_mean     sd   2.5%     25%     50%     75%   97.5% n_eff  Rhat
(Intercept)      1.4400 0.000191 0.0164  1.410  1.4300  1.4400  1.4500  1.4700  7380 1.000
temp1            0.4900 0.000403 0.0313  0.426  0.4690  0.4900  0.5100  0.5510  6030 1.000
temp2            0.1890 0.000359 0.0309  0.128  0.1700  0.1900  0.2100  0.2480  7410 0.999
temp3           -0.0606 0.000378 0.0310 -0.121 -0.0815 -0.0604 -0.0403  0.0011  6720 0.999
temp4           -0.2420 0.000422 0.0338 -0.309 -0.2640 -0.2420 -0.2190 -0.1760  6400 1.000
sigma0           0.0979 0.000184 0.0125  0.077  0.0891  0.0965  0.1060  0.1260  4610 1.000
sigma.Intercept  1.7200 0.012200 0.8510  0.680  1.1200  1.5000  2.1000  3.9400  4830 1.000
sigma.temp       0.4390 0.002780 0.1730  0.223  0.3210  0.4020  0.5140  0.8680  3840 1.000

(The options() call makes R print longer rows and prevents wrapping.)

It is instructive to compare the Bayesian results with frequentist results. For this model and these data, the results are essentially the same; this is not always true.
> confint(lm(logTime~temp,data=ResinLifetimes))
                 2.5 %       97.5 %
(Intercept)  1.4056502  1.470230797
temp1        0.4321203  0.556998772
temp2        0.1283703  0.253248772
temp3       -0.1228797  0.001998772
temp4       -0.3092799 -0.178029622

Note: if you use the option show.redundant.effects=TRUE, then bglmm() will also provide results for coefficients that are set via zero sum constraints, in this case, the effect for the fifth level of temp.

There are some additional plots that you might find useful for viewing the results. We can get histograms of the posterior distributions:
> plot(bfit.separate,"histogram",pars=c("temp1","temp2","temp3","temp4"))
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
And we can display interval estimates:
> plot(bfit.separate,"interval",pars=c("temp1","temp2","temp3","temp4"))
ci_level: 0.8 (80% intervals)
outer_level: 0.95 (95% intervals)

The thick bars cover 80% of the distribution, and the thin bars cover 95% of the distribution.

21.1.2 Polynomial models

Temperature is a quantitative variable, so we can model the response as a function of temperature. We begin by fitting polynomial models of order 1 through 4 using orthogonal polynomials.
> bfit.linear <- bglmm(logTime ~ poly(temp.z,1), data=myRL,quiet=TRUE,seed=10018)
> bfit.quad <- bglmm(logTime ~ poly(temp.z,2), data=myRL,quiet=TRUE,seed=10019)
> bfit.cubic <- bglmm(logTime ~ poly(temp.z,3), data=myRL,quiet=TRUE,seed=10020)
> bfit.quartic <- bglmm(logTime ~ poly(temp.z,4), data=myRL,quiet=TRUE,seed=10021)
We should check our informal diagnostics to see how the Markov chain is doing. These models are all doing fine. As an example, here are the diagnostics for the order 4 model.
> summary(bfit.quartic)[,9:10]
                     n_eff  Rhat
(Intercept)           6810 0.999
poly(temp.z,4)1       5980 1.000
poly(temp.z,4)2       6280 1.000
poly(temp.z,4)3       5380 1.000
poly(temp.z,4)4       6060 0.999
sigma0                4380 0.999
sigma.Intercept       4240 1.000
sigma.poly(temp.z,4)  3940 1.000
> plot(bfit.quartic,"trace")

> plot(bfit.quartic,"autocor")
We can use loo() to choose between these models.
> loo(bfit.linear)
[1] -58.44549
> loo(bfit.quad)
Warning: Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.
[1] -64.30751
> loo(bfit.cubic)
Warning: Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.
[1] -62.00803
> loo(bfit.quartic)
Warning: Some Pareto k diagnostic values are slightly high. See help('pareto-k-diagnostic') for details.
[1] -60.18882
As expected, quadratic is selected. The warning about some values being slightly high merits a further look, but only one is slightly high and nothing is very high, so we are satisfied with these results.
> loo(bfit.quad,verbose=TRUE)
Warning: Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.

Computed from 4000 by 37 log-likelihood matrix

         Estimate   SE
elpd_loo     32.2  5.5
p_loo         4.6  1.6
looic       -64.3 11.0
------
Monte Carlo SE of elpd_loo is NA.

Pareto k diagnostic values:
                         Count Pct.    Min. n_eff
(-Inf, 0.5]   (good)     36    97.3%   720       
 (0.5, 0.7]   (ok)        0     0.0%   <NA>      
   (0.7, 1]   (bad)       1     2.7%   359       
   (1, Inf)   (very bad)  0     0.0%   <NA>      
See help('pareto-k-diagnostic') for details.
Orthogonal polynomials are stable and work well, but they are difficult to understand. Let’s redo the quadratic fit with ordinary monomials.
> bfit.quad.monomial <- bglmm(logTime~temp.z+temp.z2,data=myRL,quiet=TRUE,seed=10022)
Warning: There were 3979 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
Warning: Examine the pairs() plot to diagnose sampling problems
Warning: The largest R-hat is 2.66, indicating chains have not mixed.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#r-hat
Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess
Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess
Oh my gosh; oh dear, or dear; so many warning messages. Our diagnostics are all waving red flags as well:
> summary(bfit.quad.monomial)[,9:10]
                 n_eff  Rhat
(Intercept)      76.20  1.05
temp.z1          76.40  1.05
temp.z21         76.40  1.05
sigma0          101.00  1.07
sigma.Intercept   7.16  1.32
sigma.temp.z      2.02 11.00
sigma.temp.z2     2.33  3.87
> plot(bfit.quad.monomial,"trace")

> plot(bfit.quad.monomial,"autocor")

Small effective sample sizes and large R. Traces that do not overlap. Correlation slowly decaying to zero. None of these is what we want.

And to add insult to injury, this run of bglmm() was much slower than when we used orthogonal polynomials; the MCMC algorithm is struggling to do a good job and coming up short.

You can sometimes improve the functioning of the Markov chain by using optional arguments to change the behavior of the Markov chain. For example, using the arguments adapt_delta=.99,max_treedepth=15 in the bglmm() command above will produce results that pass our informal diagnostics. However, the chain goes from being “much slower” as remarked above to very, very, very slow (on my computer, each chain took five minutes). The adjusted parameters made the chain functional, but too slow for interactive use.

One of the reasons we have so many problems is that the coefficients for first and second order are highly negatively correlated with each other, and the default Markov chain struggles with that. Here is a scatter plot of the coefficients for first and second order across the 4000 MCMC interations.
> plot(bfit.quad.monomial,"scatter",pars=c("temp.z1","temp.z21"))
Warning: Ignoring unknown parameters: include
Compare that with the corresponding plot for orthogonal polynomials.
> plot(bfit.quad,"scatter",pars=c("poly(temp.z,2)1","poly(temp.z,2)2"))
Warning: Ignoring unknown parameters: include
We can fix the problem without having to resort to orthogonal polynomials: just center temperature by subtracting out some central value, say 210. Then do a quadratic fit with the centered temperatures.
> temp.c <- myRL$temp.z - 210
> temp.c2 <- temp.c^2
> bfit.quad.center <- bglmm(logTime~temp.c+temp.c2,data=myRL,quiet=TRUE,seed=10023)
This is not nearly so slow and works well.
> summary(bfit.quad.center)[,9:10]
                n_eff  Rhat
(Intercept)      3310 1.000
temp.c1          4010 0.999
temp.c21         4210 0.999
sigma0           2630 1.000
sigma.Intercept  2270 1.000
sigma.temp.c     2790 1.000
sigma.temp.c2    3830 1.000
> plot(bfit.quad.center,"trace")

> plot(bfit.quad.center,"autocor")

Recall that \(\textrm{temp.c} = \textrm{temp.z} - 210\), so that \(\textrm{temp.c}^2 = \textrm{temp.z}^2 -420 \textrm{temp.z} + 44100\). Then \(\beta_0 + \beta_1 \textrm{temp.c} + \beta_2 \textrm{temp.c}^2\) is the same as \((\beta_0 -210\beta_1+44100\beta_2) + (\beta_1 -420\beta_2)\textrm{temp.z} + \beta_2 \textrm{temp.z}^2\). This is how we can convert our centered results to ordinary monomials in temperature.

Find the posterior means for each parameter:
> post.means <- summary(bfit.quad.center)[,1]
> post.means
    (Intercept)         temp.c1        temp.c21          sigma0 sigma.Intercept    sigma.temp.c 
       1.41e+00       -1.21e-02        7.83e-05        9.49e-02        1.71e+00        1.41e-02 
  sigma.temp.c2 
       5.97e-03 
Compute the combinations as above:
> beta0 <- post.means[1] -420*post.means[2]+44100*post.means[3]
> beta1 <- post.means[2]-420*post.means[3]
> beta2 <- post.means[3]
> beta0;beta1;beta2
(Intercept) 
    9.94503 
  temp.c1 
-0.044986 
temp.c21 
7.83e-05 
These roughly match up with standard frequentist results:
> lm(logTime~temp.z+temp.z2,myRL)$coef
  (Intercept)        temp.z       temp.z2 
 7.417999e+00 -4.509806e-02  7.860391e-05 
With a bit more effort we can get the credible intervals. Start by extracting the 4000 Markov chain values for the regression coefficients.
> coef.samples <- extract(bfit.quad.center$stanfit)$internal_FullBeta

This is a matrix with three columns and 4000 rows containing all the MCMC samples. The mysterious internal_FullBeta is the name bglmm() gives to all of the non-variance parameters.

Next do the same operations on the (columns of) samples that we did on the means.
> beta0.samples <- coef.samples[,1] -420*coef.samples[,2]+44100*coef.samples[,3]
> beta1.samples <- coef.samples[,2]-420*coef.samples[,3]
> beta2.samples <- coef.samples[,3]
Compare the means here to what we calculated above; they match.
> mean(beta0.samples);mean(beta1.samples);mean(beta2.samples)
[1] 9.930449
[1] -0.04493791
[1] 7.828866e-05
And we can get the credible intervals:
> quantile(beta0.samples,c(.025,.975))
     2.5%     97.5% 
 7.450898 12.461735 
> quantile(beta1.samples,c(.025,.975))
       2.5%       97.5% 
-0.06857376 -0.02211259 
> quantile(beta2.samples,c(.025,.975))
        2.5%        97.5% 
2.408741e-05 1.342222e-04 
And we can compare these to frequentist results:
> confint(lm(logTime~temp.z+temp.z2,myRL))
                    2.5 %        97.5 %
(Intercept)  5.067844e+00  9.7681535196
temp.z      -6.756298e-02 -0.0226331367
temp.z2      2.555875e-05  0.0001316491

Bayesian methods have theoretical strengths, but the issues of implementing them in even simple problems makes their routine use challenging.

21.2 Analysis using BayesFactor

The BayesFactor package can only fit a subset of the models that bglmm() can fit, but it is a very important subset. In particular, BayesFactor requires that the statistical errors be independent, normally distributed with constant variance. That is the standard set of assumptions (for example, the same assumptions that lm() makes), so my hand wringing about the limitations may just reflect my bias toward the function I wrote.

The advantage of BayesFactor is that it uses a special form for the prior distribution that makes computations very fast and MCMC results very stable. The special prior is called a “mixture of g-priors”. If you had done an ordinary linear model fit instead of a Bayesian fit, your estimated parameters would have a covariance that is the mean square for error times a certain matrix. The mixture of g-priors says that the prior for the parameters is normal with mean 0 and a covariance that is some multiple of the covariance your non-Bayesian fit would have obtained. BayesFactor puts a hyper-prior on the variance multiplier \(g\) and samples across that as well. In a sense, a \(g\) of 10 says that your prior is contributing about 10% as much information as the data. There is nothing philosophically compelling about this prior; it just makes the computations easy.

The package has several functions to fit models of several types and return a variety of Bayes factors for a variety of submodels. We will use the function lmBF().

21.2.1 Basic Treatment Effects Model

Begin by getting the Bayes Factor comparing the separate means model to the single mean model. By default, lmBF() gives you the Bayes Factor for the model you specify relative to the single mean (intercept only) model.
> set.seed(20221025) # for repeatability
> BF.separate <- lmBF(logTime~temp,data=myRL)
> BF.separate
Bayes factor analysis
--------------
[1] temp : 1.561215e+14 ±0%

Against denominator:
  Intercept only 
---
Bayes factor type: BFlinearModel, JZS
The Bayes factor is larger than \(10^{14}\), so there is overwhelming evidence favoring the separate means model. The Bayes factor is sensitive to the prior. We have a very limited ability to change the priors using lmBF(); we can set rscaleFixed to “medium” (the default), “wide”, or “ultrawide”. In this case, there is a factor of 3 change in the Bayes factor:
> lmBF(logTime~temp,data=myRL,rscaleFixed="ultrawide")
Bayes factor analysis
--------------
[1] temp : 3.00167e+14 ±0%

Against denominator:
  Intercept only 
---
Bayes factor type: BFlinearModel, JZS
To learn anything about the parameters, we must first get samples from the posterior via the posterior() function. We choose 4000 iterations simply to match what we will be using later with bglmm().
> samples <- posterior(BF.separate,iterations=4000)
Before anything else, look at the output to see if there is any evidence of poor mixing in the chain. The trace plots should look like like a horizontal band of black without a lot of trends up and down. These all look fine.
> plot(samples[,1:2])

> plot(samples[,3:4])

> plot(samples[,5:6])

> plot(samples[,7:8])
The next step is to look at the autocorrelations (time lag correlations) for each variable. We want these to decay to 0 as quickly as possible. These are all fine.
> tmp <- apply(samples,2,acf) # we just want plots, so assign output to variable

To learn about the parameters, look at summary() of the Markov chain samples. The summary has two parts. The first part is the mean, standard deviation, and standard error for each of the variables in the model, with one row for each variable. The standard deviation reflects the posterior variability in the parameter; the standard error reflects how precisely we know the mean. The “Time-series SE” attempts to correct the standard error for autocorrelation; there is so little autocorrelation in these samples that the naive and corrected standard errors are practically the same.

The second part of the summary shows percentiles of the posterior samples for each variable. We can use the 2.5 and 97.5 percentiles to form interval estimates (posterior credible intervals).
> summary(samples)

Iterations = 1:4000
Thinning interval = 1 
Number of chains = 1 
Sample size per chain = 4000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

             Mean        SD  Naive SE Time-series SE
mu        1.43916  0.016537 2.615e-04      2.615e-04
temp-175  0.48719  0.032657 5.163e-04      5.633e-04
temp-194  0.18820  0.031451 4.973e-04      4.973e-04
temp-213 -0.06080  0.031925 5.048e-04      5.048e-04
temp-231 -0.24002  0.033191 5.248e-04      5.248e-04
temp-250 -0.37456  0.035398 5.597e-04      5.597e-04
sig2      0.01011  0.002768 4.376e-05      6.202e-05
g_temp   16.85883 20.184005 3.191e-01      3.321e-01

2. Quantiles for each variable:

              2.5%       25%       50%      75%     97.5%
mu        1.407301  1.428078  1.439117  1.45004  1.472119
temp-175  0.422406  0.465572  0.487809  0.50903  0.549252
temp-194  0.126188  0.166841  0.187807  0.20933  0.249878
temp-213 -0.125322 -0.082224 -0.060307 -0.03961  0.001883
temp-231 -0.304259 -0.261915 -0.240839 -0.21842 -0.174631
temp-250 -0.443623 -0.398432 -0.374979 -0.35128 -0.304086
sig2      0.005987  0.008183  0.009644  0.01160  0.016739
g_temp    3.308584  7.234859 11.564877 19.18441 65.205978
It is instructive to compare the Bayesian results with frequentist results. For this model and these data, the results are essentially the same; this is not always true.
> confint(lm(logTime~temp,data=ResinLifetimes))
                 2.5 %       97.5 %
(Intercept)  1.4056502  1.470230797
temp1        0.4321203  0.556998772
temp2        0.1283703  0.253248772
temp3       -0.1228797  0.001998772
temp4       -0.3092799 -0.178029622

21.2.2 Polynomial models

Temperature is a quantitative variable, so we can model the response as a function of temperature. We begin by fitting polynomial models of order 1 through 4 using ordinary polynomials and checking the Bayes factors.
> BF.p1 <- lmBF(logTime~temp.z,data=myRL)
> BF.p2 <- lmBF(logTime~temp.z+temp.z2,data=myRL)
> BF.p3 <- lmBF(logTime~temp.z+temp.z2+temp.z3,data=myRL)
> BF.p4 <- lmBF(logTime~temp.z+temp.z2+temp.z3+temp.z4,data=myRL)
> c(BF.p1,BF.p2,BF.p3,BF.p4)
Bayes factor analysis
--------------
[1] temp.z                               : 9.149864e+15 ±0.01%
[2] temp.z + temp.z2                     : 3.15165e+16  ±0%
[3] temp.z + temp.z2 + temp.z3           : 2.577789e+15 ±0%
[4] temp.z + temp.z2 + temp.z3 + temp.z4 : 2.522228e+14 ±0%

Against denominator:
  Intercept only 
---
Bayes factor type: BFlinearModel, JZS
Each of these is the Bayes factor relative the intercept-only model. The model with the highest Bayes factor is preferred; here that is the quadratic, as expected. We also see that the cubic and quartic models are not really in the running at all. To get the Bayes factor comparing quadratic to linear, take their ratio.compare the quadratic to the linear, take the ratio of their Bayes factors:
> BF.p2/BF.p1
Bayes factor analysis
--------------
[1] temp.z + temp.z2 : 3.444477 ±0.01%

Against denominator:
  logTime ~ temp.z 
---
Bayes factor type: BFlinearModel, JZS

There is a positive preference for the quadratic over the linear, but the size of the Bayes factor is smaller than I would have expected.

We remark on the obvious, which is that the BayesFactor package was able to fit the standard monomial parameterization of the polynomials, but bglmm() could not. This is another gift of the g-prior. On the other hand, orthogonal polynomials don’t work with lmBF().
> lmBF(logTime~poly(temp.z,4),data=myRL)
Error in `[.data.frame`(data, , vars): undefined columns selected
To look at the parameters, get posterior samples and then summarize them.
> samples <- posterior(BF.p2,iterations=4000)
> summary(samples)

Iterations = 1:4000
Thinning interval = 1 
Number of chains = 1 
Sample size per chain = 4000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

              Mean        SD  Naive SE Time-series SE
mu       1.465e+00 1.625e-02 2.569e-04      2.569e-04
temp.z  -4.451e-02 1.263e-02 1.997e-04      1.997e-04
temp.z2  7.742e-05 2.992e-05 4.730e-07      4.730e-07
sig2     9.497e-03 4.605e-03 7.282e-05      9.223e-05
g        1.041e+01 2.876e+01 4.548e-01      4.548e-01

2. Quantiles for each variable:

              2.5%        25%        50%        75%     97.5%
mu       1.435e+00  1.455e+00  1.465e+00  1.4750283  1.496471
temp.z  -6.639e-02 -5.233e-02 -4.483e-02 -0.0369712 -0.022026
temp.z2  2.422e-05  5.972e-05  7.808e-05  0.0000959  0.000129
sig2     5.829e-03  7.744e-03  9.021e-03  0.0106433  0.015591
g        1.037e+00  2.545e+00  4.618e+00  9.1681394 56.675281

The intervals for the linear and quadratic coefficients are close to what we obtained (more laboriously) using bglmm().

For completeness, we really should look at the MCMC samples to see if there are problems. We see an odd value or two at the beginning of the samples, but everything settles down with no obvious problems.
> plot(samples[,1:2])

> plot(samples[,3:4])

> tmp <- apply(samples,2,acf)