Nonparametric Bootstrap
The data set consists of (Xi, Yi) pairs, from which we wish to estimate the correlation coefficient and get some idea of its sampling distribution.
The following R statements do a nonparametric bootstrap estimate of the sampling distribution of the correlation coefficient.
The R functions sample and sample.int (see the on-line help) sample with or without replacement from a finite population. Here we use sample.int, which samples from the integers from one to n (the data sample size).
Applying the result to the original data, we get bootstrap data x.star and y.star, which we use to calculate one random realization of the estimator, theta.star[i], each time through the loop.
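The R statements referred to above are not reproduced in this text, so here is a hedged sketch of what they might look like. The data set itself is not shown either, so x and y below are made-up stand-ins; the names n, nboot, theta.hat, and theta.star follow the text, and the seed is ours.

```r
# sketch of the nonparametric bootstrap of the correlation coefficient;
# x and y are stand-in data because the real data set is not shown here
set.seed(42)
n <- 50
x <- rnorm(n)
y <- x + rnorm(n)
theta.hat <- cor(x, y)            # the estimator on the original data

nboot <- 999
theta.star <- double(nboot)
for (i in 1:nboot) {
    k <- sample.int(n, replace = TRUE)   # indices sampled with replacement
    x.star <- x[k]                       # bootstrap data
    y.star <- y[k]
    theta.star[i] <- cor(x.star, y.star) # one realization of the estimator
}
hist(theta.star)
```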
The histogram shows the sampling distribution of theta.star, which is assumed to be close to the sampling distribution of the actual estimator. More precisely, the distribution of theta.hat − theta is assumed to be close to the distribution of theta.star − theta.hat.

Bootstrap Percentile Intervals

The simplest method of making confidence intervals for the unknown parameter is to take the α ⁄ 2 and 1 − α ⁄ 2 quantiles of the bootstrap distribution of theta.star.
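In R the percentile interval is just a call to quantile applied to the bootstrap replicates. A self-contained sketch, again with made-up stand-in data (x, y, the seed, and the name ci are ours):

```r
# sketch of a bootstrap percentile interval for the correlation coefficient
set.seed(42)
n <- 50
x <- rnorm(n)                     # stand-in data
y <- x + rnorm(n)
theta.star <- replicate(999, {
    k <- sample.int(n, replace = TRUE)
    cor(x[k], y[k])
})
alpha <- 0.05
# the alpha/2 and 1 - alpha/2 quantiles of the bootstrap distribution
ci <- quantile(theta.star, probs = c(alpha / 2, 1 - alpha / 2))
```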
Other Bootstrap Confidence Intervals

Many different methods of making bootstrap confidence intervals have been proposed, far too many to cover in this course. The course on nonparametric inference (Stat 5601) usually covers them. Here are some web pages from the last time your instructor taught that course. These cover some but by no means all the methods.
Bootstrap Hypothesis Tests

The bootstrap doesn't do hypothesis tests in general, the reason being that the bootstrap has no general way to sample from (an analog of) the null hypothesis when the null hypothesis is not true. The bootstrap simulates from (an analog of) the true unknown distribution. Hence when the alternative hypothesis is true, the bootstrap samples from (an analog of) the alternative hypothesis, which is not what is wanted.
In special situations, one can cook up a bootstrap-like procedure that
can be claimed to simulate from (an analog of) the null hypothesis.
But there is no general procedure for that.
One can always invert bootstrap confidence intervals to perform a hypothesis
test about the parameter the confidence interval is for. This is a simple
application of the duality of tests and confidence intervals
(slide 206, deck 2).
Here is a web page from the last time your instructor taught Stat 5601
covering that.
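Inverting an interval into a test is mechanical: reject H0: theta = theta0 at level α exactly when theta0 falls outside the 1 − α confidence interval. A sketch with made-up data (the names theta0 and reject are ours):

```r
# sketch of a test obtained by inverting a bootstrap percentile interval
set.seed(42)
n <- 50
x <- rnorm(n)                     # stand-in data
y <- x + rnorm(n)
theta.star <- replicate(999, {
    k <- sample.int(n, replace = TRUE)
    cor(x[k], y[k])
})
alpha <- 0.05
ci <- quantile(theta.star, c(alpha / 2, 1 - alpha / 2))
theta0 <- 0                       # hypothesized value of the correlation
# duality: reject at level alpha iff theta0 is outside the interval
reject <- theta0 < ci[1] | theta0 > ci[2]
```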
Parametric Bootstrap

Here is our example of confidence intervals for mean values for a generalized linear model redone using the parametric bootstrap. The data set contains two variables: the response y, which is Bernoulli, and the predictor x, which is quantitative and whose distribution doesn't matter, since we condition on it. The following R statements fit the model and do a parametric bootstrap of the mean value for an individual whose x value is 25.
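The R statements themselves are not reproduced here, so the following is a hedged sketch of the idea. The real data set is not shown, so the data below are simulated stand-ins; the names mu.hat, mu.star, and nboot are ours.

```r
# sketch of a parametric bootstrap for a logistic GLM mean value at x = 25;
# the data are made up because the real data set is not shown here
set.seed(42)
n <- 100
x <- seq(10, 40, length = n)
y <- rbinom(n, 1, 1 / (1 + exp(-(-5 + 0.2 * x))))   # stand-in Bernoulli data

gout <- glm(y ~ x, family = binomial)
mu.hat <- predict(gout, newdata = data.frame(x = 25), type = "response")

nboot <- 999
mu.star <- double(nboot)
for (i in 1:nboot) {
    # parametric bootstrap: simulate new responses from the fitted model
    y.star <- rbinom(n, 1, fitted(gout))
    gout.star <- glm(y.star ~ x, family = binomial)
    mu.star[i] <- predict(gout.star, newdata = data.frame(x = 25),
        type = "response")
}
hist(mu.star)
```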
From the histogram of the parametric bootstrap distribution of the
estimator, we see we are a long way from asymptopia.
Bootstrap T Intervals

The generally accepted way to make parametric bootstrap confidence intervals is via bootstrap t procedures, which are analogous to t confidence intervals when the data are assumed normal. Here is a web page from the last time your instructor taught Stat 5601 covering that.
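For the GLM mean value example, a bootstrap t procedure studentizes each bootstrap replicate by its own standard error and then uses the quantiles of the studentized replicates (note the quantiles swap ends). A hedged sketch with made-up data (the names t.star, se.hat, and ci are ours):

```r
# sketch of a bootstrap t interval for the GLM mean value at x = 25;
# the data are stand-ins because the real data set is not shown here
set.seed(42)
n <- 100
x <- seq(10, 40, length = n)
y <- rbinom(n, 1, 1 / (1 + exp(-(-5 + 0.2 * x))))

gout <- glm(y ~ x, family = binomial)
pout <- predict(gout, newdata = data.frame(x = 25),
    type = "response", se.fit = TRUE)
mu.hat <- pout$fit
se.hat <- pout$se.fit

nboot <- 999
t.star <- double(nboot)
for (i in 1:nboot) {
    y.star <- rbinom(n, 1, fitted(gout))      # parametric bootstrap data
    gout.star <- glm(y.star ~ x, family = binomial)
    pout.star <- predict(gout.star, newdata = data.frame(x = 25),
        type = "response", se.fit = TRUE)
    # studentize each replicate by its own standard error
    t.star[i] <- (pout.star$fit - mu.hat) / pout.star$se.fit
}
alpha <- 0.05
# the upper quantile of t.star gives the lower end of the interval
ci <- mu.hat - se.hat * unname(quantile(t.star, c(1 - alpha / 2, alpha / 2)))
```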