Up: Stat 5132

Midterm 2

Problem 1

(a)

This is a special case of homework problem 8-4 in the notes. The problem statement says that

\begin{displaymath}
\mu = \bar{x} + \frac{s}{\sqrt{n}} T,\end{displaymath}

where $T \sim t(n - 1)$. (Here $\bar{x}$ and $s$ are fixed numbers calculated from the data, and $\mu$ denotes a random variable with the posterior distribution of $\mu$ given the data.)

The distribution of $T$ is symmetric about zero, so $E(T) = 0$ when the expectation exists (which the problem says to assume). Thus by linearity of expectation $E(\mu \vert \text{data}) = \bar{x}$.

(b)

Same as part (a). When a distribution has a center of symmetry, the center of symmetry is the median as well as the mean (when the mean exists). Thus $\bar{x}$ is also the posterior median.

Problem 2

(a)

The likelihood is

\begin{displaymath}
L(\mu) = \mu^x e^{- \mu}.\end{displaymath}

The prior is

\begin{displaymath}
g(\mu) \propto \mu^{\alpha - 1} e^{- \beta \mu}\end{displaymath}

omitting constants that are functions of $\alpha$ and $\beta$ but not $\mu$ (normalizing constants of the prior cancel out in calculating the posterior). Thus the unnormalized posterior is

\begin{displaymath}
h(\mu \vert x) \propto
 \mu^x e^{- \mu}
 \mu^{\alpha - 1} e^{- \beta \mu}
 =
 \mu^{x + \alpha - 1} e^{- (1 + \beta) \mu}\end{displaymath}

This is an unnormalized $\text{Gam}(x + \alpha, 1 + \beta)$ density. Thus that is the posterior distribution of $\mu$ given X.

(b)

The mean of a $\text{Gam}(a, b)$ distribution is $a / b$ (equation (7) on p. 174 in Lindgren). Hence the mean here is

\begin{displaymath}
E(\mu \vert x) = \frac{x + \alpha}{1 + \beta}\end{displaymath}
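
As a quick check of the conjugate update above, here is a minimal Python sketch. The function name `gamma_posterior` and the example numbers are illustrative assumptions, not part of the exam. It assumes a single observation $x$ with likelihood proportional to $\mu^x e^{-\mu}$ and a $\text{Gam}(\alpha, \beta)$ prior, and returns the posterior shape, rate, and mean.

```python
def gamma_posterior(x, alpha, beta):
    """Posterior Gam(shape, rate) parameters and mean.

    Assumes likelihood proportional to mu**x * exp(-mu) (one observation)
    and a Gam(alpha, beta) prior with beta the rate parameter.
    """
    shape = x + alpha        # exponent of mu, plus one
    rate = 1 + beta          # coefficient of mu in the exponential
    return shape, rate, shape / rate


# Hypothetical inputs chosen only for illustration.
shape, rate, mean = gamma_posterior(x=3, alpha=2.0, beta=1.0)
print(shape, rate, mean)
```

With several independent observations the same argument would add the sum of the counts to the shape and the sample size to the rate, but that case is not part of this problem.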

Problem 3

(a)

Part (a) of this problem is essentially the same as Problem 3 on the first midterm. The p. d. f. in this problem is obtained from the p. d. f. in that problem by the transformation $y = 1 / x$. The MLE in that problem was found to be

\begin{displaymath}
\hat{\theta}_n(\mathbf{X}) = - \frac{1}{\frac{1}{n} \sum_{i = 1}^n \log X_i}\end{displaymath}

If we write the data for this problem as $Y_i = 1 / X_i$, then $\log Y_i = - \log X_i$, so the MLE in terms of the $Y_i$ is

\begin{displaymath}
\hat{\theta}_n(\mathbf{Y}) = \frac{1}{\frac{1}{n} \sum_{i = 1}^n \log Y_i}\end{displaymath}

It was not expected that anyone actually do the problem this way. We only mention this to explain why the following solution should seem familiar.

The joint p. d. f. is
\begin{align*}
f_\theta(\mathbf{x})
 & =
 \prod_{i = 1}^n \theta x_i^{- (\theta + 1)}
 \\ & =
 \theta^n \left(\prod_{i = 1}^n x_i\right)^{- (\theta + 1)}\end{align*}
Hence the log likelihood is
\begin{align*}
l_n(\theta)
 & =
 n \log(\theta) - (\theta + 1) \log\left(\prod_{i = 1}^n x_i\right)
 \\ & =
 n \log(\theta) - (\theta + 1) \sum_{i = 1}^n \log(x_i)\end{align*}
To simplify notation, define

\begin{displaymath}
W_n = \frac{1}{n} \sum_{i = 1}^n \log(X_i)\end{displaymath}

so the log likelihood becomes
\begin{align*}
l_n(\theta)
 & =
 n \log(\theta) - (\theta + 1) n w_n
 \\ & =
 n \log(\theta) - n \theta w_n - n w_n\end{align*}
If we prefer, we can drop the last term, which does not contain the parameter, obtaining

\begin{displaymath}
l_n(\theta)
 =
 n \log(\theta) - n \theta w_n\end{displaymath}

The score function is

\begin{displaymath}
l_n'(\theta) = \frac{n}{\theta} - n w_n\end{displaymath}

Solving the ``likelihood equation'' $l_n'(\theta) = 0$ gives

\begin{displaymath}
\hat{\theta}_n(\mathbf{X}) = \frac{1}{W_n}\end{displaymath}
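
The estimator is easy to try numerically. The following is a minimal Python sketch, assuming the density $\theta x^{-(\theta + 1)}$ has support $x > 1$ (the support is not restated in this solution, so that is an assumption), which is the distribution simulated by the standard library's `random.paretovariate`. The function name `pareto_mle`, the seed, the sample size, and the true parameter value are all illustrative choices.

```python
import math
import random


def pareto_mle(xs):
    """MLE of theta for the density theta * x**(-(theta + 1)).

    Implements theta_hat = 1 / W_n with W_n the average of log(x_i).
    """
    w_n = sum(math.log(x) for x in xs) / len(xs)
    return 1.0 / w_n


# Simulated data; theta_true is a hypothetical value for illustration.
random.seed(0)
theta_true = 2.0
sample = [random.paretovariate(theta_true) for _ in range(10_000)]
print(pareto_mle(sample))  # should land near theta_true
```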

(b)

We have two options for calculating Fisher information. We can calculate the variance of $l_n'(\theta)$, or we can calculate the negative expectation of the second derivative of the log likelihood

\begin{displaymath}
l_n''(\theta) = - \frac{n}{\theta^2}.\end{displaymath}

The second option is much simpler, because $l_n''(\theta)$ does not contain any random variables so the expectation is trivial (expectation of a constant is the constant). Thus the expected Fisher information is

\begin{displaymath}
I_n(\theta) = - E\{l_n''(\theta)\} = \frac{n}{\theta^2}\end{displaymath}
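
For comparison, the first option gives the same answer. The following is only a sketch, and it assumes the density is $\theta x^{- (\theta + 1)}$ on $x > 1$ (the support is not restated in this solution). Under that assumption $P(\log X_i > t) = e^{- \theta t}$ for $t > 0$, so $\log X_i$ is exponential with rate $\theta$ and $\mathop{\rm var}\nolimits(\log X_i) = 1 / \theta^2$. Since $l_n'(\theta) = n / \theta - n W_n$ with $W_n$ the average of the $\log X_i$,

\begin{displaymath}
\mathop{\rm var}\nolimits\{l_n'(\theta)\}
 =
 n^2 \mathop{\rm var}\nolimits(W_n)
 =
 n^2 \cdot \frac{1}{n \theta^2}
 =
 \frac{n}{\theta^2}\end{displaymath}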

(c)

The general formula for a 95% confidence interval for the parameter based on the asymptotic distribution of the MLE is

\begin{displaymath}
\hat{\theta}_n \pm 1.96 \frac{1}{\sqrt{I_n(\hat{\theta}_n)}}\end{displaymath}

which in this case works out to

\begin{displaymath}
\hat{\theta}_n \pm 1.96 \frac{\hat{\theta}_n}{\sqrt{n}}\end{displaymath}
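
The interval is a one-line computation once $\hat{\theta}_n$ is known. Here is a minimal Python sketch; the function name `wald_interval` and the example inputs are hypothetical and chosen only to illustrate the formula.

```python
import math


def wald_interval(theta_hat, n, z=1.96):
    """Asymptotic interval theta_hat +/- z * theta_hat / sqrt(n).

    Uses the plug-in Fisher information I_n(theta_hat) = n / theta_hat**2.
    """
    half_width = z * theta_hat / math.sqrt(n)
    return theta_hat - half_width, theta_hat + half_width


# Hypothetical values: theta_hat = 2.0 from a sample of size 100.
lower, upper = wald_interval(theta_hat=2.0, n=100)
print(lower, upper)
```

Because the half-width is proportional to $\hat{\theta}_n$, the lower endpoint is positive exactly when $\sqrt{n} > 1.96$, that is, for $n \ge 4$.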

Problem 4

An exact C. I. is derived using the pivotal quantity having the exact sampling distribution

\begin{displaymath}
\frac{(n - 1) S_n^2}{\sigma^2} \sim \text{chi}^2(n - 1)\end{displaymath}

found in Theorem 11 of Section 7.8 in Lindgren.

An equal-tailed confidence interval uses points $a^*$ and $b^*$ such that $P(a^* < \chi^2_{n-1} < b^*) = 95\%$. The confidence interval is the interval of $\sigma^2$ satisfying

\begin{displaymath}
a^* < \frac{(n - 1) S_n^2}{\sigma^2} < b^*\end{displaymath}

that is,

\begin{displaymath}
\frac{(n - 1) S_n^2}{b^*} < \sigma^2 < \frac{(n - 1) S_n^2}{a^*}\end{displaymath}

Looking in Table Vb in the row for 4 d. f. and in the columns for 2.5% and 97.5%, we find $a^* = 0.484$ and $b^* = 11.1$.

Since

\begin{displaymath}
\frac{(n - 1) S_n^2}{b^*}
 =
 \frac{4 \times 53.3}{11.1}
 =
 19.2\end{displaymath}

and

\begin{displaymath}
\frac{(n - 1) S_n^2}{a^*}
 =
 \frac{4 \times 53.3}{0.484}
 =
 440,\end{displaymath}

an exact 95% C. I. for $\sigma^2$ is (19.2, 440).
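
The arithmetic can be reproduced with a short Python sketch. The function name `variance_interval` is hypothetical; the inputs are the values used in this solution ($n = 5$ inferred from the 4 d. f., $S_n^2 = 53.3$, and the two table values), entered by hand rather than computed from a chi-square routine.

```python
def variance_interval(n, s2, lower_quantile, upper_quantile):
    """Confidence interval for sigma^2 from the chi-square pivot.

    Returns ((n - 1) * s2 / upper_quantile, (n - 1) * s2 / lower_quantile).
    The quantiles are supplied by the caller, e.g. from a printed table.
    """
    scaled = (n - 1) * s2
    return scaled / upper_quantile, scaled / lower_quantile


low, high = variance_interval(n=5, s2=53.3,
                              lower_quantile=0.484, upper_quantile=11.1)
print(round(low, 1), round(high))
```

Note that the larger quantile produces the lower endpoint, because $\sigma^2$ sits in the denominator of the pivot.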

Alternative Solutions

The problem did not explicitly ask for equal-tailed confidence intervals. The problem could have been answered with a semi-infinite interval like Example 8.9a in Lindgren, i. e., what you get when you take $a^* = 0$ and $b^* = 9.45$ (from the 95% column) or when you take $a^* = 0.711$ (from the 5% column) and $b^* = \infty$. The first choice gives an exact 95% C. I. for $\sigma^2$ of $(22.6, \infty)$, and the second choice gives an exact 95% C. I. for $\sigma^2$ of (0, 301).

Problem 5

There are zillions of consistent estimators. The question is to find one. The MLE is consistent but impossible to find in closed form, because the score function involves derivatives of gamma functions. Thus we turn to the method of moments. The mean and variance of the $\text{Beta}(s, t)$ distribution are
\begin{align*}
E(X) & = \frac{s}{s + t} \\
\mathop{\rm var}\nolimits(X) & = \frac{s t}{(s + t + 1) (s + t)^2}\end{align*}
(p. 176 in Lindgren). Plugging $s = t = \theta$ gives
\begin{align*}
E(X) & = \frac{1}{2} \\
\mathop{\rm var}\nolimits(X) & = \frac{1}{4 (2 \theta + 1)}\end{align*}
The first equation is useless because it does not contain $\theta$. Solving the second for $\theta$ gives

\begin{displaymath}
\theta = \frac{1}{8 \mathop{\rm var}\nolimits(X)} - \frac{1}{2}\end{displaymath}

Plugging in the empirical variance gives a method of moments estimator

\begin{displaymath}
\hat{\theta}_n = \frac{1}{8 S_n^2} - \frac{1}{2}\end{displaymath}

A slightly better estimator uses the fact that we know the mean is $\frac{1}{2}$, so a better estimator of the variance is

\begin{displaymath}
A_n = \frac{1}{n} \sum_{i = 1}^n (X_i - \tfrac{1}{2})^2\end{displaymath}

and replacing $S_n^2$ by $A_n$ in the definition of $\hat{\theta}_n$ above improves the estimate. But either estimator is consistent, which is all the question asked.
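
Both estimators can be tried on simulated data. The following is a minimal Python sketch using only the standard library. The function names, seed, sample size, and true parameter value are illustrative assumptions, and the sample variance is taken with the $n - 1$ denominator (what `statistics.variance` computes), which may differ from the convention intended for $S_n^2$ here; the choice does not affect consistency.

```python
import random
import statistics


def mom_sample_variance(xs):
    """Method of moments estimate 1 / (8 * S_n^2) - 1/2."""
    s2 = statistics.variance(xs)  # n - 1 denominator (an assumption)
    return 1.0 / (8.0 * s2) - 0.5


def mom_known_mean(xs):
    """Same estimate with S_n^2 replaced by A_n, which uses the known mean 1/2."""
    a_n = sum((x - 0.5) ** 2 for x in xs) / len(xs)
    return 1.0 / (8.0 * a_n) - 0.5


# Simulated Beta(theta, theta) data; theta_true is hypothetical.
random.seed(0)
theta_true = 3.0
sample = [random.betavariate(theta_true, theta_true) for _ in range(20_000)]
print(mom_sample_variance(sample), mom_known_mean(sample))
```

For a large sample both printed values should be near `theta_true`, in line with the consistency claim.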


Charles Geyer
3/16/1999