Saturday, 23 May 2015

Parametric Inference: Likelihood Ratio Test Problem 2

More on the Likelihood Ratio Test: the following problem is from Casella and Berger (2001), Exercise 8.12.

Problem

For samples of size $n=1,4,16,64,100$ from a normal population with mean $\mu$ and known variance $\sigma^2$, plot the power function of the following LRTs (Likelihood Ratio Tests). Take $\alpha = .05$.
  1. $H_0:\mu\leq 0$ versus $H_1:\mu>0$
  2. $H_0:\mu=0$ versus $H_1:\mu\neq 0$

Solution

  1. The LRT statistic is given by $$ \lambda(\mathbf{x})=\frac{\displaystyle\sup_{\mu\leq 0}\mathcal{L}(\mu|\mathbf{x})}{\displaystyle\sup_{-\infty<\mu<\infty}\mathcal{L}(\mu|\mathbf{x})}, \;\text{since }\sigma^2\text{ is known}. $$ The denominator can be expanded as follows: $$ \begin{aligned} \sup_{-\infty<\mu<\infty}\mathcal{L}(\mu|\mathbf{x})&=\sup_{-\infty<\mu<\infty}\prod_{i=1}^{n}\frac{1}{\sqrt{2\pi}\sigma}\exp\left[-\frac{(x_i-\mu)^2}{2\sigma^2}\right]\\ &=\sup_{-\infty<\mu<\infty}\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\displaystyle\sum_{i=1}^{n}\frac{(x_i-\mu)^2}{2\sigma^2}\right]\\ &=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\displaystyle\sum_{i=1}^{n}\frac{(x_i-\bar{x})^2}{2\sigma^2}\right],\\ &\quad\text{since }\bar{x}\text{ is the MLE of }\mu,\\ &=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\frac{n-1}{n-1}\displaystyle\sum_{i=1}^{n}\frac{(x_i-\bar{x})^2}{2\sigma^2}\right]\\ &=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\frac{(n-1)s^2}{2\sigma^2}\right], \end{aligned} $$ where $s^2$ is the sample variance, while the numerator is evaluated as follows: $$ \begin{aligned} \sup_{\mu\leq 0}\mathcal{L}(\mu|\mathbf{x})&=\sup_{\mu\leq 0}\prod_{i=1}^{n}\frac{1}{\sqrt{2\pi}\sigma}\exp\left[-\frac{(x_i-\mu)^2}{2\sigma^2}\right]\\ &=\sup_{\mu\leq 0}\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\displaystyle\sum_{i=1}^{n}\frac{(x_i-\mu)^2}{2\sigma^2}\right]. \end{aligned} $$ This expression is largest when the sum inside the exponential is smallest, and as a function of $\mu$ that sum is minimized at $\mu=\bar{x}$. If $\bar{x}\leq 0$, the restricted maximum is attained at $\mu=\bar{x}$, the numerator equals the denominator, and $\lambda(\mathbf{x})=1$, so $H_0$ is not rejected for any $c<1$. If $\bar{x}>0$, the sum grows as $\mu$ moves further below $\bar{x}$, so over $\mu\in(-\infty,0]$ the supremum is attained at the boundary $\mu=\mu_0=0$.
Hence, for $\bar{x}>0$, $$ \begin{aligned} \sup_{\mu\leq 0}\mathcal{L}(\mu|\mathbf{x})&=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\displaystyle\sum_{i=1}^{n}\frac{(x_i-\mu_0)^2}{2\sigma^2}\right]\\ &=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\displaystyle\sum_{i=1}^{n}\frac{(x_i-\bar{x}+\bar{x}-\mu_0)^2}{2\sigma^2}\right]\\ &=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left\{-\displaystyle\sum_{i=1}^{n}\left[\frac{(x_i-\bar{x})^2+2(x_i-\bar{x})(\bar{x}-\mu_0)+(\bar{x}-\mu_0)^2}{2\sigma^2}\right]\right\}\\ &=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\frac{(n-1)s^2+n(\bar{x}-\mu_0)^2}{2\sigma^2}\right],\\ &\quad\text{since the middle term sums to 0,}\\ &=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\frac{(n-1)s^2+n\bar{x}^2}{2\sigma^2}\right], \quad\text{since }\mu_0=0. \end{aligned} $$ So that, for $\bar{x}>0$, $$ \begin{equation} \label{eq:lrtre} \begin{aligned} \lambda(\mathbf{x})&=\frac{\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\frac{(n-1)s^2+n\bar{x}^2}{2\sigma^2}\right]}{\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\frac{(n-1)s^2}{2\sigma^2}\right]}\\ &=\exp\left[-\frac{n\bar{x}^2}{2\sigma^2}\right]. \end{aligned} \end{equation} $$ We reject the null hypothesis if $\lambda(\mathbf{x})\leq c$ for some $0<c<1$, that is $$ \begin{aligned} \exp\left[-\frac{n\bar{x}^2}{2\sigma^2}\right]&\leq c\\ -\frac{n\bar{x}^2}{2\sigma^2}&\leq \log c\\ \frac{n\bar{x}^2}{\sigma^2}&\geq -2\log c\\ \frac{\lvert\bar{x}\rvert}{\sigma/\sqrt{n}}&\geq\sqrt{-2\log c}=c'. \end{aligned} $$ Because $\lambda(\mathbf{x})<1$ only when $\bar{x}>0$, the absolute value can be dropped.
    Figure 1: Plot of Likelihood Ratio Test Statistic for $n = 4,\sigma = 1$.

    Hence, rejecting the null hypothesis if $\lambda(\mathbf{x})\leq c$ is equivalent to rejecting $H_0$ if $\frac{\bar{x}}{\sigma/\sqrt{n}}\geq c'$, where $c'\in[0,\infty)$. Figure 1 plots the LRT statistic. The shaded rejection region lies on the positive side because that is where the alternative, $H_1:\mu>0$, places $\mu$. A value of the LRT statistic small enough to reject $H_0$ means that the sample is explained better by a parameter in the alternative than by any parameter in the null. In that case we expect the sample to come from the model proposed by $H_1$, so the sample mean $\bar{x}$, which is an unbiased estimator of the population mean $\mu$ and the quantity the LRT statistic depends on, should fall in the shaded region on the side of the alternative.
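The equivalence between the two rejection rules can be checked numerically. The following is a minimal sketch, not part of the original solution: the function names `lrt_statistic` and `z_statistic` and the test values of $\bar{x}$ are illustrative assumptions, and the statistic is taken to be 1 when $\bar{x}\leq 0$, since the restricted and unrestricted maxima then coincide.

```python
import math


def lrt_statistic(xbar, n, sigma=1.0):
    """One-sided LRT statistic for H0: mu <= 0 with known sigma (hypothetical helper)."""
    if xbar <= 0:
        # The restricted MLE equals xbar, so numerator and denominator agree.
        return 1.0
    return math.exp(-n * xbar ** 2 / (2 * sigma ** 2))


def z_statistic(xbar, n, sigma=1.0):
    """Standardized sample mean, xbar / (sigma / sqrt(n))."""
    return xbar / (sigma / math.sqrt(n))


n, sigma = 4, 1.0                # same setting as Figure 1
c_prime = 1.645                  # cutoff on the z scale
c = math.exp(-c_prime ** 2 / 2)  # matching cutoff for lambda, from c' = sqrt(-2 log c)

for xbar in (-1.0, 0.0, 0.5, 0.9, 1.5):  # arbitrary illustrative sample means
    lam = lrt_statistic(xbar, n, sigma)
    z = z_statistic(xbar, n, sigma)
    # The last two columns should always agree.
    print(f"xbar={xbar:5.2f}  lambda={lam:.4f}  lambda<=c: {lam <= c}  z>=c': {z >= c_prime}")
```

For every sample mean tried, the two decisions coincide, which is what the algebra above predicts.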

    So the power function, that is, the probability of rejecting the null hypothesis when the true mean is $\mu$ (for $\mu\leq 0$ this is the probability of a Type I error, and for $\mu>0$ it is one minus the probability of a Type II error), is $$ \begin{aligned} \beta(\mu)&=\mathrm{P}\left[\frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}}\geq c'\right],\quad\mu_0=0\\ &=1-\mathrm{P}\left[\frac{\bar{x}+\mu-\mu-\mu_0}{\sigma/\sqrt{n}}< c'\right]\\ &=1-\mathrm{P}\left[\frac{\bar{x}-\mu}{\sigma/\sqrt{n}} + \frac{\mu-\mu_0}{\sigma/\sqrt{n}}< c'\right]\\ &=1-\mathrm{P}\left[\frac{\bar{x}-\mu}{\sigma/\sqrt{n}}< c'+ \frac{\mu_0-\mu}{\sigma/\sqrt{n}}\right]\\ &=1-\Phi\left[c'+ \frac{\mu_0-\mu}{\sigma/\sqrt{n}}\right]. \end{aligned} $$ The argument of $\Phi$ decreases as $\mu$ increases, so the $\Phi$ term decreases in $\mu$; since it is subtracted from 1, $\beta(\mu)$ is an increasing function of $\mu$. Its supremum over $\mu\leq\mu_0$ is therefore attained at $\mu_0$, so that for $\alpha=.05$, $$ \begin{aligned} \alpha&=\sup_{\mu\leq \mu_0}\beta(\mu)\\ .05&=\beta(\mu_0)=1-\Phi(c')\\ .95&=\Phi(c')\Rightarrow c'=1.645, \end{aligned} $$ since $$ \begin{aligned} \Phi(1.645)=\int_{-\infty}^{1.645}\frac{1}{\sqrt{2\pi}}\exp\left[-\frac{x^2}{2}\right]\operatorname{d}x=.9500151. \end{aligned} $$ Therefore, for $c'=1.645,\mu_0=0,\sigma=1$, the plot of the power function as a function of $\mu$ for different sample sizes $n$ is shown in Figure 2. For example, for $n=1$ we evaluate the function \begin{equation} \label{eq:powcomp} \begin{aligned} \beta(\mu)&=1-\Phi\left[c'+ \frac{\mu_0-\mu}{\sigma/\sqrt{n}}\right]\\ &=1-\Phi\left[1.645+ \frac{0-\mu}{1/\sqrt{1}}\right]\\ &=1-\int_{-\infty}^{\left(1.645+ \frac{0-\mu}{1/\sqrt{1}}\right)}\frac{1}{\sqrt{2\pi}}\exp\left[-\frac{x^2}{2}\right]\operatorname{d}x \end{aligned} \end{equation} over a grid of $\mu$ values. The resulting values are the $y$-coordinates of the curve.
For $n = 64$ $$ \begin{aligned} \beta(\mu)&=1-\Phi\left[c'+ \frac{\mu_0-\mu}{\sigma/\sqrt{n}}\right]\\ &=1-\Phi\left[1.645+ \frac{0-\mu}{1/\sqrt{64}}\right]\\ &=1-\int_{-\infty}^{\left(1.645+ \frac{0-\mu}{1/\sqrt{64}}\right)}\frac{1}{\sqrt{2\pi}}\exp\left[-\frac{x^2}{2}\right]\operatorname{d}x, \end{aligned} $$ and so on.
    Figure 2: Power Function for Different Values of $n$.
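The one-sided power function $\beta(\mu)=1-\Phi\left[c'+\frac{\mu_0-\mu}{\sigma/\sqrt{n}}\right]$ can be tabulated with a few lines of code. The sketch below is one possible way to do it, not the code behind Figure 2. It assumes $\sigma=1$ and $\mu_0=0$, uses `statistics.NormalDist` from the Python standard library for $\Phi$, and computes $c'$ from the inverse CDF (about 1.6449) rather than the rounded 1.645. The function name `power_one_sided` and the grid of $\mu$ values are illustrative choices.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def power_one_sided(mu, n, sigma=1.0, mu0=0.0, alpha=0.05):
    """Power of the size-alpha LRT of H0: mu <= mu0 versus H1: mu > mu0."""
    c_prime = STD_NORMAL.inv_cdf(1 - alpha)  # solves Phi(c') = 1 - alpha
    return 1 - STD_NORMAL.cdf(c_prime + (mu0 - mu) / (sigma / sqrt(n)))


sample_sizes = (1, 4, 16, 64, 100)
mu_grid = [i / 10 for i in range(-10, 21)]  # mu from -1.0 to 2.0 (assumed range)

# One curve per sample size; each list holds the y-values of the power curve.
curves = {n: [power_one_sided(mu, n) for mu in mu_grid] for n in sample_sizes}

for n in sample_sizes:
    print(f"n={n:3d}  beta(0)={power_one_sided(0.0, n):.4f}  "
          f"beta(0.5)={power_one_sided(0.5, n):.4f}")
```

Each curve passes through $\beta(0)=.05$, and at any fixed $\mu>0$ the power grows with $n$. The lists in `curves` can be passed to any plotting library to draw the curves.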

  2. The LRT statistic is given by $$ \lambda(\mathbf{x})=\frac{\displaystyle\sup_{\mu= 0}\mathcal{L}(\mu|\mathbf{x})}{\displaystyle\sup_{-\infty<\mu<\infty}\mathcal{L}(\mu|\mathbf{x})}, \;\text{since }\sigma^2\text{ is known}. $$ The denominator can be expanded as follows: $$ \begin{aligned} \sup_{-\infty<\mu<\infty}\mathcal{L}(\mu|\mathbf{x})&=\sup_{-\infty<\mu<\infty}\prod_{i=1}^{n}\frac{1}{\sqrt{2\pi}\sigma}\exp\left[-\frac{(x_i-\mu)^2}{2\sigma^2}\right]\\ &=\sup_{-\infty<\mu<\infty}\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\displaystyle\sum_{i=1}^{n}\frac{(x_i-\mu)^2}{2\sigma^2}\right]\\ &=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\displaystyle\sum_{i=1}^{n}\frac{(x_i-\bar{x})^2}{2\sigma^2}\right],\\ &\quad\;\text{since }\bar{x}\text{ is the MLE of }\mu,\\ &=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\frac{n-1}{n-1}\displaystyle\sum_{i=1}^{n}\frac{(x_i-\bar{x})^2}{2\sigma^2}\right]\\ &=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\frac{(n-1)s^2}{2\sigma^2}\right], \end{aligned} $$ and the numerator is evaluated as follows: $$ \begin{aligned} \sup_{\mu=0}\mathcal{L}(\mu|\mathbf{x})&=\sup_{\mu=0}\prod_{i=1}^{n}\frac{1}{\sqrt{2\pi}\sigma}\exp\left[-\frac{(x_i-\mu)^2}{2\sigma^2}\right]\\ &=\sup_{\mu=0}\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\displaystyle\sum_{i=1}^{n}\frac{(x_i-\mu)^2}{2\sigma^2}\right]\\ &=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\displaystyle\sum_{i=1}^{n}\frac{(x_i-0)^2}{2\sigma^2}\right]\\ &=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\frac{(n-1)s^2+n\bar{x}^2}{2\sigma^2}\right], \end{aligned} $$ where we skip some lines of the simplification since they were already worked out in part 1. Taking the ratio exactly as in Equation (1) gives $\lambda(\mathbf{x})=\exp\left[-\frac{n\bar{x}^2}{2\sigma^2}\right]$, here for every value of $\bar{x}$. So $\lambda(\mathbf{x})\leq c$ becomes $$ \begin{aligned} \exp\left[-\frac{n\bar{x}^2}{2\sigma^2}\right]&\leq c\\ -\frac{n\bar{x}^2}{2\sigma^2}&\leq \log c\\ \frac{\lvert\bar{x}-\mu_0\rvert}{\sigma/\sqrt{n}}&\geq\sqrt{-2\log c}=c',\quad \mu_0=0.
\end{aligned} $$ So rejecting the null hypothesis if $\lambda(\mathbf{x})\leq c$ is equivalent to rejecting $H_0$ if $\frac{\lvert\bar{x}\rvert}{\sigma/\sqrt{n}}\geq c'$. Because $H_1$ is two-sided, we reject $H_0$ if $\frac{\bar{x}}{\sigma/\sqrt{n}}\geq c'$ or $\frac{\bar{x}}{\sigma/\sqrt{n}}\leq -c'$. To illustrate this, consider Figure 3, where the two shaded regions are the lower and upper rejection regions.
    Figure 3: Plot of Likelihood Ratio Test Statistic for $n = 4,\sigma = 1$.

    So the power function is $$ \begin{aligned} \beta(\mu)&=\mathrm{P}\left[\frac{\lvert\bar{x}\rvert}{\sigma/\sqrt{n}}\geq c'\right]\\ &=1 - \mathrm{P}\left[\frac{\lvert\bar{x}\rvert}{\sigma/\sqrt{n}}< c'\right]\\ &=1 - \mathrm{P}\left[-c'<\frac{\bar{x}}{\sigma/\sqrt{n}}< c'\right]\\ &=1 - \left\{\mathrm{P}\left[\frac{\bar{x}}{\sigma/\sqrt{n}}< c'\right]-\mathrm{P}\left[\frac{\bar{x}}{\sigma/\sqrt{n}}< -c'\right]\right\}\\ &=1 - \left\{\mathrm{P}\left[\frac{\bar{x}+\mu-\mu}{\sigma/\sqrt{n}}< c'\right]-\mathrm{P}\left[\frac{\bar{x}+\mu-\mu}{\sigma/\sqrt{n}}< -c'\right]\right\}\\ &=1 - \mathrm{P}\left[\frac{\bar{x}-\mu}{\sigma/\sqrt{n}}< c'-\frac{\mu}{\sigma/\sqrt{n}}\right]+\mathrm{P}\left[\frac{\bar{x}-\mu}{\sigma/\sqrt{n}}< -c'-\frac{\mu}{\sigma/\sqrt{n}}\right]\\ &=\underbrace{1 - \Phi\left[c'-\frac{\mu}{\sigma/\sqrt{n}}\right]}_{\Phi_1}+\underbrace{\Phi\left[-c'-\frac{\mu}{\sigma/\sqrt{n}}\right]}_{\Phi_2}. \end{aligned} $$ Notice that $\Phi_1$ is increasing and $\Phi_2$ is decreasing as functions of $\mu$, so the power rises as $\mu$ moves away from 0 in either direction. We expect this, since the alternative hypothesis is two-sided. For a test of size $\alpha=.05$ we need $\beta(0)=2\left[1-\Phi(c')\right]=.05$, that is, $\Phi(c')=.975$ and $c'\approx 1.96$. To see the two-sided shape, consider Figure 4 for different values of $n$.
    Figure 4: Two-Sided Power Function for Different $n$.

    The points in the plot are computed by substituting $\sigma=1$, each value of $n$, and a grid of $\mu$ values into the power function, just as we did in Equation (2).
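The two-sided power function $\beta(\mu)=1-\Phi\left[c'-\frac{\mu}{\sigma/\sqrt{n}}\right]+\Phi\left[-c'-\frac{\mu}{\sigma/\sqrt{n}}\right]$ can be tabulated in the same way. This is again a hedged sketch rather than the code used for Figure 4. It assumes $\sigma=1$ and takes $c'$ as the $1-\alpha/2$ quantile of the standard normal (about 1.96), which is what a size-.05 two-sided test requires. The function name `power_two_sided` and the chosen $\mu$ values are illustrative.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def power_two_sided(mu, n, sigma=1.0, alpha=0.05):
    """Power of the size-alpha LRT of H0: mu = 0 versus H1: mu != 0."""
    c_prime = STD_NORMAL.inv_cdf(1 - alpha / 2)  # solves 2 * (1 - Phi(c')) = alpha
    shift = mu / (sigma / sqrt(n))
    upper_tail = 1 - STD_NORMAL.cdf(c_prime - shift)  # the Phi_1 term
    lower_tail = STD_NORMAL.cdf(-c_prime - shift)     # the Phi_2 term
    return upper_tail + lower_tail


for n in (1, 4, 16, 64, 100):
    values = "  ".join(f"{power_two_sided(mu, n):.4f}" for mu in (-0.5, 0.0, 0.5))
    print(f"n={n:3d}  beta(-0.5), beta(0), beta(0.5): {values}")
```

The printed values are symmetric about $\mu=0$, with a minimum of .05 at $\mu=0$, which matches the two-sided shape described above.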

References

  1. Casella, G. and Berger, R.L. (2001). Statistical Inference. Thomson Learning, Inc.
  2. Schönbrodt, F. Shading regions of the normal: The Stanine scale. Retrieved May 2015.
