Confidence Intervals
| Parameter | Confidence interval | Hypothesis test statistic |
| --- | --- | --- |
| \(\mu\), \(\sigma\) known | \(\overline{x}\pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\) (8.1) | \(z=\frac{\overline{x}-\mu_0}{\sigma/\sqrt{n}}\) (9.1) |
| \(\mu\), \(\sigma\) unknown | \(\overline{x}\pm t_{\alpha/2}(n-1)\frac{s}{\sqrt{n}}\) (8.2) | \(t=\frac{\overline{x}-\mu_0}{s/\sqrt{n}}\) (9.2) |
| \(\mu_1-\mu_2\), \(\sigma_1, \sigma_2\) known | \((\overline{x}_1-\overline{x}_2)\pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}\) (10.4) | \(z=\frac{(\overline{x}_1-\overline{x}_2)-D_0}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}}\) (10.5) |
| \(\mu_1-\mu_2\), \(\sigma_1=\sigma_2\) unknown | \((\overline{x}_1-\overline{x}_2)\pm t_{\alpha/2}(n_1+n_2-2)\sqrt{\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}}\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}\) | |
| \(\mu_1-\mu_2\), \(\sigma_1, \sigma_2\) unknown | \((\overline{x}_1-\overline{x}_2)\pm t_{\alpha/2}(\text{d.f.})\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}\) (10.6), with \(\text{d.f.}=\frac{\left(\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1-1}\left(\frac{s_1^2}{n_1}\right)^2+\frac{1}{n_2-1}\left(\frac{s_2^2}{n_2}\right)^2}\) (10.7) | \(t=\frac{(\overline{x}_1-\overline{x}_2)-D_0}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}\) (10.8), with d.f. as in (10.7) |
| \(p\) | \(\overline{p}\pm z_{\alpha/2}\sqrt{\frac{\overline{p}(1-\overline{p})}{n}}\) (8.6) | \(z=\frac{\overline{p}-p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}\) (9.4) |
| \(p_1-p_2\) | \((\overline{p}_1-\overline{p}_2)\pm z_{\alpha/2}\sqrt{\frac{\overline{p}_1(1-\overline{p}_1)}{n_1}+\frac{\overline{p}_2(1-\overline{p}_2)}{n_2}}\) (10.13) | \(z=\frac{\overline{p}_1-\overline{p}_2}{\sqrt{\overline{p}(1-\overline{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}\) (10.16) |
| \(\sigma^2\) | \(\left[\frac{(n-1)s^2}{\chi_{\alpha/2}^2}, \frac{(n-1)s^2}{\chi_{1-\alpha/2}^2}\right]\) (11.7) | \(\chi^2=\frac{(n-1)s^2}{\sigma_0^2}\) (11.8) |
| \(\frac{\sigma_1^2}{\sigma_2^2}\) | \(\left[\frac{1}{\text{F}_{\alpha/2}(n_1-1, n_2-1)}\frac{s_1^2}{s_2^2}, \text{F}_{\alpha/2}(n_2-1, n_1-1)\frac{s_1^2}{s_2^2}\right]\) | |
Motivation
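We start with a population and its mean \(\mu\), for example the average height of everyone in Taiwan. We cannot actually measure every person's height (by the time we finished with the last person, the first might have grown taller), so we sample some people, measure their heights, and use the resulting sample mean \(\bar{X}\) to estimate the population mean.
Because this is an estimate, some error is unavoidable, but we cannot let the error be arbitrarily large. We first fix an error range that we are willing to accept, that is, \[ |\bar{X}-\mu|\leq \text{error} \]
Even with the error range fixed, we cannot guarantee that the error always falls within it. We can only state the probability that it does, and adjust that probability as needed, that is, \[ P(|\bar{X}-\mu|\leq \text{error})=1-\alpha \] Writing the probability as \(1-\alpha\) is a convention among statisticians; \(1-\alpha\) is called the confidence coefficient.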
Assume first that \(X\sim \text{normal}(\mu, \sigma^2)\), so that \(\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\sim \text{normal}(0, 1)\). Let \(z_{\alpha/2}\) be the point whose upper-tail probability under \(\text{normal}(0, 1)\) is \(\alpha/2\). Then \[ P\left(-z_{\alpha/2}\leq \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \leq z_{\alpha/2}\right) =\int_{-z_{\alpha/2}}^{z_{\alpha/2}} \frac{1}{\sqrt{2\pi}}e^{-x^2/2}\,dx =1-\alpha. \] This is easiest to remember as a picture: it is the area under the standard normal density between \(-z_{\alpha/2}\) and \(z_{\alpha/2}\).
So we rewrite the probability above, choosing \(\text{error}\) so that \(\frac{\text{error}}{\sigma/\sqrt{n}}=z_{\alpha/2}\): \[ \begin{array}{ll} & P(|\bar{X}-\mu|\leq \text{error}) \\ = & P(-\text{error}\leq \bar{X}-\mu\leq \text{error}) \\ = & P(-\frac{\text{error}}{\sigma/\sqrt{n}}\leq \frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\leq \frac{\text{error}}{\sigma/\sqrt{n}}) \\ = & P(-z_{\alpha/2}\leq \frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\leq z_{\alpha/2}) \\ = & 1-\alpha \end{array} \]
Rewriting the inequality inside the probability function \(P\) is legitimate because equivalent inequalities describe the same event and therefore have the same probability. The equivalences we need are \[ \begin{array}{ll} & -z_{\alpha/2}\leq \frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\leq z_{\alpha/2} \\ \Leftrightarrow & -z_{\alpha/2}\leq \frac{\mu-\bar{X}}{\sigma/\sqrt{n}}\leq z_{\alpha/2} \\ \Leftrightarrow & -z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\leq \mu-\bar{X}\leq z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \\ \Leftrightarrow & \bar{X}-z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\leq \mu\leq \bar{X}+z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \end{array} \]
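Note that \(z_{\alpha/2}=\frac{\text{error}}{\sigma/\sqrt{n}}\), or equivalently \(\sqrt{n}\,\text{error}=z_{\alpha/2}\sigma\), so for a given \(\sigma\):
- With \(\text{error}\) fixed, \(\sqrt{n}\) is proportional to \(z_{\alpha/2}\), that is, \(n\) is proportional to \(z_{\alpha/2}^2\).
- With \(n\) fixed, \(\text{error}\) is proportional to \(z_{\alpha/2}\).
- With \(z_{\alpha/2}\) fixed, \(\text{error}\) is inversely proportional to \(\sqrt{n}\).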
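The relation \(\sqrt{n}\,\text{error}=z_{\alpha/2}\sigma\) can be solved for the sample size, \(n=(z_{\alpha/2}\sigma/\text{error})^2\). The following is a minimal sketch under the assumption that \(\sigma\) is known; the function name `required_sample_size` and the numbers in the usage lines are illustrative and do not come from the text.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, error, confidence=0.95):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= error (sigma assumed known)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of normal(0, 1)
    return math.ceil((z * sigma / error) ** 2)


# Hypothetical numbers: sigma = 10 and a tolerated error of 2.
n_95 = required_sample_size(sigma=10, error=2, confidence=0.95)
n_99 = required_sample_size(sigma=10, error=2, confidence=0.99)
print(n_95, n_99)
```

Raising the confidence coefficient with the error held fixed increases the required \(n\), in line with the first bullet above.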
Commonly used values of \(1-\alpha\) and the corresponding \(z_{\alpha/2}\) are \[ \begin{array}{ll} 1-\alpha=0.90, & z_{\alpha/2}=1.645; \\ 1-\alpha=0.95, & z_{\alpha/2}=1.96; \\ 1-\alpha=0.99, & z_{\alpha/2}=2.576. \end{array} \]
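These critical values can be reproduced without a table. A small sketch using only the Python standard library; the helper name `z_critical` is my own:

```python
from statistics import NormalDist


def z_critical(confidence):
    """Return z_{alpha/2}, the point of normal(0, 1) with upper-tail area alpha/2."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for confidence in (0.90, 0.95, 0.99):
    print(f"1-alpha={confidence:.2f}  z_alpha/2={z_critical(confidence):.3f}")
```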
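Intuitively, take \(1-\alpha=0.95\): if we draw a sample 100 times, compute the sample mean each time, and build the 100 resulting confidence intervals, then about 95 of those intervals will contain the population mean.
The textbook puts it this way (for a \(90\%\) interval): For a particular sample, this interval either does or does not contain the mean \(\mu\). However, if many such intervals were calculated, about \(90\%\) of them should contain the mean \(\mu\).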
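The repeated-sampling interpretation can be checked by simulation. The sketch below assumes a normal population with known \(\sigma\); the population values (mean 170, standard deviation 8), the sample size, and the function name `coverage` are made up for illustration.

```python
import random
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        # Each trial produces a different interval; mu itself never changes.
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials


# Hypothetical population of heights.
rate = coverage(mu=170, sigma=8, n=25, confidence=0.95, trials=2000)
print(f"{rate:.3f}")  # expected to land near 0.95
```

Any single interval either contains \(\mu\) or it does not; the \(95\%\) describes the long-run fraction of intervals that do.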
Confidence Intervals for Means, Variance is Known
- Assume \(X\sim \text{normal}(\mu, \sigma^2)\), so that \(\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\sim \text{normal}(0, 1)\).
- \[ P\left(-z_{\alpha/2}\leq \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \leq z_{\alpha/2}\right)=1-\alpha \]
- \[ \begin{array}{llll} & -z_{\alpha/2}\leq \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \leq z_{\alpha/2} \\ \Leftrightarrow & -z_{\alpha/2}\leq \frac{\mu-\bar{X}}{\sigma/\sqrt{n}} \leq z_{\alpha/2} \\ \Leftrightarrow & -z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\leq \mu-\bar{X} \leq z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \\ \Leftrightarrow & \bar{X}-z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\leq \mu \leq \bar{X}+z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \end{array} \]
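The last line of the derivation translates directly into code. A minimal sketch, assuming \(\sigma\) is known; the data and the value \(\sigma=2\) are invented for illustration, and `z_interval` is a hypothetical helper name.

```python
import math
from statistics import NormalDist, fmean


def z_interval(sample, sigma, confidence=0.95):
    """Interval x_bar +/- z_{alpha/2} * sigma / sqrt(n), with sigma assumed known."""
    n = len(sample)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / math.sqrt(n)
    x_bar = fmean(sample)
    return x_bar - half_width, x_bar + half_width


# Made-up measurements; sigma = 2.0 is an assumption, not an estimate.
data = [9.1, 10.4, 11.2, 8.7, 10.0, 9.6, 10.8, 10.2, 9.9]
low, high = z_interval(data, sigma=2.0)
print(f"[{low:.3f}, {high:.3f}]")
```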
Confidence Intervals for Means, Variance is Unknown
- Since \(\sigma^2\) is unknown, we replace \(\sigma^2\) with \(S^2\), and then \[ \frac{\bar{X}-\mu}{S/\sqrt{n}} =\frac{\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}}{S/\sigma} =\frac{\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}}{\sqrt{\frac{\frac{(n-1)S^2}{\sigma^2}}{n-1}}} \stackrel{\text{Theorem 5.3.1, Definition 5.3.4}}{\sim} \text{t}(n-1) \]
- \[ P\left(-t_{\alpha/2}(n-1)\leq \frac{\bar{X}-\mu}{S/\sqrt{n}} \leq t_{\alpha/2}(n-1)\right)=1-\alpha \]
- \[ \begin{array}{ll} & -t_{\alpha/2}(n-1)\leq \frac{\bar{X}-\mu}{S/\sqrt{n}} \leq t_{\alpha/2}(n-1) \\ \Leftrightarrow & -t_{\alpha/2}(n-1)\leq \frac{\mu-\bar{X}}{S/\sqrt{n}} \leq t_{\alpha/2}(n-1) \\ \Leftrightarrow & -t_{\alpha/2}(n-1)\frac{S}{\sqrt{n}}\leq \mu-\bar{X} \leq t_{\alpha/2}(n-1)\frac{S}{\sqrt{n}} \\ \Leftrightarrow & \bar{X}-t_{\alpha/2}(n-1)\frac{S}{\sqrt{n}}\leq \mu \leq \bar{X}+t_{\alpha/2}(n-1)\frac{S}{\sqrt{n}} \end{array} \]
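A sketch of the resulting interval in code. To stay within the standard library, the critical value \(t_{\alpha/2}(n-1)\) is passed in as an argument, as if read from a t table; the data are made up, and `t_interval` is a hypothetical helper name.

```python
import math
from statistics import fmean, stdev


def t_interval(sample, t_crit):
    """Interval x_bar +/- t_crit * s / sqrt(n), where t_crit = t_{alpha/2}(n-1)."""
    n = len(sample)
    x_bar = fmean(sample)
    s = stdev(sample)  # sample standard deviation, dividing by n - 1
    half_width = t_crit * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Made-up data with n = 10, so d.f. = 9; t_{0.025}(9) = 2.262 for 95% confidence.
data = [4.8, 5.1, 5.3, 4.9, 5.0, 5.4, 4.7, 5.2, 5.0, 5.1]
low, high = t_interval(data, t_crit=2.262)
print(f"[{low:.3f}, {high:.3f}]")
```

For the same data the interval is wider than one built with \(z_{0.025}=1.96\), which reflects the extra uncertainty from estimating \(\sigma\) by \(S\).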
Confidence Intervals for the Difference of Two Means, Variances are Known
- \[ \begin{array}{cl} & X\sim \text{normal}(\mu_X, \sigma_X^2), Y\sim \text{normal}(\mu_Y, \sigma_Y^2) \\ \stackrel{\text{Theorem 5.3.1}}{\Rightarrow} & \bar{X}\sim \text{normal}(\mu_X, \frac{\sigma_X^2}{n}), \bar{Y}\sim \text{normal}(\mu_Y, \frac{\sigma_Y^2}{m}) \\ \stackrel{\text{Theorem 4.2.14, Theorem 2.3.4}}{\Rightarrow} & \bar{X}-\bar{Y}\sim \text{normal}(\mu_X-\mu_Y, \frac{\sigma_X^2}{n}+\frac{\sigma_Y^2}{m}) \\ \stackrel{\text{p.102, line -1}}{\Rightarrow} & \frac{(\bar{X}-\bar{Y})-(\mu_X-\mu_Y)}{\sqrt{\frac{\sigma_X^2}{n}+\frac{\sigma_Y^2}{m}}} \sim \text{normal}(0, 1) \end{array} \]
- \[ P\left(-z_{\alpha/2}\leq \frac{(\bar{X}-\bar{Y})-(\mu_X-\mu_Y)}{\sqrt{\frac{\sigma_X^2}{n}+\frac{\sigma_Y^2}{m}}} \leq z_{\alpha/2}\right)=1-\alpha \]
- \[ \begin{array}{ll} & -z_{\alpha/2}\leq \frac{(\bar{X}-\bar{Y})-(\mu_X-\mu_Y)}{\sqrt{\frac{\sigma_X^2}{n}+\frac{\sigma_Y^2}{m}}} \leq z_{\alpha/2} \\ \Leftrightarrow & -z_{\alpha/2}\leq \frac{(\mu_X-\mu_Y)-(\bar{X}-\bar{Y})}{\sqrt{\frac{\sigma_X^2}{n}+\frac{\sigma_Y^2}{m}}} \leq z_{\alpha/2} \\ \Leftrightarrow & -z_{\alpha/2}\sqrt{\frac{\sigma_X^2}{n}+\frac{\sigma_Y^2}{m}}\leq (\mu_X-\mu_Y)-(\bar{X}-\bar{Y}) \leq z_{\alpha/2}\sqrt{\frac{\sigma_X^2}{n}+\frac{\sigma_Y^2}{m}} \\ \Leftrightarrow & (\bar{X}-\bar{Y})-z_{\alpha/2}\sqrt{\frac{\sigma_X^2}{n}+\frac{\sigma_Y^2}{m}} \leq \mu_X-\mu_Y \leq (\bar{X}-\bar{Y})+z_{\alpha/2}\sqrt{\frac{\sigma_X^2}{n}+\frac{\sigma_Y^2}{m}} \end{array} \]
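The two-sample interval with known variances can be sketched the same way. The samples and the values of \(\sigma_X\) and \(\sigma_Y\) below are invented, and `two_mean_z_interval` is a hypothetical name.

```python
import math
from statistics import NormalDist, fmean


def two_mean_z_interval(xs, ys, sigma_x, sigma_y, confidence=0.95):
    """Interval for mu_X - mu_Y when sigma_X and sigma_Y are assumed known."""
    n, m = len(xs), len(ys)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(sigma_x ** 2 / n + sigma_y ** 2 / m)  # std. dev. of X_bar - Y_bar
    diff = fmean(xs) - fmean(ys)
    return diff - z * se, diff + z * se


# Made-up samples; the two sigmas are assumptions.
xs = [12.1, 11.8, 12.6, 12.3]
ys = [10.9, 11.4, 11.0, 11.2, 11.5]
low, high = two_mean_z_interval(xs, ys, sigma_x=0.5, sigma_y=0.4)
print(f"[{low:.3f}, {high:.3f}]")
```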
Confidence Intervals for the Difference of Two Means, Variances are Unknown and Equal
- \[ \begin{array}{cl} & X\sim \text{normal}(\mu_X, \sigma_X^2), Y\sim \text{normal}(\mu_Y, \sigma_Y^2) \\ \stackrel{\text{Theorem 5.3.1}}{\Rightarrow} & \bar{X}\sim \text{normal}(\mu_X, \frac{\sigma_X^2}{n}), \bar{Y}\sim \text{normal}(\mu_Y, \frac{\sigma_Y^2}{m}) \\ \stackrel{\text{Theorem 4.2.14, Theorem 2.3.4}}{\Rightarrow} & \bar{X}-\bar{Y}\sim \text{normal}(\mu_X-\mu_Y, \frac{\sigma_X^2}{n}+\frac{\sigma_Y^2}{m}) \\ \stackrel{\text{p.102, line -1}}{\Rightarrow} & \frac{(\bar{X}-\bar{Y})-(\mu_X-\mu_Y)}{\sqrt{\frac{\sigma_X^2}{n}+\frac{\sigma_Y^2}{m}}} \sim \text{normal}(0, 1) \end{array} \] Furthermore, \[ \begin{array}{cl} \text{By Theorem 5.3.1} & \frac{(n-1)S_X^2}{\sigma^2}\sim \chi_{n-1}^2, \frac{(m-1)S_Y^2}{\sigma^2}\sim \chi_{m-1}^2 \\ \stackrel{\text{Lemma 5.3.2}}{\Rightarrow} & \frac{(n-1)S_X^2}{\sigma^2}+\frac{(m-1)S_Y^2}{\sigma^2}\sim \chi_{n+m-2}^2 \\ \stackrel{\text{Definition 5.3.4}}{\Rightarrow} & \frac{\frac{(\bar{X}-\bar{Y})-(\mu_X-\mu_Y)}{\sqrt{\frac{\sigma_X^2}{n}+\frac{\sigma_Y^2}{m}}}}{\sqrt{\frac{\frac{(n-1)S_X^2}{\sigma^2}+\frac{(m-1)S_Y^2}{\sigma^2}}{n+m-2}}} \sim \text{t}(n+m-2) \\ \stackrel{\sigma_X=\sigma_Y=\sigma}{\Rightarrow} & \frac{(\bar{X}-\bar{Y})-(\mu_X-\mu_Y)}{\sqrt{\frac{(n-1)S_X^2+(m-1)S_Y^2}{n+m-2}}\sqrt{\frac{1}{n}+\frac{1}{m}}} \sim \text{t}(n+m-2) \end{array} \]
- \[ P\left(-t_{\alpha/2}(n+m-2) \leq \frac{(\bar{X}-\bar{Y})-(\mu_X-\mu_Y)}{\sqrt{\frac{(n-1)S_X^2+(m-1)S_Y^2}{n+m-2}}\sqrt{\frac{1}{n}+\frac{1}{m}}} \leq t_{\alpha/2}(n+m-2)\right)=1-\alpha \]
- \[ \begin{array}{ll} & -t_{\alpha/2}(n+m-2) \leq \frac{(\bar{X}-\bar{Y})-(\mu_X-\mu_Y)}{\sqrt{\frac{(n-1)S_X^2+(m-1)S_Y^2}{n+m-2}}\sqrt{\frac{1}{n}+\frac{1}{m}}} \leq t_{\alpha/2}(n+m-2) \\ \Leftrightarrow & -t_{\alpha/2}(n+m-2) \leq \frac{(\mu_X-\mu_Y)-(\bar{X}-\bar{Y})}{\sqrt{\frac{(n-1)S_X^2+(m-1)S_Y^2}{n+m-2}}\sqrt{\frac{1}{n}+\frac{1}{m}}} \leq t_{\alpha/2}(n+m-2) \\ \Leftrightarrow & -t_{\alpha/2}(n+m-2)\sqrt{\frac{(n-1)S_X^2+(m-1)S_Y^2}{n+m-2}}\sqrt{\frac{1}{n}+\frac{1}{m}} \leq (\mu_X-\mu_Y)-(\bar{X}-\bar{Y}) \leq t_{\alpha/2}(n+m-2)\sqrt{\frac{(n-1)S_X^2+(m-1)S_Y^2}{n+m-2}}\sqrt{\frac{1}{n}+\frac{1}{m}} \\ \Leftrightarrow & (\bar{X}-\bar{Y})-t_{\alpha/2}(n+m-2)\sqrt{\frac{(n-1)S_X^2+(m-1)S_Y^2}{n+m-2}}\sqrt{\frac{1}{n}+\frac{1}{m}} \leq \mu_X-\mu_Y \leq (\bar{X}-\bar{Y})+t_{\alpha/2}(n+m-2)\sqrt{\frac{(n-1)S_X^2+(m-1)S_Y^2}{n+m-2}}\sqrt{\frac{1}{n}+\frac{1}{m}} \end{array} \]
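A sketch of the pooled interval, assuming the two population variances are equal. The critical value \(t_{\alpha/2}(n+m-2)\) is supplied from a t table; the data and the name `pooled_t_interval` are illustrative only.

```python
import math
from statistics import fmean, variance


def pooled_t_interval(xs, ys, t_crit):
    """Interval for mu_X - mu_Y with equal unknown variances; t_crit = t_{alpha/2}(n+m-2)."""
    n, m = len(xs), len(ys)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n - 1) * variance(xs) + (m - 1) * variance(ys)) / (n + m - 2)
    se = math.sqrt(pooled_var) * math.sqrt(1 / n + 1 / m)
    diff = fmean(xs) - fmean(ys)
    return diff - t_crit * se, diff + t_crit * se


# Made-up samples with n = m = 5, so d.f. = 8; t_{0.025}(8) = 2.306 for 95% confidence.
xs = [20.1, 19.8, 20.5, 20.3, 19.9]
ys = [19.2, 19.6, 19.0, 19.5, 19.3]
low, high = pooled_t_interval(xs, ys, t_crit=2.306)
print(f"[{low:.3f}, {high:.3f}]")
```

Both endpoints are centred on \(\bar{X}-\bar{Y}\), the point estimate of \(\mu_X-\mu_Y\).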
Confidence Intervals for the Difference of Two Means, Variances are Unknown and Unequal
Proof omitted.
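Even without the proof, the degrees of freedom in (10.7) are simple to evaluate numerically. A sketch; the function name `welch_df` and the data are my own, and rounding the result down before using a t table is a common convention rather than something stated in the text.

```python
from statistics import variance


def welch_df(xs, ys):
    """Approximate degrees of freedom from formula (10.7)."""
    a = variance(xs) / len(xs)  # s_1^2 / n_1
    b = variance(ys) / len(ys)  # s_2^2 / n_2
    return (a + b) ** 2 / (a ** 2 / (len(xs) - 1) + b ** 2 / (len(ys) - 1))


# Made-up samples with visibly different spreads.
xs = [20.1, 19.8, 20.5, 20.3, 19.9]
ys = [18.0, 21.5, 16.9, 22.4, 19.2, 20.8]
df = welch_df(xs, ys)
print(f"d.f. = {df:.2f}")
```

The value is generally not an integer. It lies between \(\min(n_1, n_2)-1\) and \(n_1+n_2-2\), and reaches the upper bound when the two sample sizes and the two sample variances are equal.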
Confidence Intervals for Variances
- By Theorem 5.3.1 \[ \frac{(n-1)S^2}{\sigma^2}\sim \chi^2(n-1) \]
- \[ P\left(\chi^2_{1-\alpha/2}(n-1) \leq \frac{(n-1)S^2}{\sigma^2} \leq \chi^2_{\alpha/2}(n-1)\right)=1-\alpha \]
- \[ \begin{array}{ll} & \chi^2_{1-\alpha/2}(n-1) \leq \frac{(n-1)S^2}{\sigma^2} \leq \chi^2_{\alpha/2}(n-1) \\ \Leftrightarrow & \frac{1}{\chi^2_{\alpha/2}(n-1)} \leq \frac{\sigma^2}{(n-1)S^2} \leq \frac{1}{\chi^2_{1-\alpha/2}(n-1)} \\ \Leftrightarrow & \frac{(n-1)S^2}{\chi^2_{\alpha/2}(n-1)} \leq \sigma^2 \leq \frac{(n-1)S^2}{\chi^2_{1-\alpha/2}(n-1)} \end{array} \]
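A sketch of the variance interval. The two chi-square critical values are passed in as if read from a table, and the larger one goes under the lower endpoint; the data and the name `variance_interval` are illustrative.

```python
from statistics import variance


def variance_interval(sample, chi2_upper, chi2_lower):
    """Interval for sigma^2.

    chi2_upper = chi^2_{alpha/2}(n-1), chi2_lower = chi^2_{1-alpha/2}(n-1).
    """
    n = len(sample)
    scaled = (n - 1) * variance(sample)  # (n-1) S^2
    return scaled / chi2_upper, scaled / chi2_lower


# Made-up data with n = 10, so d.f. = 9.
# Table values for 95% confidence: chi^2_{0.025}(9) = 19.023, chi^2_{0.975}(9) = 2.700.
data = [4.8, 5.1, 5.3, 4.9, 5.0, 5.4, 4.7, 5.2, 5.0, 5.1]
low, high = variance_interval(data, chi2_upper=19.023, chi2_lower=2.700)
print(f"[{low:.4f}, {high:.4f}]")
```

Unlike the intervals for means, this interval is not symmetric around \(S^2\), because the chi-square distribution is skewed.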
Confidence Intervals for the Quotient of Two Variances
- \[ \begin{array}{cl} \text{By Theorem 5.3.1} & \frac{(m-1)S_Y^2}{\sigma_Y^2}\sim \chi^2_{m-1}, \frac{(n-1)S_X^2}{\sigma_X^2}\sim \chi^2_{n-1} \\ \stackrel{\text{p.225, line 1}}{\Rightarrow} & \frac{\frac{S_Y^2}{\sigma_Y^2}}{\frac{S_X^2}{\sigma_X^2}}=\frac{\left[\frac{(m-1)S_Y^2}{\sigma_Y^2}\right]/(m-1)}{\left[\frac{(n-1)S_X^2}{\sigma_X^2}\right]/(n-1)}\sim \text{F}(m-1, n-1) \end{array} \]
- \[ P\left(\text{F}_{1-\alpha/2}(m-1, n-1) \leq \frac{\frac{S_Y^2}{\sigma_Y^2}}{\frac{S_X^2}{\sigma_X^2}} \leq \text{F}_{\alpha/2}(m-1, n-1)\right)=1-\alpha \]
- In most books, the table of the \(\text{F}\) distribution does not list values for \(1-\alpha/2\), so we need to rewrite \(\text{F}_{1-\alpha/2}(m-1, n-1)\) slightly. Suppose \(W\sim \text{F}(m-1, n-1)\); then \[ \begin{array}{cl} & P(W\geq \text{F}_{1-\alpha/2}(m-1, n-1))=1-\alpha/2 \\ \Rightarrow & P\left(\frac{1}{W}\leq \frac{1}{\text{F}_{1-\alpha/2}(m-1, n-1)}\right)=1-\alpha/2 \\ \Rightarrow & 1-P\left(\frac{1}{W}\leq \frac{1}{\text{F}_{1-\alpha/2}(m-1, n-1)}\right)=\alpha/2 \\ \Rightarrow & P\left(\frac{1}{W}\geq \frac{1}{\text{F}_{1-\alpha/2}(m-1, n-1)}\right)=\alpha/2 \\ \stackrel{\text{Theorem 5.3.8, }\frac{1}{W}\sim \text{F}(n-1, m-1)}{\Rightarrow} & \frac{1}{\text{F}_{1-\alpha/2}(m-1, n-1)}=\text{F}_{\alpha/2}(n-1, m-1) \\ \Rightarrow & \text{F}_{1-\alpha/2}(m-1, n-1)=\frac{1}{\text{F}_{\alpha/2}(n-1, m-1)} \end{array} \] So the original inequality can be rewritten as \[ \begin{array}{cl} & \frac{1}{\text{F}_{\alpha/2}(n-1, m-1)} \leq \frac{\frac{S_Y^2}{\sigma_Y^2}}{\frac{S_X^2}{\sigma_X^2}} \leq \text{F}_{\alpha/2}(m-1, n-1) \\ \Leftrightarrow & \frac{1}{\text{F}_{\alpha/2}(n-1, m-1)}\frac{S_X^2}{S_Y^2} \leq \frac{\sigma_X^2}{\sigma_Y^2} \leq \text{F}_{\alpha/2}(m-1, n-1)\frac{S_X^2}{S_Y^2} \end{array} \]
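A sketch of the final interval. Only upper-tail critical values are needed, and they are passed in as if read from an \(\text{F}\) table; the samples and the name `variance_ratio_interval` are made up for illustration.

```python
from statistics import variance


def variance_ratio_interval(xs, ys, f_nm, f_mn):
    """Interval for sigma_X^2 / sigma_Y^2.

    f_nm = F_{alpha/2}(n-1, m-1) and f_mn = F_{alpha/2}(m-1, n-1),
    where n = len(xs) and m = len(ys).
    """
    ratio = variance(xs) / variance(ys)  # S_X^2 / S_Y^2
    return ratio / f_nm, ratio * f_mn


# Made-up samples with n = m = 10, so both critical values are F_{0.025}(9, 9) = 4.03.
xs = [4.8, 5.1, 5.3, 4.9, 5.0, 5.4, 4.7, 5.2, 5.0, 5.1]
ys = [5.6, 4.2, 5.9, 4.4, 5.0, 6.1, 3.9, 5.3, 4.8, 5.2]
low, high = variance_ratio_interval(xs, ys, f_nm=4.03, f_mn=4.03)
print(f"[{low:.4f}, {high:.4f}]")
```

When \(n\neq m\) the two critical values differ, so the order of the degrees of freedom matters when reading the table.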
Confidence Intervals for Proportions
- \[ \begin{array}{cl} & Y\sim \text{binomial}(n, p) \\ \Rightarrow & Y=\sum_{i=1}^{n}X_i, \text{ where }X_i\sim \text{Bernoulli}(p) \\ \stackrel{\text{Central Limit Theorem}}{\Rightarrow} & \frac{\frac{Y}{n}-p}{\sqrt{\frac{p(1-p)}{n}}}=\frac{\bar{X}-p}{\sqrt{\frac{p(1-p)}{n}}} \sim \text{normal}(0, 1) \end{array} \]
- \[ P\left(-z_{\alpha/2}\leq \frac{\frac{Y}{n}-p}{\sqrt{\frac{p(1-p)}{n}}} \leq z_{\alpha/2}\right)=1-\alpha \]
- \[ \begin{array}{cl} & -z_{\alpha/2}\leq \frac{\frac{Y}{n}-p}{\sqrt{\frac{p(1-p)}{n}}} \leq z_{\alpha/2} \\ \Leftrightarrow & -z_{\alpha/2}\leq \frac{p-\frac{Y}{n}}{\sqrt{\frac{p(1-p)}{n}}} \leq z_{\alpha/2} \\ \Leftrightarrow & -z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \leq p-\frac{Y}{n} \leq z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \\ \Leftrightarrow & \frac{Y}{n}-z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \leq p \leq \frac{Y}{n}+z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \\ \stackrel{p\approx \frac{Y}{n}}{\Rightarrow} & \frac{Y}{n}-z_{\alpha/2}\sqrt{\frac{\frac{Y}{n}\left(1-\frac{Y}{n}\right)}{n}} \leq p \leq \frac{Y}{n}+z_{\alpha/2}\sqrt{\frac{\frac{Y}{n}\left(1-\frac{Y}{n}\right)}{n}} \end{array} \]
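The last line of the derivation, with \(Y/n\) substituted for \(p\) under the square root, can be sketched as follows. The interval is approximate because it rests on the Central Limit Theorem; the counts and the name `proportion_interval` are illustrative.

```python
import math
from statistics import NormalDist


def proportion_interval(y, n, confidence=0.95):
    """Approximate interval for p from y successes in n trials (large-n normal approximation)."""
    p_hat = y / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical survey: 56 successes out of 100.
low, high = proportion_interval(y=56, n=100)
print(f"[{low:.4f}, {high:.4f}]")
```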
Confidence Intervals for the Difference of Two Proportions
- \[ \begin{array}{cl} & Y_1\sim \text{binomial}(n_1, p_1), Y_2\sim \text{binomial}(n_2, p_2) \\ \stackrel{\text{Central Limit Theorem}}{\Rightarrow} & \frac{\frac{Y_1}{n_1}-p_1}{\sqrt{\frac{p_1(1-p_1)}{n_1}}} \sim \text{normal}(0, 1), \frac{\frac{Y_2}{n_2}-p_2}{\sqrt{\frac{p_2(1-p_2)}{n_2}}} \sim \text{normal}(0, 1) \\ \Rightarrow & \frac{Y_1}{n_1}\sim \text{normal}(p_1, \frac{p_1(1-p_1)}{n_1}), \frac{Y_2}{n_2}\sim \text{normal}(p_2, \frac{p_2(1-p_2)}{n_2}) \\ \stackrel{\text{Theorem 4.2.14, Theorem 2.3.4}}{\Rightarrow} & \frac{Y_1}{n_1}-\frac{Y_2}{n_2}\sim \text{normal}(p_1-p_2, \frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}) \\ \stackrel{\text{p.102, line -1}}{\Rightarrow} & \frac{\left(\frac{Y_1}{n_1}-\frac{Y_2}{n_2}\right)-(p_1-p_2)}{\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}} \sim \text{normal}(0, 1) \end{array} \]
- \[ P\left(-z_{\alpha/2}\leq \frac{\left(\frac{Y_1}{n_1}-\frac{Y_2}{n_2}\right)-(p_1-p_2)}{\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}} \leq z_{\alpha/2}\right)=1-\alpha \]
- \[ \begin{array}{cl} & -z_{\alpha/2}\leq \frac{\left(\frac{Y_1}{n_1}-\frac{Y_2}{n_2}\right)-(p_1-p_2)}{\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}} \leq z_{\alpha/2} \\ \Leftrightarrow & -z_{\alpha/2}\leq \frac{(p_1-p_2)-\left(\frac{Y_1}{n_1}-\frac{Y_2}{n_2}\right)}{\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}} \leq z_{\alpha/2} \\ \Leftrightarrow & -z_{\alpha/2}\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}} \leq (p_1-p_2)-\left(\frac{Y_1}{n_1}-\frac{Y_2}{n_2}\right) \leq z_{\alpha/2}\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}} \\ \Leftrightarrow & \left(\frac{Y_1}{n_1}-\frac{Y_2}{n_2}\right)-z_{\alpha/2}\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}} \leq p_1-p_2 \leq \left(\frac{Y_1}{n_1}-\frac{Y_2}{n_2}\right)+z_{\alpha/2}\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}} \end{array} \]
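As in the single-proportion case, the unknown \(p_1\) and \(p_2\) under the square root are replaced by \(Y_1/n_1\) and \(Y_2/n_2\) in practice, which gives formula (10.13) in the summary table. A sketch with made-up counts; `two_proportion_interval` is a hypothetical name.

```python
import math
from statistics import NormalDist


def two_proportion_interval(y1, n1, y2, n2, confidence=0.95):
    """Approximate interval for p_1 - p_2 (large-sample normal approximation)."""
    p1, p2 = y1 / n1, y2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Hypothetical counts: 60 of 100 in group 1 and 45 of 100 in group 2.
low, high = two_proportion_interval(y1=60, n1=100, y2=45, n2=100)
print(f"[{low:.4f}, {high:.4f}]")
```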