Simple Linear Regression

Formulas

The page numbers refer to Anderson's Statistics for Business and Economics.

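$\hat{y} = b_0 + b_1 x$, p.656, (14.3).
$b_1 = \dfrac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2}$, p.660, (14.7).
$b_0 = \bar{y} - b_1\bar{x}$, p.660, (14.6). Note that this formula is very similar to $\hat{y} = b_0 + b_1 x$.
$SSE = \sum_{i=1}^{n}(y_i-\hat{y}_i)^2$, p.668, (14.8).
$SSR = \sum_{i=1}^{n}(\hat{y}_i-\bar{y})^2$, p.669, (14.10).
$SST = \sum_{i=1}^{n}(y_i-\bar{y})^2$, p.669, (14.9).
$SSE + SSR = SST$, p.679, (14.11).
$\hat{\sigma} = \sqrt{\dfrac{SSE}{n-2}}$, p.677, (14.16).
$\hat{\sigma}_{b_1} = \dfrac{\hat{\sigma}}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}$, p.678, (14.18).
$t = \dfrac{b_1-\beta_1}{\hat{\sigma}_{b_1}} \overset{H_0:\,\beta_1=0}{=} \dfrac{b_1}{\hat{\sigma}_{b_1}}$, p.679, (14.19).
$r^2 = \dfrac{SSR}{SST}$, p.671, (14.12).
$f = \dfrac{SSR/1}{SSE/(n-2)}$, p.680, (14.21).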
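$s_x = \sqrt{\dfrac{\sum (x_i-\bar{x})^2}{n-1}}$, (3.8), (3.9).
$s_y = \sqrt{\dfrac{\sum (y_i-\bar{y})^2}{n-1}}$, (3.8), (3.9).
$s_{xy} = \dfrac{\sum (x_i-\bar{x})(y_i-\bar{y})}{n-1}$, (3.13).
$r_{xy} = \dfrac{s_{xy}}{s_x s_y}$, (3.15).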
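
As a rough illustration, the regression formulas above can be evaluated directly in R. This is a minimal sketch on made-up data (the vectors `x` and `y` are not from Anderson's book), and the variable names are chosen here only for readability.

```r
# Made-up illustrative data (an assumption, not from the source)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
n <- length(x)

# Slope (14.7), intercept (14.6), fitted values (14.3)
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)
y_hat <- b0 + b1 * x

# Sums of squares (14.8), (14.10), (14.9)
SSE <- sum((y - y_hat)^2)
SSR <- sum((y_hat - mean(y))^2)
SST <- sum((y - mean(y))^2)

# Standard errors (14.16), (14.18) and test statistics (14.19), (14.12), (14.21)
sigma_hat <- sqrt(SSE / (n - 2))
sigma_b1 <- sigma_hat / sqrt(sum((x - mean(x))^2))
t_stat <- b1 / sigma_b1
r2 <- SSR / SST
f_stat <- (SSR / 1) / (SSE / (n - 2))

round(c(b0 = b0, b1 = b1, SSE = SSE, SSR = SSR, SST = SST), 4)
round(c(sigma_hat = sigma_hat, sigma_b1 = sigma_b1,
        t = t_stat, r2 = r2, f = f_stat), 4)
```

For this data set the sketch gives $b_1 = 0.6$ and $b_0 = 2.2$, and it satisfies $SSE + SSR = SST$ as well as $f = t^2$.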

Relationships among SSE, SSR, SST and $s_x$, $s_y$, $s_{xy}$

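$\hat{\sigma}$ and $\hat{\sigma}_{b_1}$ are not listed. $SSE = (n-1)\dfrac{s_x^2 s_y^2 - s_{xy}^2}{s_x^2}$ is not listed either, but it is not needed.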
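
A minimal numerical check of some standard identities linking these quantities, all of which follow from the definitions of SSE, SSR, SST, $s_x$, $s_y$ and $s_{xy}$. The data are made up, and the particular identities checked are a selection made here, assuming the usual $n-1$ denominators used by R's `sd`, `cov` and `cor`.

```r
# Made-up illustrative data (an assumption, not from the source)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
n <- length(x)

# Sample statistics (all use the n - 1 denominator)
sx <- sd(x)
sy <- sd(y)
sxy <- cov(x, y)
rxy <- cor(x, y)

# Sums of squares taken from a fitted model
fit <- lm(y ~ x)
SSE <- sum(residuals(fit)^2)
SSR <- sum((fitted(fit) - mean(y))^2)
SST <- sum((y - mean(y))^2)

# Each comparison should return TRUE
all.equal(SST, (n - 1) * sy^2)
all.equal(SSR, (n - 1) * sxy^2 / sx^2)
all.equal(SSE, (n - 1) * (sx^2 * sy^2 - sxy^2) / sx^2)
all.equal(unname(coef(fit)["x"]), sxy / sx^2)  # b1 = s_xy / s_x^2
all.equal(SSR / SST, rxy^2)                    # r^2 = r_xy^2
```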

Simple Linear Regression in Statistics and Least Squares Line in Linear Algebra

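Although the formulas for the regression line in statistics and for the least squares line in linear algebra look different, we prove here that they are in fact the same.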

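$$A = \begin{bmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix}$$

$$A^tA = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ 1 & 1 & \cdots & 1 \end{bmatrix} \begin{bmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix} = \begin{bmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & n \end{bmatrix}$$

$$(A^tA)^{-1} = \frac{1}{n\sum x_i^2 - (\sum x_i)^2} \begin{bmatrix} n & -\sum x_i \\ -\sum x_i & \sum x_i^2 \end{bmatrix}$$

$$\begin{aligned}
(A^tA)^{-1}A^t &= \frac{1}{n\sum x_i^2 - (\sum x_i)^2} \begin{bmatrix} n & -\sum x_i \\ -\sum x_i & \sum x_i^2 \end{bmatrix} \begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ 1 & 1 & \cdots & 1 \end{bmatrix} \\
&= \frac{1}{n\sum x_i^2 - (\sum x_i)^2} \begin{bmatrix} nx_1 - \sum x_i & nx_2 - \sum x_i & \cdots & nx_n - \sum x_i \\ -x_1\sum x_i + \sum x_i^2 & -x_2\sum x_i + \sum x_i^2 & \cdots & -x_n\sum x_i + \sum x_i^2 \end{bmatrix} \\
&= \frac{1}{n\sum x_i^2 - n^2\bar{x}^2} \begin{bmatrix} nx_1 - n\bar{x} & nx_2 - n\bar{x} & \cdots & nx_n - n\bar{x} \\ -x_1 n\bar{x} + n\frac{\sum x_i^2}{n} & -x_2 n\bar{x} + n\frac{\sum x_i^2}{n} & \cdots & -x_n n\bar{x} + n\frac{\sum x_i^2}{n} \end{bmatrix} \\
&= \frac{1}{\sum x_i^2 - n\bar{x}^2} \begin{bmatrix} x_1 - \bar{x} & x_2 - \bar{x} & \cdots & x_n - \bar{x} \\ -x_1\bar{x} + \frac{\sum x_i^2}{n} & -x_2\bar{x} + \frac{\sum x_i^2}{n} & \cdots & -x_n\bar{x} + \frac{\sum x_i^2}{n} \end{bmatrix}
\end{aligned}$$

$$\begin{aligned}
(A^tA)^{-1}A^t y &= \frac{1}{\sum x_i^2 - n\bar{x}^2} \begin{bmatrix} x_1 - \bar{x} & x_2 - \bar{x} & \cdots & x_n - \bar{x} \\ -x_1\bar{x} + \frac{\sum x_i^2}{n} & -x_2\bar{x} + \frac{\sum x_i^2}{n} & \cdots & -x_n\bar{x} + \frac{\sum x_i^2}{n} \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \\
&= \frac{1}{\sum x_i^2 - n\bar{x}^2} \begin{bmatrix} \sum (x_i - \bar{x})\,y_i \\ \sum \left(-x_i\bar{x} + \frac{\sum_j x_j^2}{n}\right) y_i \end{bmatrix}
\end{aligned}$$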

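Consider the first component. Note that
$$\sum (x_i-\bar{x})^2 = \sum x_i^2 - 2\bar{x}\sum x_i + n\bar{x}^2 = \sum x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 = \sum x_i^2 - n\bar{x}^2$$
and, since $\sum (x_i-\bar{x}) = 0$,
$$\sum (x_i-\bar{x})(y_i-\bar{y}) = \sum (x_i-\bar{x})\,y_i - \bar{y}\sum (x_i-\bar{x}) = \sum (x_i-\bar{x})\,y_i.$$
So the first component is
$$b_1 = \frac{\sum (x_i-\bar{x})\,y_i}{\sum x_i^2 - n\bar{x}^2} = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sum (x_i-\bar{x})^2}.$$
The second component is
$$\frac{\sum \left(-x_i\bar{x} + \frac{\sum_j x_j^2}{n}\right) y_i}{\sum x_i^2 - n\bar{x}^2} = \frac{\bar{y}\sum x_i^2 - \bar{x}\sum x_i y_i}{\sum x_i^2 - n\bar{x}^2}.$$
We verify that $b_0$ equals the second component:
$$\begin{aligned}
b_0 = \bar{y} - b_1\bar{x} &= \bar{y} - \frac{\sum (x_i-\bar{x})\,y_i}{\sum x_i^2 - n\bar{x}^2}\,\bar{x} \\
&= \frac{\bar{y}\left(\sum x_i^2 - n\bar{x}^2\right) - \bar{x}\sum (x_i-\bar{x})\,y_i}{\sum x_i^2 - n\bar{x}^2} \\
&= \frac{\bar{y}\sum x_i^2 - n\bar{x}^2\bar{y} - \bar{x}\sum x_i y_i + \bar{x}^2 n\bar{y}}{\sum x_i^2 - n\bar{x}^2} \\
&= \frac{\bar{y}\sum x_i^2 - \bar{x}\sum x_i y_i}{\sum x_i^2 - n\bar{x}^2}.
\end{aligned}$$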
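
The equivalence can also be checked numerically. The following is a small sketch on made-up data: it builds the matrix $A$ with columns $x_i$ and $1$, evaluates $(A^tA)^{-1}A^ty$ with base R matrix operations, and compares the result with the statistical formulas for $b_1$ and $b_0$.

```r
# Made-up illustrative data (an assumption, not from the source)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

# Linear algebra: A has columns (x_i, 1), so the solution is ordered (b1, b0)
A <- cbind(x, 1)
beta <- solve(t(A) %*% A) %*% t(A) %*% y

# Statistics: slope and intercept from the regression formulas
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

# Should return TRUE
all.equal(as.numeric(beta), c(b1, b0))
```

Explicitly inverting $A^tA$ mirrors the derivation; for real data, `lm` or `qr.solve` is usually the numerically safer choice.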

Simple Linear Regression in R---Syntax

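x = c(x1, x2, ...)   # predictor values (replace with your data)
y = c(y1, y2, ...)   # response values, same length as x
model = lm(y ~ x)    # fit y = b0 + b1*x by least squares
summary(model)       # print coefficients, standard errors, r^2 and F-statistic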

Formula syntax for more models is given in the linked reference; a backup copy follows.

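Syntax        Model
y~x           $y = \beta_0 + \beta_1 x$
y~x+I(x^2)    $y = \beta_0 + \beta_1 x + \beta_2 x^2$
y~x1+x2       $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
y~x1*x2       $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$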
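
One way to see which terms a formula expands to is to inspect the column names of its design matrix with `model.matrix`. This is a small sketch; the vectors `x1` and `x2` are made-up values used only to give the formulas something to act on.

```r
# Made-up predictor values (an assumption, not from the source)
x1 <- c(1, 2, 3, 4, 5, 6)
x2 <- c(2, 1, 4, 3, 6, 5)

# Each column of the design matrix corresponds to one beta in the model
cols_linear    <- colnames(model.matrix(~ x1))
cols_quadratic <- colnames(model.matrix(~ x1 + I(x1^2)))
cols_additive  <- colnames(model.matrix(~ x1 + x2))
cols_interact  <- colnames(model.matrix(~ x1 * x2))

cols_linear      # "(Intercept)" "x1"
cols_quadratic   # "(Intercept)" "x1" "I(x1^2)"
cols_additive    # "(Intercept)" "x1" "x2"
cols_interact    # "(Intercept)" "x1" "x2" "x1:x2"
```

The `x1:x2` column is the product term $x_1 x_2$, which is why `x1*x2` yields four coefficients.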

Simple Linear Regression in R---Results

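Coefficients:
             Estimate    Std. Error    t value    Pr(>|t|)
(Intercept)  $b_0 = \bar{y} - b_1\bar{x}$    ***
x            $b_1 = \dfrac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2}$    $\hat{\sigma}_{b_1} = \dfrac{\hat{\sigma}}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}$    $t = \dfrac{b_1}{\hat{\sigma}_{b_1}}$    p-value associated with $t$ $= 2(1 - T(|t|))$    ***

---
Signif. codes: legend explaining the *** above
Residual standard error: $\hat{\sigma} = \sqrt{\dfrac{SSE}{n-2}}$ on $n-2$ degrees of freedom
Multiple R-squared: $r^2 = \dfrac{SSR}{SST}$, Adjusted R-squared:
F-statistic: $f = \dfrac{SSR/1}{SSE/(n-2)}$, p-value: p-value associated with $f$ $= 1 - F(f)$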

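The p-value associated with $t$ is $2(1 - T(|t|))$, where $T$ is the cumulative distribution function of $t(n-2)$, the t distribution with the same $n-2$ degrees of freedom as the residual standard error.

The p-value associated with $f$ is $1 - F(f)$, where $F$ is the cumulative distribution function of $F(1, n-2)$.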
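
Both p-values can be reproduced in R with the built-in distribution functions `pt` and `pf`. The sketch below uses made-up data and takes $n-2$ degrees of freedom for the t distribution, matching the residual degrees of freedom that `summary` reports for simple linear regression.

```r
# Made-up illustrative data (an assumption, not from the source)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
n <- length(x)

s <- summary(lm(y ~ x))

# Test statistics as reported by summary()
t_stat <- s$coefficients["x", "t value"]
f_stat <- unname(s$fstatistic["value"])

# p-values from the CDFs of t(n - 2) and F(1, n - 2)
p_t <- 2 * (1 - pt(abs(t_stat), df = n - 2))
p_f <- 1 - pf(f_stat, df1 = 1, df2 = n - 2)

# Both comparisons should return TRUE
all.equal(p_t, s$coefficients["x", "Pr(>|t|)"])
all.equal(p_f, p_t)
```

In simple linear regression $f = t^2$, so the two p-values coincide.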

Computing F distribution and T distribution

These values can be computed in Wolfram Alpha by entering the following commands.

CDF[FRatioDistribution[n, m], x]
CDF[StudentTDistribution[n], x]
