reg api00 ell meals mobility cname [pweight = pw]
## (sum of wgt is 6.1940e+03)
##
## Linear regression                               Number of obs     =        200
##                                                 F(4, 195)         =     104.68
##                                                 Prob > F          =     0.0000
##                                                 R-squared         =     0.6601
##                                                 Root MSE          =     72.589
##
## ------------------------------------------------------------------------------
##              |               Robust
##        api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
##          ell |   -.513901   .4063997    -1.26   0.208    -1.315404     .287602
##        meals |  -3.148314   .2925792   -10.76   0.000     -3.72534   -2.571288
##     mobility |   .2346743   .4047053     0.58   0.563    -.5634871    1.032836
##              |
##        cname |
##       Group2 |  -9.708186   19.92028    -0.49   0.627    -48.99504    29.57867
##        _cons |   830.4303   21.18687    39.20   0.000     788.6455    872.2152
## ------------------------------------------------------------------------------
##
## . estout . using mod1.txt, cells("b se t p") stats(N) replace
## (note: file mod1.txt not found)
## (output written to mod1.txt)
##
## . estimates store t1
R vs. STATA
There are differences between pweight and aweight in STATA, and weights in R (for instance in the glm function). To summarize:
- In STATA, pweight (probability weights) is equivalent to aweight (analytic weights) with robust standard errors.
- aweight is equivalent to weights in glm.
- pweight is equivalent to weights in glm with robust standard errors.

Here I provide a numerical example showing that robust standard errors in R reproduce the results obtained in STATA with pweight.
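To make the distinction concrete, the two variance estimators being compared can be written out. This is a sketch using the standard textbook forms for weighted least squares, not formulas quoted from the STATA or R documentation. The notation is my own assumption: $X$ is the design matrix with rows $x_i^\top$, $W = \mathrm{diag}(w_i)$ holds the weights, $e_i$ are the raw residuals, $n$ is the number of observations, and $k$ is the number of coefficients.

$$
\hat\beta = (X^\top W X)^{-1} X^\top W y, \qquad e_i = y_i - x_i^\top \hat\beta
$$

$$
\widehat{V}_{\text{model}} = \hat\sigma^2 \,(X^\top W X)^{-1}, \qquad \hat\sigma^2 = \frac{\sum_i w_i e_i^2}{n - k}
$$

$$
\widehat{V}_{\text{HC1}} = \frac{n}{n - k}\,(X^\top W X)^{-1} \Big( \sum_i w_i^2 e_i^2 \, x_i x_i^\top \Big) (X^\top W X)^{-1}
$$

Both estimators share the same point estimates $\hat\beta$, so only the standard errors differ. The first form is the model-based variance reported by default for a weighted linear model; the second is the HC1 sandwich form with its $n/(n-k)$ small-sample correction.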
A numerical example
We will use the same data as here, so that we can match the results obtained in R with those obtained in STATA.
The dataset df contains a column, pw, which holds our weights. The aim is to show that by calculating robust standard errors in R, we obtain the same results as STATA gives with pweight (or with aweight and robust standard errors).
The STATA results are the regression output shown at the start of this document.
We will now fit a simple weighted linear model.
# Weighted linear model (Gaussian GLM) using pw as weights
mod <- glm(
  api00 ~ ell + meals + mobility + cname,
  data = df,
  weights = pw
)
As you can see in Table 1, the standard errors differ.
Table 1: Standard errors

term | R | STATA |
---|---|---|
ell | 0.3721 | 0.4064 |
meals | 0.2701 | 0.2926 |
mobility | 0.4629 | 0.4047 |
cnameGroup2 | 16.8738 | 19.9203 |
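The R column in Table 1 holds the model's default (model-based) standard errors. Since df is not reproduced here, the following sketch uses simulated data; the names sim, x, y and the simulated pw column are hypothetical stand-ins, not the article's data. It shows one way to extract those standard errors, and illustrates that they stay the same when every weight is multiplied by a constant, which is consistent with glm treating weights much as STATA treats aweight.

```r
# Simulated stand-in for the article's data (all names here are hypothetical).
set.seed(42)
n <- 100
sim <- data.frame(x = rnorm(n), pw = runif(n, 0.5, 3))
sim$y <- 1 + 0.5 * sim$x + rnorm(n)

# Weighted Gaussian GLM, analogous to the model fitted in the article.
fit <- glm(y ~ x, data = sim, weights = pw)

# Model-based standard errors: two equivalent ways to extract them.
se_summary <- summary(fit)$coefficients[, "Std. Error"]
se_vcov <- sqrt(diag(vcov(fit)))

# Multiplying every weight by the same constant leaves these
# standard errors unchanged.
fit_scaled <- glm(y ~ x, data = sim, weights = 10 * pw)
se_scaled <- sqrt(diag(vcov(fit_scaled)))

print(cbind(se_summary, se_vcov, se_scaled))
```

The three printed columns should agree up to floating-point error.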
We will now estimate robust standard errors using the sandwich R package.
# Coefficient tests using HC1 heteroskedasticity-robust standard errors
robust_se <- lmtest::coeftest(
  mod,
  vcov. = sandwich::vcovHC(mod, type = "HC1")
)
Table 2: Standard errors

term | R | STATA |
---|---|---|
ell | 0.4064 | 0.4064 |
meals | 0.2926 | 0.2926 |
mobility | 0.4047 | 0.4047 |
cnameGroup2 | 19.9203 | 19.9203 |
As you can see in Table 2, now the standard errors match.
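For readers who want to see what the HC1 correction does, here is a minimal base R sketch that computes HC1 standard errors by hand for a weighted linear model. It is an illustration under stated assumptions, not the sandwich package's actual code: the helper hc1_se and the simulated data (sim, x, y, pw) are hypothetical, and the formula is the usual weighted sandwich form with an n / (n - k) correction, which I believe is what type = "HC1" corresponds to here.

```r
# Hypothetical helper: HC1 standard errors for a weighted linear model.
hc1_se <- function(fit) {
  X <- model.matrix(fit)
  e <- residuals(fit, type = "response")  # raw residuals, y - fitted
  w <- weights(fit)                       # prior weights (NULL if unweighted)
  if (is.null(w)) w <- rep(1, nrow(X))
  n <- nrow(X)
  k <- ncol(X)
  bread <- solve(crossprod(X, w * X))     # (X'WX)^{-1}
  meat <- crossprod(X * (w * e))          # sum_i w_i^2 e_i^2 x_i x_i'
  sqrt(diag(n / (n - k) * bread %*% meat %*% bread))
}

# Simulated heteroskedastic data (not the article's dataset).
set.seed(1)
n <- 200
sim <- data.frame(x = rnorm(n), pw = runif(n, 1, 5))
sim$y <- 1 + 2 * sim$x + rnorm(n, sd = 1 + abs(sim$x))

fit <- lm(y ~ x, data = sim, weights = pw)
se_model <- sqrt(diag(vcov(fit)))  # default, model-based
se_robust <- hc1_se(fit)           # HC1, computed by hand

print(cbind(model_based = se_model, hc1 = se_robust))
```

If the sandwich package is installed, comparing se_robust with sqrt(diag(sandwich::vcovHC(fit, type = "HC1"))) is a quick way to check the sketch against the package.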