This function is intended for advanced users. It runs any fixest estimation from a fixest environment obtained by setting only.env = TRUE in a fixest estimation.
Arguments
- env
An environment obtained from a fixest estimation with only.env = TRUE. This is intended for advanced users so there is no error handling: any other kind of input will fail with a poor error message.
- y
A vector representing the dependent variable. Should be of the same length as the number of observations in the initial estimation.
- X
A matrix representing the independent variables. Should be of the same dimension as in the initial estimation.
- weights
A vector of weights (i.e. with only positive values). Should be of the same length as the number of observations in the initial estimation. If identical to the scalar 1, no weights are used in the estimation.
- endo
A matrix representing the endogenous regressors in IV estimations. It should be of the same dimension as the original endogenous regressors.
- inst
A matrix representing the instruments in IV estimations. It should be of the same dimension as the original instruments.
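As a minimal sketch of how these arguments can be used, the snippet below re-estimates a model after swapping in a new dependent variable through y. The mtcars model and the names coef_base, y_new and coef_new are illustrative assumptions, not part of the original page.

```r
library(fixest)

# Core estimation: with only.env = TRUE the call returns the
# estimation environment instead of the fitted model
core_env = feols(mpg ~ wt + hp, mtcars, only.env = TRUE, only.coef = TRUE)

# Re-estimate from the environment, nothing changed
coef_base = est_env(env = core_env)

# Hypothetical new dependent variable: it must have the same length
# as the number of observations in the initial estimation
set.seed(1)
y_new = mtcars$mpg + rnorm(nrow(mtcars))
coef_new = est_env(env = core_env, y = y_new)

# Side-by-side comparison of the two coefficient vectors
cbind(coef_base, coef_new)
```

Since no input checking is done, it is up to the caller to make sure that y_new has the right length before passing it.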
Value
It returns the results of a fixest estimation: the one that was requested when the environment was obtained.
Details
This function has been created for advanced users, mostly to avoid overheads when making simulations with fixest.
How can it help you make simulations? First make a core estimation with only.env = TRUE, and usually with only.coef = TRUE (to avoid having extra things that take time to compute).
Then loop while modifying the appropriate objects directly in the environment. Beware that if you make a mistake here (typically providing objects of the wrong length), you can crash the R session because there is no error handling any more!
Finally estimate with est_env(env = core_env) and store the results.
Instead of est_env, you could also use fixest estimation functions directly, like feols, since they accept the env argument. The function est_env is only here to add a bit of generality and to spare the user the trouble of writing conditions (look at the source, it's just a one-liner).
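The equivalence described above can be checked with a short sketch, assuming an OLS model on mtcars (the model and the names coef_a and coef_b are illustrative). Both calls re-run the estimation stored in the same environment.

```r
library(fixest)

# Environment of an OLS estimation, returning only the coefficients
core_env = feols(mpg ~ wt + hp, mtcars, only.env = TRUE, only.coef = TRUE)

# Generic entry point: works whatever the estimation method
coef_a = est_env(env = core_env)

# Direct call: valid here because the environment comes from feols
coef_b = feols(env = core_env)

# The two coefficient vectors should agree
all.equal(coef_a, coef_b)
```

Calling feols directly only makes sense when you know the environment was created by feols; est_env saves you from writing that condition yourself.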
Objects of main interest in the environment are:
- lhs
The left hand side, or dependent variable.
- linear.mat
The matrix of the right-hand-side, or explanatory variables.
- iv_lhs
The matrix of the endogenous variables in IV regressions.
- iv.mat
The matrix of the instruments in IV regressions.
- weights.value
The vector of weights.
I strongly discourage changing the dimension of any of these elements, or else a crash can occur.
However, you can change their values at will (provided the dimensions stay the same).
The only exception is the weights, which tolerate a change of dimension: they can be identical to the scalar 1 (meaning no weights), or to a vector whose length is the number of observations.
I also discourage changing anything in the fixed-effects (even their value) since this will almost surely lead to a crash.
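One way to follow the advice above is to read the current object first and check the dimension of its replacement before assigning it. The sketch below does this for lhs, assuming a single dependent variable so that lhs is a plain vector; the model and the names lhs_old and lhs_new are illustrative.

```r
library(fixest)

core_env = feols(mpg ~ wt + hp, mtcars, only.env = TRUE, only.coef = TRUE)

# Read the current dependent variable from the environment
lhs_old = get("lhs", core_env)

# Hypothetical replacement: same length, different values
set.seed(2)
lhs_new = lhs_old + rnorm(length(lhs_old))

# Manual guard, since est_env performs no checks of its own
stopifnot(length(lhs_new) == length(lhs_old))

# Assign directly in the environment, then re-estimate
assign("lhs", lhs_new, core_env)
coef_new = est_env(env = core_env)
coef_new
```

The explicit stopifnot costs almost nothing compared with a crashed session, so it can be worth keeping while a simulation is being developed and removing only once the code is stable.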
Note that this function is mostly useful when the overheads/estimation ratio is high. This means that OLS will benefit the most from this function. For GLM/Max.Lik. estimations, the ratio is small since the overheads are only a tiny portion of the total estimation time. Hence this function will be less useful for these models.
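To get a rough idea of the overheads on your own machine, you can time repeated full calls against repeated calls to est_env. This is only a sketch: the model, the number of repetitions n_rep and the name timings are assumptions, and the gap you observe will depend on the data and on the hardware.

```r
library(fixest)

n_rep = 200  # arbitrary number of repetitions

core_env = feols(mpg ~ wt + hp, mtcars, only.env = TRUE, only.coef = TRUE)

# Full call each time: formula parsing, data handling, then estimation
time_full = system.time(
  for(i in 1:n_rep) feols(mpg ~ wt + hp, mtcars, only.coef = TRUE)
)

# Estimation from the prepared environment only
time_env = system.time(
  for(i in 1:n_rep) est_env(env = core_env)
)

# Elapsed time in seconds for each approach
timings = c(full = time_full[["elapsed"]], env = time_env[["elapsed"]])
timings
```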
Examples
# Let's make a short simulation
# Inspired by Grant McDermott's bboot function
# See https://twitter.com/grant_mcdermott/status/1487528757418102787
# Simple function that computes a Bayesian bootstrap
bboot = function(x, n_sim = 100){
  # We bootstrap on the weights
  # Works with fixed-effects/IVs
  # and with any fixest function that accepts weights
  core_env = update(x, only.coef = TRUE, only.env = TRUE)
  n_obs = x$nobs
  res_all = vector("list", n_sim)
  for(i in 1:n_sim){
    ## begin: NOT RUN
    ## We could directly assign in the environment:
    # assign("weights.value", rexp(n_obs, rate = 1), core_env)
    # res_all[[i]] = est_env(env = core_env)
    ## end: NOT RUN
    ## Instead we can use the argument weights, which does the same
    res_all[[i]] = est_env(env = core_env, weights = rexp(n_obs, rate = 1))
  }
  do.call(rbind, res_all)
}
est = feols(mpg ~ wt + hp, mtcars)
boot_res = bboot(est)
coef = colMeans(boot_res)
std_err = apply(boot_res, 2, sd)
# Comparing the results with the main estimation
coeftable(est)
#>                Estimate Std. Error   t value     Pr(>|t|)
#> (Intercept) 37.22727012 1.59878754 23.284689 2.565459e-20
#> wt          -3.87783074 0.63273349 -6.128695 1.119647e-06
#> hp          -0.03177295 0.00902971 -3.518712 1.451229e-03
#> attr(,"type")
#> [1] "IID"
cbind(coef, std_err)
#>                    coef     std_err
#> (Intercept) 37.06416839 1.878346352
#> wt          -3.82426295 0.613391938
#> hp          -0.03262888 0.006218552