Estimates a fixest estimation from a fixest environment

This function is intended for advanced users. It estimates any fixest model from a fixest environment, i.e. an environment obtained by setting only.env = TRUE in a fixest estimation.

Usage

est_env(env, y, X, weights, endo, inst)

Arguments

env: An environment obtained from a fixest estimation with only.env = TRUE. This is intended for advanced users so there is no error handling: any other kind of input will fail with a poor error message.
y: A vector representing the dependent variable. Should be of the same length as the number of observations in the initial estimation.
X: A matrix representing the independent variables. Should be of the same dimension as in the initial estimation.
weights: A vector of weights (i.e. with only positive values). Should be of the same length as the number of observations in the initial estimation. If identical to the scalar 1, this will mean that no weights will be used in the estimation.
endo: A matrix representing the endogenous regressors in IV estimations. It should be of the same dimension as the original endogenous regressors.
inst: A matrix representing the instruments in IV estimations. It should be of the same dimension as the original instruments.

Value

It returns the results of a fixest estimation: the one that was requested when the environment was created.

Details

This function has been created for advanced users, mostly to avoid overheads when making simulations with fixest.

How can it help you run simulations? First, run a core estimation with only.env = TRUE, and usually with only.coef = TRUE (to skip extra computations that take time). Then loop, modifying the appropriate objects directly in the environment. Beware that if you make a mistake here (typically by supplying an object of the wrong length), you can crash the R session, because there is no error handling at this stage. Finally, estimate with est_env(env = core_env) and store the results.
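
The workflow described above can be sketched as follows. This is only an illustration under stated assumptions: the simulated data, the names base, y_new and coef_sim, and the data-generating process are all hypothetical. It uses the y argument of est_env rather than assigning into the environment directly.

```r
library(fixest)

set.seed(1)
n = 200
x = rnorm(n)
# Hypothetical data set, used only to create the core environment
base = data.frame(y = rnorm(n), x = x)

# Core estimation: returns the environment instead of the results
core_env = feols(y ~ x, base, only.env = TRUE, only.coef = TRUE)

n_sim = 50
res = vector("list", n_sim)
for(i in seq_len(n_sim)){
  # New dependent variable, of the SAME length as the original one
  y_new = 1 + 2 * x + rnorm(n)
  res[[i]] = est_env(env = core_env, y = y_new)
}

# One row per simulation, one column per coefficient
coef_sim = do.call(rbind, res)
colMeans(coef_sim)
```

Because only.coef = TRUE was used, each call returns only the coefficients, which is why the results can be stacked with rbind.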

Instead of est_env, you could also call the fixest estimation functions directly, like feols, since they accept the env argument. The function est_env only adds a bit of generality, sparing the user from writing the conditions needed to pick the right estimation function (look at the source, it's just a one-liner).
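
As a small sketch of the equivalence mentioned above, assuming the core estimation was made with feols, the two calls below are expected to return the same coefficients. The names b1 and b2 and the perturbation of mpg are hypothetical and only serve the illustration.

```r
library(fixest)

set.seed(3)
core_env = feols(mpg ~ wt + hp, mtcars, only.env = TRUE, only.coef = TRUE)

# Modify the dependent variable directly in the environment
# (same length as the original, otherwise a crash may occur)
assign("lhs", mtcars$mpg + rnorm(nrow(mtcars)), core_env)

# Generic entry point
b1 = est_env(env = core_env)

# Direct call to the estimation function that created the environment
b2 = feols(env = core_env)

all.equal(b1, b2)
```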

Objects of main interest in the environment are:

lhs: The left hand side, or dependent variable.
linear.mat: The matrix of the right-hand-side, or explanatory variables.
iv_lhs: The matrix of the endogenous variables in IV regressions.
iv.mat: The matrix of the instruments in IV regressions.
weights.value: The vector of weights.

I strongly discourage changing the dimension of any of these elements, since doing so can cause a crash. However, you can change their values at will (provided the dimensions stay the same). The only exception is the weights, whose dimension may change: it can be either the scalar 1 (meaning no weights) or a vector whose length equals the number of observations.
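
A minimal sketch of the weights exception, assuming a core estimation on mtcars (the names b_w and b_unw are hypothetical): the weights are first set to a vector with one value per observation, then reset to the scalar 1.

```r
library(fixest)

set.seed(4)
core_env = feols(mpg ~ wt + hp, mtcars, only.env = TRUE, only.coef = TRUE)

# Weighted estimation: one strictly positive weight per observation
b_w = est_env(env = core_env, weights = rexp(nrow(mtcars), rate = 1))

# Back to an unweighted estimation: the scalar 1 means "no weights"
b_unw = est_env(env = core_env, weights = 1)

rbind(weighted = b_w, unweighted = b_unw)
```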

I also discourage changing anything in the fixed-effects (even their value) since this will almost surely lead to a crash.

Note that this function is mostly useful when the ratio of overhead to estimation time is high. This means that OLS will benefit the most from this function. For GLM/Max.Lik. estimations, the ratio is small since the overhead is only a tiny portion of the total estimation time. Hence this function will be less useful for these models.
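
To gauge whether the overhead matters in your own setting, you can time both approaches. The sketch below is a rough illustration on hypothetical simulated data (the names base, t_full and t_env are assumptions); actual timings depend on the machine, the model and the sample size, so no figures are reported here.

```r
library(fixest)

set.seed(2)
n = 1000
base = data.frame(y = rnorm(n), x = rnorm(n))
n_sim = 200

# Full estimation at each iteration (pays the overhead every time)
t_full = system.time(
  for(i in seq_len(n_sim)){
    base$y = rnorm(n)
    feols(y ~ x, base)
  }
)

# Estimation from the environment (overhead paid only once)
core_env = feols(y ~ x, base, only.env = TRUE, only.coef = TRUE)
t_env = system.time(
  for(i in seq_len(n_sim)){
    est_env(env = core_env, y = rnorm(n))
  }
)

# Elapsed time of each approach, in seconds
c(full = t_full[["elapsed"]], env = t_env[["elapsed"]])
```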

Author

Laurent Berge

Examples


# Let's make a short simulation
# Inspired by Grant McDermott's bboot function
# See https://twitter.com/grant_mcdermott/status/1487528757418102787

# Simple function that computes a Bayesian bootstrap
bboot = function(x, n_sim = 100){
  # We bootstrap on the weights
  # Works with fixed-effects/IVs
  #  and with any fixest function that accepts weights

  core_env = update(x, only.coef = TRUE, only.env = TRUE)
  n_obs = x$nobs

  res_all = vector("list", n_sim)
  for(i in 1:n_sim){
    ## begin: NOT RUN
    ## We could directly assign in the environment:
    # assign("weights.value", rexp(n_obs, rate = 1), core_env)
    # res_all[[i]] = est_env(env = core_env)
    ##   end: NOT RUN

    ## Instead we can use the argument weights, which does the same
    res_all[[i]] = est_env(env = core_env, weights = rexp(n_obs, rate = 1))
  }

  do.call(rbind, res_all)
}


est = feols(mpg ~ wt + hp, mtcars)

boot_res = bboot(est)
coef = colMeans(boot_res)
std_err = apply(boot_res, 2, sd)

# Comparing the results with the main estimation
coeftable(est)
#>                Estimate Std. Error   t value     Pr(>|t|)
#> (Intercept) 37.22727012 1.59878754 23.284689 2.565459e-20
#> wt          -3.87783074 0.63273349 -6.128695 1.119647e-06
#> hp          -0.03177295 0.00902971 -3.518712 1.451229e-03
#> attr(,"type")
#> [1] "IID"
cbind(coef, std_err)
#>                    coef     std_err
#> (Intercept) 37.06416839 1.878346352
#> wt          -3.82426295 0.613391938
#> hp          -0.03262888 0.006218552