This function globally sets the default arguments of fixest estimations.
Usage
setFixest_estimation(
  data = NULL,
  panel.id = NULL,
  fixef.rm = "perfect",
  fixef.tol = 1e-06,
  fixef.iter = 10000,
  collin.tol = 1e-10,
  lean = FALSE,
  verbose = 0,
  warn = TRUE,
  combine.quick = NULL,
  demeaned = FALSE,
  mem.clean = FALSE,
  glm.iter = 25,
  glm.tol = 1e-08,
  data.save = FALSE,
  reset = FALSE
)
getFixest_estimation()
Arguments
- data
A data.frame containing the necessary variables to run the model. The variables of the non-linear right hand side of the formula are identified by the names of this data.frame. Can also be a matrix.
- panel.id
The panel identifiers. Can either be: i) a one-sided formula (e.g. panel.id = ~id+time), ii) a character vector of length 2 (e.g. panel.id=c('id', 'time')), or iii) a character scalar of two variables separated by a comma (e.g. panel.id='id,time'). Note that you can combine variables with ^ only inside formulas (see the dedicated section in feols).
- fixef.rm
Can be equal to "perfect" (default), "singleton", "both" or "none". Controls which observations are to be removed. If "perfect", then observations having a fixed-effect with perfect fit (e.g. only 0 outcomes in Poisson estimations) will be removed. If "singleton", all observations for which a fixed-effect appears only once will be removed. Note, importantly, that singletons are removed in a single pass: the removal is not applied recursively. "both" applies both removals and "none" applies neither.
- fixef.tol
Precision used to obtain the fixed-effects. Defaults to 1e-6. It corresponds to the maximum absolute difference allowed between two coefficients of successive iterations. Argument fixef.tol cannot be lower than 10000*.Machine$double.eps. Note that this parameter is dynamically controlled by the algorithm.
- fixef.iter
Maximum number of iterations in fixed-effects algorithm (only in use for 2+ fixed-effects). Default is 10000.
- collin.tol
Numeric scalar, default is 1e-10. Threshold deciding when variables should be considered collinear and subsequently removed from the estimation. Higher values mean more variables will be removed (if collinearity is present). One sign of collinearity is extremely low t-stats (for instance, t-stats < 1e-3).
- lean
Logical, default is FALSE. If TRUE then all large objects are removed from the returned result: this will save memory but will block the possibility to use many methods. It is recommended to use the arguments se or cluster to obtain the appropriate standard-errors at estimation time, since obtaining different SEs won't be possible afterwards.
- verbose
Integer. Higher values give more information. In particular, it can detail the number of iterations in the demeaning algorithm (the first number is the left-hand-side, the other numbers are the right-hand-side variables).
- warn
Logical, default is TRUE. Whether warnings should be displayed (concerns warnings relating to convergence state).
- combine.quick
Logical. When you combine different variables to transform them into a single fixed-effect you can do e.g. y ~ x | paste(var1, var2). The algorithm provides a shorthand to do the same operation: y ~ x | var1^var2. Because pasting variables is a costly operation, the internal algorithm may use a numerical trick to hasten the process. The cost of doing so is that you lose the labels. If you are interested in getting the value of the fixed-effects coefficients after the estimation, you should use combine.quick = FALSE. By default it is equal to FALSE if the number of observations is lower than 50,000, and to TRUE otherwise.
- demeaned
Logical, default is FALSE. Only used in the presence of fixed-effects: should the centered variables be returned? If TRUE, it creates the items y_demeaned and X_demeaned.
- mem.clean
Logical, default is FALSE. Only to be used if the data set is large compared to the available RAM. If TRUE then intermediary objects are removed as much as possible and gc is run before each substantial C++ section in the internal code to avoid memory issues.
- glm.iter
Number of iterations of the glm algorithm. Default is 25.
- glm.tol
Tolerance level for the glm algorithm. Default is 1e-8.
- data.save
Logical scalar, default is FALSE. If TRUE, the data used for the estimation is saved within the returned object. Hence later calls to predict(), vcov(), etc., will be consistent even if the original data has been modified in the meantime. This is especially useful for estimations within loops, where the data changes at each iteration, such that postprocessing can be done outside the loop without issue.
- reset
Logical scalar, default is FALSE. Whether to reset all values to their defaults.
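The "single pass" remark in the description of fixef.rm can be made concrete with a small base-R sketch. This is not fixest's internal code: the toy data, the variable names id and time, and the helper is_singleton are all hypothetical and only illustrate how removing singletons once differs from repeating the removal until none remain.

```r
# Hypothetical toy data with two fixed-effect variables
df = data.frame(
  id   = c("a", "b", "b", "c", "c", "c"),
  time = c(1, 1, 2, 2, 3, 3)
)

# TRUE for rows where at least one fixed-effect value appears only once
# (illustrative helper, not a fixest function)
is_singleton = function(d) {
  n_id   = ave(seq_len(nrow(d)), d$id,   FUN = length)
  n_time = ave(seq_len(nrow(d)), d$time, FUN = length)
  n_id == 1 | n_time == 1
}

# Single pass: drop the rows flagged on the original data, then stop
one_pass = df[!is_singleton(df), ]

# Recursive variant: keep dropping until no singleton is left
recursive = df
while (any(is_singleton(recursive))) {
  recursive = recursive[!is_singleton(recursive), ]
}

nrow(one_pass)   # 5: only the row with id "a" is dropped
nrow(recursive)  # 2: each removal creates a new singleton until two rows remain
```

In this toy case, dropping the row with id "a" leaves time 1 with a single observation, so a single pass can leave singletons in the data.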
Examples
library(fixest)

#
# Example: singletons are not removed by default
#
# => changing this default

# Let's create data with singletons
base = iris
names(base) = c("y", "x1", "x2", "x3", "species")
base$fe_singletons = as.character(base$species)
base$fe_singletons[1:5] = letters[1:5]
res = feols(y ~ x1 + x2 | fe_singletons, base)
res_noSingle = feols(y ~ x1 + x2 | fe_singletons, base, fixef.rm = "single")
#> NOTE: 5 fixed-effect singleton was removed (5 observations, breakup: 5).
# New defaults
setFixest_estimation(fixef.rm = "single")
res_newDefault = feols(y ~ x1 + x2 | fe_singletons, base)
#> NOTE: 5 fixed-effect singleton was removed (5 observations, breakup: 5).
etable(res, res_noSingle, res_newDefault)
#>                                res     res_noSingle   res_newDefault
#> Dependent Var.:                  y                y                y
#>
#> x1                0.4274* (0.1409)  0.4274 (0.1615)  0.4274 (0.1615)
#> x2              0.7774*** (0.1099) 0.7774* (0.1260) 0.7774* (0.1260)
#> Fixed-Effects:  ------------------ ---------------- ----------------
#> fe_singletons                  Yes              Yes              Yes
#> _______________ __________________ ________________ ________________
#> S.E.: Clustered  by: fe_singletons by: fe_singlet.. by: fe_singlet..
#> Observations                   150              145              145
#> R2                         0.86452          0.85729          0.85729
#> Within R2                  0.64201          0.64201          0.64201
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Resetting the defaults
setFixest_estimation(reset = TRUE)
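The Usage section also lists getFixest_estimation(), which the example above does not call. A minimal sketch of how it might be used to check the current settings follows. The structure of the returned object is not described on this page, so the sketch only prints it and makes no assumption about its fields.

```r
library(fixest)

# Change two defaults for the rest of the session
setFixest_estimation(fixef.rm = "singleton", warn = FALSE)

# Inspect what is currently stored; the exact structure of the returned
# object is not documented above, so we only print it
opts = getFixest_estimation()
str(opts)

# Restore the package defaults so later estimations are unaffected
setFixest_estimation(reset = TRUE)
```

Resetting at the end of a script is a reasonable habit, since these settings are global and would otherwise carry over to later estimations in the same session.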