Summary of a fixest object. Computes different types of standard errors.

This function is similar to print.fixest. It provides the table of coefficients along with other information on the fit of the estimation, and it can compute different types of standard errors. The new variance-covariance matrix is stored in the returned object.

Usage

# S3 method for class 'fixest'
summary(
  object,
  vcov = NULL,
  cluster = NULL,
  ssc = NULL,
  stage = NULL,
  lean = FALSE,
  agg = NULL,
  forceCovariance = FALSE,
  se = NULL,
  keepBounded = FALSE,
  n = 1000,
  vcov_fix = TRUE,
  nthreads = getFixest_nthreads(),
  ...
)

# S3 method for class 'fixest_list'
summary(
  object,
  se,
  cluster,
  ssc = getFixest_ssc(),
  vcov = NULL,
  stage = 2,
  lean = FALSE,
  n,
  ...
)

Arguments

object: A fixest object. Obtained using the functions femlm, feols or feglm.
vcov: Versatile argument to specify the VCOV. In general, it is either a character scalar equal to a VCOV type, or a formula of the form: vcov_type ~ variables. The VCOV types implemented are: "iid", "hetero" (or "HC1"), "cluster", "twoway", "NW" (or "newey_west"), "DK" (or "driscoll_kraay"), and "conley". It also accepts objects from vcov_cluster, vcov_NW, NW, vcov_DK, DK, vcov_conley and conley. It also accepts covariance matrices computed externally. Finally, it accepts functions to compute the covariances. See the vcov documentation in the vignette.
cluster: Tells how to cluster the standard-errors (if clustering is requested). Can be either a list of vectors, a character vector of variable names, a formula or an integer vector. Assume we want to perform 2-way clustering over var1 and var2 contained in the data.frame base used for the estimation. All the following cluster arguments are valid and do the same thing: cluster = base[, c("var1", "var2")], cluster = c("var1", "var2"), cluster = ~var1+var2. If the two variables were used as fixed-effects in the estimation, you can leave it blank with vcov = "twoway" (assuming var1 [resp. var2] was the 1st [resp. 2nd] fixed-effect). You can interact two variables using ^ with the following syntax: cluster = ~var1^var2 or cluster = "var1^var2".
ssc: An object of class ssc.type obtained with the function ssc. Represents how the degrees-of-freedom correction should be done. You must use the function ssc for this argument. The arguments and defaults of the function ssc are: K.adj = TRUE, K.fixef = "nonnested", G.adj = TRUE, G.df = "min", t.df = "min", K.exact = FALSE. See the help of the function ssc for details.
stage: Can be equal to 2 (default), 1, 1:2 or 2:1. Only used if the object is an IV estimation: defines the stage to which summary should be applied. If stage = 1 and there are multiple endogenous regressors or if stage is of length 2, then an object of class fixest_multi is returned.
lean: Logical, default is FALSE. Used to reduce the (memory) size of the summary object. If TRUE, then all objects of length N (the number of observations) are removed from the result. Note that some fixest methods may consequently not work when applied to the summary.
agg: A character scalar describing the variable names to be aggregated, it is pattern-based. For sunab estimations, the following keywords work: "att", "period", "cohort" and FALSE (to have full disaggregation). All variables that match the pattern will be aggregated. It must be of the form "(root)", the parentheses must be there and the resulting variable name will be "root". You can add another root with parentheses: "(root1)regex(root2)", in which case the resulting name is "root1::root2". To name the resulting variable differently you can pass a named vector: c("name" = "pattern") or c("name" = "pattern(root2)"). It's a bit intricate sorry, please see the examples.
forceCovariance: (Advanced users.) Logical, default is FALSE. In the peculiar case where the obtained Hessian is not invertible (usually because of collinearity of some variables), use this option to force the covariance matrix, by using a generalized inverse of the Hessian. This can be useful to spot where possible problems come from.
se: Character scalar. Which kind of standard error should be computed: “standard”, “hetero”, “cluster”, “twoway”, “threeway” or “fourway”? By default if there are clusters in the estimation: se = "cluster", otherwise se = "iid". Note that this argument is deprecated, you should use vcov instead.
keepBounded: (Advanced users – feNmlm with non-linear part and bounded coefficients only.) Logical, default is FALSE. If TRUE, then the bounded coefficients (if any) are treated as unrestricted coefficients and their S.E. is computed (otherwise it is not).
n: Integer, default is 1000. Number of coefficients to display when the print method is used.
vcov_fix: Logical scalar, default is TRUE. If the VCOV ends up not being positive definite, whether to "fix" it using an eigenvalue decomposition (a la Cameron, Gelbach & Miller 2011). Since the VCOV should be PSD asymptotically, a non-PSD VCOV might be a sign of a problem with using the asymptotic approximation (e.g. too few units in clusters). If a problem is detected, the function will print a message to inform you.
nthreads: The number of threads. Can be: a) an integer lower than, or equal to, the maximum number of threads; b) 0: meaning all available threads will be used; c) a number strictly between 0 and 1 which represents the fraction of all threads to use. The default is to use 50% of all threads. You can permanently set the number of threads used within this package using the function setFixest_nthreads.
...: Only used if the argument vcov is provided and is a function: extra arguments to be passed to that function.
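
The Examples further down focus on the VCOV types. As a complement, here is a minimal sketch of three other behaviors described above: the stage argument for IV estimations, the lean argument, and passing an externally computed covariance matrix to vcov. The data set (a renamed copy of iris) and all variable and object names are illustrative assumptions, not part of fixest.

```r
library(fixest)

# Illustrative data: a renamed copy of iris (hypothetical column names)
base = iris
names(base) = c("y", "x1", "x_endo", "x_inst", "fe")

# IV estimation: x_endo is instrumented by x_inst
est_iv = feols(y ~ x1 | x_endo ~ x_inst, base)

# stage = 2 is the default; stage = 1 summarizes the first stage instead
sum_first = summary(est_iv, stage = 1)

# A stage of length 2 returns both stages in a single object
sum_both = summary(est_iv, stage = 1:2)
class(sum_both)

# lean = TRUE removes the components of length N from the summary
est = feols(y ~ x1, base)
sum_full = summary(est)
sum_lean = summary(est, lean = TRUE)
object.size(sum_full)
object.size(sum_lean)

# A covariance matrix computed beforehand can be passed via vcov
V_hetero = vcov(est, vcov = "hetero")
sum_ext = summary(est, vcov = V_hetero)
sum_ext
```

As noted in the description of lean, some fixest methods may not work on sum_lean, so the lean summary is best kept for storage or display.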

Value

It returns a fixest object with:

cov.scaled: The new variance-covariance matrix (computed according to the argument vcov, or the deprecated se).
se: The new standard-errors (computed according to the argument vcov, or the deprecated se).
coeftable: The table of coefficients with the new standard errors.
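
The three elements listed above can be read directly from the returned object. The following is a small sketch under the assumption that the trade data shipped with fixest is available; the model and the object names are only illustrative.

```r
library(fixest)
data(trade)

# Illustrative estimation with two fixed-effects
est = feols(log(Euros) ~ log(dist_km) | Origin + Destination, trade)

# Summary with heteroskedasticity-robust standard errors
sum_hc = summary(est, vcov = "hetero")

# The new variance-covariance matrix
sum_hc$cov.scaled

# The new standard-errors
sum_hc$se

# The coefficient table built from these standard errors
sum_hc$coeftable

# The standard-errors can be compared with the square root of the VCOV diagonal
all.equal(unname(sum_hc$se), unname(sqrt(diag(sum_hc$cov.scaled))))
```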

Compatibility with sandwich package

The VCOVs from sandwich can be used with feols, feglm and fepois estimations. If you want to have a sandwich VCOV when using summary.fixest, you can use the argument vcov to specify the VCOV function to use (see examples). Note that if you do so and you use a formula in the cluster argument, an innocuous warning can pop up if you used several non-numeric fixed-effects in the estimation (this is due to the function expand.model.frame used in sandwich).

Author

Laurent Berge

Examples


# Load trade data
data(trade)

# We estimate the effect of distance on trade (with 3 fixed-effects)
est_pois = fepois(Euros ~ log(dist_km)|Origin+Destination+Product, trade)

# Comparing different types of standard errors
sum_standard = summary(est_pois, vcov = "iid")
sum_hetero   = summary(est_pois, vcov = "hetero")
sum_oneway   = summary(est_pois, vcov = "cluster")
sum_twoway   = summary(est_pois, vcov = "twoway")

etable(sum_standard, sum_hetero, sum_oneway, sum_twoway)
#>                        sum_standard         sum_hetero         sum_oneway
#> Dependent Var.:               Euros              Euros              Euros
#>                                                                          
#> log(dist_km)    -1.528*** (1.93e-6) -1.528*** (0.0220) -1.528*** (0.1156)
#> Fixed-Effects:  ------------------- ------------------ ------------------
#> Origin                          Yes                Yes                Yes
#> Destination                     Yes                Yes                Yes
#> Product                         Yes                Yes                Yes
#> _______________ ___________________ __________________ __________________
#> S.E. type                       IID Heteroskedas.-rob.         by: Origin
#> Observations                 38,325             38,325             38,325
#> Squared Cor.                0.60377            0.60377            0.60377
#> Pseudo R2                   0.76039            0.76039            0.76039
#> BIC                        1.43e+12           1.43e+12           1.43e+12
#> 
#>                         sum_twoway
#> Dependent Var.:              Euros
#>                                   
#> log(dist_km)    -1.528*** (0.1307)
#> Fixed-Effects:  ------------------
#> Origin                         Yes
#> Destination                    Yes
#> Product                        Yes
#> _______________ __________________
#> S.E. type        by: Orig. & Dest.
#> Observations                38,325
#> Squared Cor.               0.60377
#> Pseudo R2                  0.76039
#> BIC                       1.43e+12
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Alternative ways to cluster the SE:
summary(est_pois, vcov = cluster ~ Product + Origin)
#> Poisson estimation, Dep. Var.: Euros
#> Observations: 38,325
#> Fixed-effects: Origin: 15,  Destination: 15,  Product: 20
#> Standard-errors: Clustered (Product & Origin) 
#>              Estimate Std. Error  z value  Pr(>|z|)    
#> log(dist_km) -1.52775   0.122773 -12.4437 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Log-Likelihood: -7.133e+11   Adj. Pseudo R2: 0.760389
#>            BIC:  1.427e+12     Squared Cor.: 0.60377 
summary(est_pois, vcov = ~Product + Origin)
#> Poisson estimation, Dep. Var.: Euros
#> Observations: 38,325
#> Fixed-effects: Origin: 15,  Destination: 15,  Product: 20
#> Standard-errors: Clustered (Product & Origin) 
#>              Estimate Std. Error  z value  Pr(>|z|)    
#> log(dist_km) -1.52775   0.122773 -12.4437 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Log-Likelihood: -7.133e+11   Adj. Pseudo R2: 0.760389
#>            BIC:  1.427e+12     Squared Cor.: 0.60377 
summary(est_pois, cluster = ~Product + Origin)
#> Poisson estimation, Dep. Var.: Euros
#> Observations: 38,325
#> Fixed-effects: Origin: 15,  Destination: 15,  Product: 20
#> Standard-errors: Clustered (Product & Origin) 
#>              Estimate Std. Error  z value  Pr(>|z|)    
#> log(dist_km) -1.52775   0.122773 -12.4437 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Log-Likelihood: -7.133e+11   Adj. Pseudo R2: 0.760389
#>            BIC:  1.427e+12     Squared Cor.: 0.60377 

# You can interact the clustering variables "live" using the var1 ^ var2 syntax.
summary(est_pois, vcov = ~Destination^Product)
#> Poisson estimation, Dep. Var.: Euros
#> Observations: 38,325
#> Fixed-effects: Origin: 15,  Destination: 15,  Product: 20
#> Standard-errors: Clustered (Destination^Product) 
#>              Estimate Std. Error  z value  Pr(>|z|)    
#> log(dist_km) -1.52775   0.072633 -21.0337 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Log-Likelihood: -7.133e+11   Adj. Pseudo R2: 0.760389
#>            BIC:  1.427e+12     Squared Cor.: 0.60377 

#
# Newey-West and Driscoll-Kraay SEs
#

data(base_did)
# Simple estimation on a panel
est = feols(y ~ x1, base_did)

# --
# Newey-West
# Use the syntax NW ~ unit + time
summary(est, NW ~ id + period)
#> OLS estimation, Dep. Var.: y
#> Observations: 1,080
#> Standard-errors: Newey-West (L=1) 
#>             Estimate Std. Error t value   Pr(>|t|)    
#> (Intercept) 1.988753   0.174111 11.4223 1.1709e-06 ***
#> x1          0.983110   0.052699 18.6551 1.6762e-08 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 4.89686   Adj. R2: 0.262357

# Now take a lag of 3:
summary(est, NW(3) ~ id + period)
#> OLS estimation, Dep. Var.: y
#> Observations: 1,080
#> Standard-errors: Newey-West (L=3) 
#>             Estimate Std. Error t value   Pr(>|t|)    
#> (Intercept) 1.988753   0.194500 10.2249 2.9725e-06 ***
#> x1          0.983110   0.051042 19.2610 1.2652e-08 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 4.89686   Adj. R2: 0.262357

# --
# Driscoll-Kraay
# Use the syntax DK ~ time
summary(est, DK ~ period)
#> OLS estimation, Dep. Var.: y
#> Observations: 1,080
#> Standard-errors: Driscoll-Kraay (L=1) 
#>             Estimate Std. Error  t value   Pr(>|t|)    
#> (Intercept) 1.988753   0.789538  2.51888 3.2829e-02 *  
#> x1          0.983110   0.036115 27.22141 5.9051e-10 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 4.89686   Adj. R2: 0.262357

# Now take a lag of 3:
summary(est, DK(3) ~ period)
#> OLS estimation, Dep. Var.: y
#> Observations: 1,080
#> Standard-errors: Driscoll-Kraay (L=3) 
#>             Estimate Std. Error  t value   Pr(>|t|)    
#> (Intercept) 1.988753   0.971486  2.04712 7.0943e-02 .  
#> x1          0.983110   0.028415 34.59840 6.9512e-11 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 4.89686   Adj. R2: 0.262357

#--
# Implicit deductions
# When the estimation is done with a panel.id, you don't need to
# specify these values.

est_panel = feols(y ~ x1, base_did, panel.id = ~id + period)

# Both methods, NW and DK, now work automatically
summary(est_panel, "NW")
#> OLS estimation, Dep. Var.: y
#> Observations: 1,080
#> Standard-errors: Newey-West (L=1) 
#>             Estimate Std. Error t value   Pr(>|t|)    
#> (Intercept) 1.988753   0.174111 11.4223 1.1709e-06 ***
#> x1          0.983110   0.052699 18.6551 1.6762e-08 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 4.89686   Adj. R2: 0.262357
summary(est_panel, "DK")
#> OLS estimation, Dep. Var.: y
#> Observations: 1,080
#> Standard-errors: Driscoll-Kraay (L=1) 
#>             Estimate Std. Error  t value   Pr(>|t|)    
#> (Intercept) 1.988753   0.789538  2.51888 3.2829e-02 *  
#> x1          0.983110   0.036115 27.22141 5.9051e-10 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 4.89686   Adj. R2: 0.262357

#
# VCOVs robust to spatial correlation
#

data(quakes)
est_geo = feols(depth ~ mag, quakes)

# --
# Conley
# Use the syntax: conley(cutoff) ~ lat + lon
# with lat/lon the latitude/longitude variable names in the data set
summary(est_geo, conley(100) ~ lat + long)
#> OLS estimation, Dep. Var.: depth
#> Observations: 1,000
#> Standard-errors: Conley (100km) 
#>             Estimate Std. Error  t value   Pr(>|t|)    
#> (Intercept)  881.625   108.9005  8.09569 1.6480e-15 ***
#> mag         -123.421    19.2323 -6.41737 2.1389e-10 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 209.6   Adj. R2: 0.052245

# Change the cutoff, and how the distance is computed
summary(est_geo, conley(200, distance = "spherical") ~ lat + long)
#> OLS estimation, Dep. Var.: depth
#> Observations: 1,000
#> Standard-errors: Conley (200km) 
#>             Estimate Std. Error  t value   Pr(>|t|)    
#> (Intercept)  881.625   128.2426  6.87467 1.0937e-11 ***
#> mag         -123.421    22.8950 -5.39074 8.7582e-08 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 209.6   Adj. R2: 0.052245

# --
# Implicit deduction
# By default the latitude and longitude are directly fetched in the data based
# on pattern matching. So you don't have to specify them.
# Further an automatic cutoff is computed by default.

# The following works
summary(est_geo, "conley")
#> OLS estimation, Dep. Var.: depth
#> Observations: 1,000
#> Standard-errors: Conley (90km) 
#>             Estimate Std. Error  t value   Pr(>|t|)    
#> (Intercept)  881.625   110.6727  7.96606 4.4465e-15 ***
#> mag         -123.421    20.1746 -6.11765 1.3619e-09 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 209.6   Adj. R2: 0.052245



#
# Compatibility with sandwich
#

# You can use the VCOVs from sandwich by using the argument vcov:
library(sandwich)
summary(est_pois, vcov = vcovCL, cluster = trade[, c("Destination", "Product")])
#> Poisson estimation, Dep. Var.: Euros
#> Observations: 38,325
#> Fixed-effects: Origin: 15,  Destination: 15,  Product: 20
#> Standard-errors: vcovCL 
#>              Estimate Std. Error  z value  Pr(>|z|)    
#> log(dist_km) -1.52775   0.120014 -12.7298 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Log-Likelihood: -7.133e+11   Adj. Pseudo R2: 0.760389
#>            BIC:  1.427e+12     Squared Cor.: 0.60377
