Computes various fit statistics for `fixest` estimations.

## Usage

```
fitstat(
x,
type,
simplify = FALSE,
verbose = TRUE,
show_types = FALSE,
frame = parent.frame(),
...
)
```

## Arguments

- x
  A `fixest` estimation.
- type
  Character vector or one-sided formula. The type of fit statistic to be computed. The classic ones are: n, rmse, r2, pr2, f, wald, ivf, ivwald. The full list is given in the section "Available types", or can be displayed with `show_types = TRUE`. Further, you can register your own types with `fitstat_register`.
- simplify
  Logical, default is `FALSE`. By default a list is returned whose names are the selected types. If `simplify = TRUE` and only one type is selected, then the element is returned directly (i.e. it is not nested in a list).
- verbose
  Logical, default is `TRUE`. If `TRUE`, an object of class `fixest_fitstat` is returned (so its associated print method will be triggered). If `FALSE`, a simple list is returned instead.
- show_types
  Logical, default is `FALSE`. If `TRUE`, the function only displays all available types.
- frame
  An environment in which to evaluate variables, default is `parent.frame()`. Only used if the argument `type` is a formula and some values in the formula have to be expanded with the dot square bracket operator. Mostly for internal use.
- ...
  Other elements to be passed to other methods and which may be used to compute the statistics (for example, you can pass on arguments to compute the VCOV when using `type = "g"` or `type = "wald"`).
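As an illustration of the `type` and `...` arguments, here is a minimal sketch using the `trade` data shipped with `fixest`. It assumes that a `cluster` argument given in `...` is forwarded to the VCOV computation, as the description of `...` suggests; the object names are only illustrative.

```r
library(fixest)
data(trade)
gravity = feols(log(Euros) ~ log(dist_km) | Destination + Origin, trade)

# `type` given as a character vector ...
stats_chr = fitstat(gravity, c("n", "rmse", "r2"), verbose = FALSE)
# ... or as a one-sided formula
stats_fml = fitstat(gravity, ~ n + rmse + r2, verbose = FALSE)
names(stats_chr)
names(stats_fml)

# Assumption: `cluster` is passed through `...` to compute the VCOV
# used by the Wald test
wald_origin = fitstat(gravity, "wald", cluster = ~Origin, simplify = TRUE)
# The element `vcov` reports how the VCOV was computed
wald_origin$vcov
```

Both calls to `fitstat` with `verbose = FALSE` should return lists with one element per requested type.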

## Value

By default an object of class `fixest_fitstat` is returned. Using `verbose = FALSE` returns a simple list. Finally, if only one type is selected, `simplify = TRUE` returns the value of that type directly.
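The three forms of the return value can be compared with a short sketch like the one below. It reuses the `trade` data from `fixest`; the variable names are arbitrary.

```r
library(fixest)
data(trade)
gravity = feols(log(Euros) ~ log(dist_km) | Destination + Origin, trade)

# Default: an object of class fixest_fitstat, with its own print method
res_default = fitstat(gravity, ~ r2 + rmse)
class(res_default)

# verbose = FALSE: a simple list, one element per type
res_list = fitstat(gravity, ~ r2 + rmse, verbose = FALSE)
str(res_list)

# simplify = TRUE with a single type: the value itself, not a list
rmse_value = fitstat(gravity, "rmse", simplify = TRUE)
rmse_value
```

The simplified form is convenient when the statistic is reused in further computations.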

## Registering your own types

You can register custom fit statistics with the function `fitstat_register`.
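A minimal sketch of a registration is given below. The type name `ncoef` and its alias are hypothetical, chosen only for illustration; the function receives the `fixest` estimation and returns the statistic.

```r
library(fixest)
data(trade)
gravity = feols(log(Euros) ~ log(dist_km) | Destination + Origin, trade)

# Hypothetical custom type: the number of estimated coefficients
# Arguments: a) type name, b) function applied to the estimation, c) alias
fitstat_register("ncoef", function(x) length(coef(x)), "Number of coefficients")

# The new keyword can then be used like any other type
n_coef = fitstat(gravity, "ncoef", simplify = TRUE)
n_coef
```

Once registered, the keyword should also be usable in the `fitstat` argument of `etable`.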

## Available types

The types are case sensitive; please use lower case only. The available types are:

`n`, `ll`, `aic`, `bic`, `rmse`: The number of observations, the log-likelihood, the AIC, the BIC and the root mean squared error, respectively.

`my`: Mean of the dependent variable.

`g`: The degrees of freedom used to compute the t-test (it influences the p-values of the coefficients). When the VCOV is clustered, this value is equal to the minimum cluster size; otherwise, it is equal to the sample size minus the number of variables.

`r2`, `ar2`, `wr2`, `awr2`, `pr2`, `apr2`, `wpr2`, `awpr2`: All R2s that can be obtained with the function `r2`. The `a` stands for 'adjusted', the `w` for 'within' and the `p` for 'pseudo'. Note that the order of the letters `a`, `w` and `p` does not matter. The pseudo R2s are McFadden's R2s (ratios of log-likelihoods).

`theta`: The over-dispersion parameter in Negative Binomial models. Low values mean high overdispersion.

`f`, `wf`: The F-tests of nullity of the coefficients. The `w` stands for 'within'. These types return the following values: `stat`, `p`, `df1` and `df2`. If you want to display only one of these, use its name after a dot: e.g. `f.stat` will give the statistic of the F-test, and `wf.p` will give the p-value of the F-test on the projected model (i.e. projected onto the fixed-effects).

`wald`: Wald test of joint nullity of the coefficients. This test always excludes the intercept and the fixed-effects. This type returns the following values: `stat`, `p`, `df1`, `df2` and `vcov`. The element `vcov` reports the way the VCOV matrix was computed since it directly influences this statistic.

`ivf`, `ivf1`, `ivf2`, `ivfall`: These statistics are specific to IV estimations. They report either the IV F-test (namely the Cragg-Donald F statistic in the presence of only one endogenous regressor) of the first stage (`ivf` or `ivf1`), of the second stage (`ivf2`) or of both (`ivfall`). The F-test of the first stage is commonly named weak instrument test. The value of `ivfall` is only useful in `etable` when both the 1st and 2nd stages are displayed (it leads to the 1st stage F-test(s) being displayed on the 1st stage estimation(s), and the 2nd stage one on the 2nd stage estimation; otherwise, `ivf1` would also be displayed on the 2nd stage estimation). These types return the following values: `stat`, `p`, `df1` and `df2`.

`ivwald`, `ivwald1`, `ivwald2`, `ivwaldall`: These statistics are specific to IV estimations. They report either the IV Wald-test of the first stage (`ivwald` or `ivwald1`), of the second stage (`ivwald2`) or of both (`ivwaldall`). The Wald-test of the first stage is commonly named weak instrument test. Note that if the estimation was done with a robust VCOV and there is only one endogenous regressor, this is equivalent to the Kleibergen-Paap statistic. The value of `ivwaldall` is only useful in `etable` when both the 1st and 2nd stages are displayed (it leads to the 1st stage Wald-test(s) being displayed on the 1st stage estimation(s), and the 2nd stage one on the 2nd stage estimation; otherwise, `ivwald1` would also be displayed on the 2nd stage estimation). These types return the following values: `stat`, `p`, `df1`, `df2` and `vcov`.

`cd`: The Cragg-Donald test for weak instruments.

`kpr`: The Kleibergen-Paap test for weak instruments.

`wh`: This statistic is specific to IV estimations. Wu-Hausman endogeneity test. H0 is the absence of endogeneity of the instrumented variables. It returns the following values: `stat`, `p`, `df1`, `df2`.

`sargan`: Sargan test of overidentifying restrictions. H0: the instruments are not correlated with the second stage residuals. It returns the following values: `stat`, `p`, `df`.

`lr`, `wlr`: Likelihood ratio and within likelihood ratio tests. They return the following elements: `stat`, `p`, `df`. Concerning the within-LR test, note that, contrary to estimations with `femlm` or `feNmlm`, estimations with `feglm`/`fepois` need to estimate the model with fixed-effects only, which may prove time-consuming (depending on your model). Bottom line: if you really need the within-LR and estimate a Poisson model, use `femlm` instead of `fepois` (the former uses direct ML maximization, for which the fixed-effects-only model is a by-product).
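The IV-specific types and the dot notation for sub-elements can be tried with a sketch along the following lines. The data are simulated on top of `iris` purely for illustration: the variable names (`x_endo_1`, `x_inst_1`, `x_inst_2`) are hypothetical and the instrument `x_inst_2` is artificial.

```r
library(fixest)

# Illustrative data: rename the iris columns and simulate a second instrument
base = iris
names(base) = c("y", "x1", "x_endo_1", "x_inst_1", "fe")
set.seed(2)
base$x_inst_2 = 0.2 * base$y + 0.2 * base$x_endo_1 + rnorm(150, sd = 0.5)

# One endogenous regressor and two instruments, so the model is over-identified
est_iv = feols(y ~ x1 | x_endo_1 ~ x_inst_1 + x_inst_2, base)

# First stage weak instrument tests, Wu-Hausman and Sargan tests
fitstat(est_iv, ~ ivf1 + ivwald1 + wh + sargan)

# Each test is a set of named values ...
wh_test = fitstat(est_iv, "wh", simplify = TRUE)
names(wh_test)
sargan_test = fitstat(est_iv, "sargan", simplify = TRUE)
names(sargan_test)

# ... and a single value can be selected with its name after a dot
wh_p = fitstat(est_iv, "wh.p", simplify = TRUE)
wh_p
```

The Sargan test needs more instruments than endogenous regressors, which is why two instruments are used for a single endogenous variable here.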

## Examples

```
library(fixest)
data(trade)
gravity = feols(log(Euros) ~ log(dist_km) | Destination + Origin, trade)
# Extracting the 'working' number of observations used to compute the p-values
fitstat(gravity, "g", simplify = TRUE)
#> [1] 15
# Some fit statistics
fitstat(gravity, ~ rmse + r2 + wald + wf)
#> RMSE: 2.26215
#> R2: 0.50428
#> Wald (joint nullity): stat = 272.9, p < 2.2e-16, on 1 and 38,309 DoF, VCOV: Clustered (Destination).
#> F-test (projected): stat = 5,832.8, p < 2.2e-16, on 1 and 38,295 DoF.
# You can use them in etable
etable(gravity, fitstat = ~ rmse + r2 + wald + wf)
#>                                 gravity
#> Dependent Var.:              log(Euros)
#>
#> log(dist_km)         -2.072*** (0.1254)
#> Fixed-Effects:       ------------------
#> Destination                         Yes
#> Origin                              Yes
#> ____________________ __________________
#> S.E.: Clustered         by: Destination
#> RMSE                             2.2622
#> R2                              0.50428
#> Wald (joint nullity)             272.90
#> F-test (projected)              5,832.8
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# For wald and wf, you could show the p-value instead:
etable(gravity, fitstat = ~ rmse + r2 + wald.p + wf.p)
#>                                          gravity
#> Dependent Var.:                       log(Euros)
#>
#> log(dist_km)                  -2.072*** (0.1254)
#> Fixed-Effects:                ------------------
#> Destination                                  Yes
#> Origin                                       Yes
#> _____________________________ __________________
#> S.E.: Clustered                  by: Destination
#> RMSE                                      2.2622
#> R2                                       0.50428
#> Wald (joint nullity), p-value           4.32e-61
#> F-test (projected), p-value             NaNe-Inf
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Now let's display some statistics that are not built-in
# => we use fitstat_register to create them
# We need: a) type name, b) the function to be applied
# c) (optional) an alias
fitstat_register("tstand", function(x) tstat(x, se = "stand")[1], "t-stat (regular)")
fitstat_register("thc", function(x) tstat(x, se = "heter")[1], "t-stat (HC1)")
fitstat_register("t1w", function(x) tstat(x, se = "clus")[1], "t-stat (clustered)")
fitstat_register("t2w", function(x) tstat(x, se = "twow")[1], "t-stat (2-way)")
# Now we can use these keywords in fitstat:
etable(gravity, fitstat = ~ . + tstand + thc + t1w + t2w)
#>                               gravity
#> Dependent Var.:            log(Euros)
#>
#> log(dist_km)       -2.072*** (0.1254)
#> Fixed-Effects:     ------------------
#> Destination                       Yes
#> Origin                            Yes
#> __________________ __________________
#> S.E.: Clustered       by: Destination
#> Observations                   38,325
#> R2                            0.50428
#> Within R2                     0.13218
#> t-stat (regular)              -76.373
#> t-stat (HC1)                  -80.129
#> t-stat (clustered)            -16.520
#> t-stat (2-way)                -13.268
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Note that the custom stats we created can easily lead
# to errors, but that's another story!
```