Collinearity is the pebble that inevitably enters the shoe of any applied econometrician in the course of their work. The good news is that econometrics software takes that pebble out of your shoe for you. The bad news is that sometimes this pebble is informative and tells you that something is wrong with the model you are estimating.

If you know what collinearity is, how software deals with it, and how it affects your coefficients, you’re fine! If you’re not described by the previous sentence, then there are two cases when you encounter collinearity. Either you’re lucky, and nothing happens or just a few unimportant coefficients become meaningless. Or you’re unlucky and… well, your main results are wrong and your paper should be retracted.

Without further ado, let’s try to remove chance from the econometrics process and understand what collinearity is and how to deal with it.

What collinearity is

This document’s definitions

Definitions. We are in the context of an OLS estimation. We have one dependent variable represented by the $N$-vector $y$, and the explanatory variables represented by the $N \times K$ matrix $X$. The term variable can be used interchangeably with the term column (referring to $X$).

Collinearity. Collinearity is present if at least one column of the matrix $X$ is a linear combination of the other columns. Alternatively, in more formal terms, there is collinearity whenever the rank of $X$ is strictly lower than $K$.

Collinear set. The collinear set is the set of all columns that are a linear combination of the other columns.

Regular set. The regular set is the set of all columns that are not a linear combination of the other columns.

Why collinearity can be a problem

Technical reason

The OLS estimator is:

$$\hat{\beta} = (X^\prime X)^{-1} X^\prime y$$

In the presence of collinearity the square matrix $X^\prime X$ is not of full rank, and hence is not invertible. This is like dividing by 0: you can’t, because it does not make sense. Here you cannot invert, so $\hat{\beta}$ is not defined.
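To make this concrete, here is a quick R sketch (the toy matrix is made up for the illustration): we build a design matrix with a collinear column and check that $X^\prime X$ cannot be inverted.

# A quick illustration of the technical problem
set.seed(1)
x1 = rnorm(10)
X = cbind(1, x1, x2 = 2 * x1)  # x2 is an exact linear combination of x1

qr(X)$rank                     # 2: strictly lower than the number of columns (3)
try(solve(crossprod(X)))       # inverting X'X fails: the matrix is singular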

Substantial reason

A good theorem to understand how OLS works is the Frisch-Waugh-Lovell (FWL) theorem.

Assume your set of explanatory variables is split in two: the matrix $X$ with $k_1$ columns and the matrix $Z$ with $k_2$ columns (we have $K = k_1 + k_2$). Note that you can split the variables in any way you want. The model you want to estimate is:

$$y = X\beta + Z\gamma + \epsilon$$

Define $y^{resid}$ as the residual of $y$ on $Z$, and $X^{resid}$ as the matrix where each column is the residual of the corresponding column of $X$ on $Z$. Consider the following estimation:

$$y^{resid} = X^{resid}\theta + \epsilon^{resid}$$

Then the theorem states that $\hat\beta = \hat\theta$ and that the residuals of the two regressions are the same. You can find an illustration below.

# Illustration of the FWL theorem's magic

# We use the `iris` data set
base = setNames(iris, c("y", "x", "z1", "z2", "species"))

library(fixest)
# The main estimation, we're only interested in `x`'s coefficient
est = feols(y ~ x + z1 + z2, base)

# We estimate both `y` and `x` on the other explanatory variables
#  and get the matrix of residuals
resids = feols(c(y, x) ~ z1 + z2, base) |> resid()
# We estimate y's residuals on x's residuals
est_fwl = feols.fit(resids[, 1], resids[, 2])

# We compare the estimates: they are identical
# The standard errors are also the same, modulo a constant factor
etable(est, est_fwl, order = "x|resid")
#>                                 est            est_fwl
#> Dependent Var.:                   y         resids[,1]
#> 
#> x                0.6508*** (0.0667)
#> resids[,2]                          0.6508*** (0.0660)
#> Constant          1.856*** (0.2508)
#> z1               0.7091*** (0.0567)
#> z2              -0.5565*** (0.1275)
#> _______________ ___________________ __________________
#> S.E. type                       IID                IID
#> Observations                    150                150
#> R2                          0.85861            0.39104
#> Adj. R2                     0.85571            0.39104
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Now let us use the theorem to understand collinearity. Take one variable from the collinear set as the focal variable. By the definition of collinearity, the residual of this variable on all other variables is an $N$-vector of 0s. Indeed, since it is a linear combination of the other variables in the collinear set, it can be perfectly explained and there is no residual.

Using the FWL theorem, this means that the coefficient of this variable in the main regression is the same as the coefficient from regressing the residual of $y$ on a regressor that is identically 0! This makes no sense, right? And this is true for any variable in the collinear set!

But what does that mean for variables in the regular set? For any such variable, the residual on all other variables contains variation, and hence the coefficient is well defined.
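To see this with the `base` data from above (the column `x_new` below is made up for the illustration), we residualize a collinear variable and a regular one on the other explanatory variables.

# A variable in the collinear set has residuals of (numerically) zero,
#  while a variable in the regular set keeps some variation
base$x_new = 2 * base$x + base$z1
resid_collinear = resid(feols(x_new ~ x + z1 + z2, base))
resid_regular   = resid(feols(x ~ z1 + z2, base))

max(abs(resid_collinear))  # essentially 0: nothing left to identify a coefficient
max(abs(resid_regular))    # clearly positive: the coefficient is well defined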

When is collinearity a problem?

Assume you have two sets of variables: focal variables for which you want to interpret the coefficients, and control variables for which the coefficients are not interpreted. These sets are represented by the matrices $X$ and $Z$ respectively, with $k_1$ and $k_2$ columns.

$$y = X\beta + Z\gamma + \epsilon$$

Now assume collinearity is present and the model cannot be directly estimated. There are two cases.

1) The collinear set contains only control variables. In this case collinearity is not an issue since it affects only variables that are not interpreted. The fix is easy: iteratively drop variables from the collinear set until there is no remaining collinearity (reconstruct the collinear set at each step).

Note, importantly, that collinearity is not automatically fine just because it affects only control variables. You should always be wary: when collinearity shows up in practice where theory says there should be none, this is often the sign of coding mistakes (either the raw data is wrong or the processing is flawed). Allow collinearity only when it makes sense in theory (we’ll cover examples later where collinearity is fine in theory).

2) The collinear set contains at least one focal variable. Now we have a problem. Here collinearity can have two origins: i) the model is misspecified, or ii) the focal variables have been wrongly coded. If you are sure that the model is well specified, then check the data and fix how the variables are constructed. Now let’s cover model misspecification.

The model is misspecified

Err… the model is misspecified, what does that mean? It means that you are trying to estimate something that doesn’t make sense! It’s like accepting the result of the following syllogism:

Let’s add $0 = 1 - 1$ to $1 + 1$:

$$1 + 1 = 1 + 1 + (1 - 1)$$

then let’s subtract $2$ on both sides:

$$\Leftrightarrow (1 - 1) + (1 - 1) = (1 - 1)$$

and finally let’s divide by the same number on both sides:

$$\Leftrightarrow 1 + 1 = 1$$

The above equations look right but they’re wrong: the last step divides both sides by $1 - 1 = 0$. Misspecification in an econometric model is the same: it (often) can be estimated, it looks right, but it’s wrong!

The problem with a misspecified model is that there is no technical solution: something is wrong in the model and you need to fix it – theoretically. To illustrate this, let us give examples, some obvious, others less so.

Example 1: Gender gap in publications

Let’s take the following context: you study the publication process in science and want to quantify the gender gap in terms of publications. You are also interested in the effect of age on publications. You have a panel where scientists are indexed with $i$ and years are indexed with $t$. The dependent variable $y$ is the number of publications, the focal explanatory variables are $woman_{it}$, taking the value 1 if the person is a woman and 0 otherwise, and $age_{it}$, equal to the age of scientist $i$ at time $t$. Finally you want to control for time-invariant characteristics specific to each scientist, and year-specific factors common to all scientists. That means: you include scientist and year fixed-effects.

You end up with the following specification:

$$y_{it} = \alpha_i + \beta_t + \gamma \times woman_{it} + \theta \times age_{it} + \epsilon_{it}$$

The above model is utterly misspecified. Given that the variable $woman_{it}$ is time invariant (we’ll allow variation and introduce another problem later), it can be fully explained by the scientist fixed-effects. Hence if you absolutely want to control for scientist fixed-effects: you cannot estimate the gender gap, $\hat\gamma$. You absolutely can’t, live with it. You really need to be aware of that, especially since econometrics software can still give you estimates for $\hat\gamma$! Ouch! The behavior of econometrics software is detailed later.

Is that all? Are we done with misspecification? Well, although the variable $age$ varies across time and scientists, it is in the collinear set. We can decompose this variable as follows: $age_{it} = year_{t} - birth\_year_{i}$. Now we can see that this variable is a linear combination of a scientist-invariant variable ($year_{t}$) and a time-invariant variable ($birth\_year_{i}$). This means that the variable $age_{it}$ can be fully explained by the scientist and time fixed-effects. Hence you cannot estimate the effect of age in the presence of these fixed-effects.
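A minimal sketch with made-up panel data (the toy columns `id`, `birth_year`, etc. are invented for the illustration) shows that age has no variation left once scientist and year dummies are accounted for:

# Toy panel: age = year - birth_year, hence spanned by scientist and year dummies
toy = expand.grid(id = 1:5, year = 2000:2005)
toy$birth_year = 1950 + toy$id       # time invariant, scientist specific
toy$age = toy$year - toy$birth_year

# residualizing age on scientist and year dummies leaves nothing
r = resid(lm(age ~ factor(id) + factor(year), toy))
max(abs(r))  # essentially 0: age is in the collinear set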

Any solution? No. You need to change your model. In this case, the easy solution is to drop the scientist fixed-effects and accept the criticisms that may arise from not controlling for important time-invariant confounders. To reformulate: if any of your focal variables is in a collinear set, you need to make non-trivial changes to your model specification.

Example 2: Sneaky misspecification

We use the same context as above: the gender gap in publications. We introduce the following differences. First, the variable $woman$ varies: some scientists can change from man to woman and vice versa. Second, we use the variable $harassed_{it}$, which is equal to 1 if scientist $i$ reports being harassed in year $t$, and 0 otherwise. This variable contains missing values – this is important.

The model we estimate is:

$$y_{it} = \alpha_i + \beta_t + \gamma \times woman_{it} + \theta \times harassed_{it} + \epsilon_{it}$$

This time the model is not misspecified since both $woman_{it}$ and $harassed_{it}$ present variation that cannot be explained by scientist and time fixed-effects alone (although the identification of the gender gap will hinge only on the scientists reporting a variation in $woman_{it}$).

Sneaky misspecification. One problem arises if the variable $harassed$ contains missing values and if, after removing observations with missing values, there is no more within-scientist variation in the variable $woman_{it}$. In that case, the variable $woman$ ends up in the collinear set when the variable $harassed$ is included in the model. This is not a misspecified model per se, but purely a data problem: the data does not support the model specification.
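Here is a hedged toy example of this situation (all the variables below are simulated for the illustration): the only within-scientist variation in $woman$ occurs exactly in the observation where $harassed$ is missing.

# Toy panel reproducing the "sneaky" case
set.seed(1)
toy = expand.grid(id = 1:20, year = 1:5)
toy$woman = as.integer(toy$id <= 10)
toy$woman[toy$id == 1 & toy$year < 5] = 0        # scientist 1 switches in year 5...
toy$harassed = rbinom(nrow(toy), 1, 0.2)
toy$harassed[toy$id == 1 & toy$year == 5] = NA   # ...the only year with a missing value
toy$y = rnorm(nrow(toy))

# without `harassed`: fine, the gender gap is identified (from a single switch)
feols(y ~ woman | id + year, toy)

# with `harassed`: the NA observation is dropped, `woman` loses all
#  within-scientist variation and gets flagged as collinear
feols(y ~ woman + harassed | id + year, toy)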

In this specific case, one can apply variable imputation to end up with more non-missing values for the variable $harassed$ and be able to estimate the model. Or simply not use the variable $harassed$ at all.

The bottom line is that, even if the model is well specified, modifications of the sample, due to data availability for example, can ultimately lead to model misspecification. As mentioned before, the problem is that econometrics software can provide results even when the model is misspecified, so one really needs to be aware of the problem.

Now let us review how econometrics software deal with collinearity.

Econometrics software and collinearity

Most econometrics software, be it in R, Stata, Python, etc., automatically removes collinear variables to perform the estimation.

Importantly, all software routines assume that the user provides a well specified model. In other words, using the terminology defined above, they assume that the focal variables are not in the collinear set. It is the user’s responsibility to ensure that. If this condition is respected, automatic removal of collinear variables is a feature and not a bug.

In this section we first detail how most software remove collinear variables and how they differ in doing so. We detail why automatic removal is a feature. We end with a description of the pitfalls of automatic removal.

Detecting collinearity: Automatic variable removal algorithms

In general, the process of collinearity fixing by variable removal is as follows (a naive R sketch of this loop is given right after the list):

  • for each explanatory variable, from left to right, following the order of the variables given by the user (this is important):
    • check whether the current variable is collinear with the variables already processed:
      • if collinear: remove it
      • else: continue
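Below is a naive sketch of this loop in R, purely for illustration (the helper drop_collinear and its tolerance are made up; actual routines implement the same idea through a matrix decomposition, as described next).

# Naive sequential collinearity removal: keep a column only if it cannot be
#  (numerically) fully explained by the columns kept so far
drop_collinear = function(X, tol = 1e-10) {
  keep = 1                                      # the first column is always kept
  for (j in 2:ncol(X)) {
    res = lm.fit(X[, keep, drop = FALSE], X[, j])$residuals
    if (sum(res^2) < tol * sum(X[, j]^2)) next  # numerically collinear: remove it
    keep = c(keep, j)
  }
  X[, keep, drop = FALSE]
}

# Example: x3 = x1 + x2, so it is removed; the order of the columns
#  decides which of the collinear columns goes
X = cbind(1, x1 = rnorm(100), x2 = rnorm(100))
X = cbind(X, x3 = X[, "x1"] + X[, "x2"])
colnames(drop_collinear(X))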

It is important to stress that we are not talking about mathematical collinearity but about numerical collinearity, which are two different beasts. Indeed, the results of collinearity detection and fixing depend on the routines’ algorithms. This means that two different routines can lead to different results due to their diverging algorithms – although this should be the exception rather than the rule.

Let us give two examples. In R, the function lm uses a QR decomposition of the matrix $X$ to solve the OLS problem. Linear dependencies (collinearity) are resolved during the QR decomposition. Differently from lm, the function feols from fixest resolves collinearity during a Cholesky decomposition of the matrix $X^\prime X$. While there are major advantages in terms of speed to use this method, it is known to be inferior to the QR method for collinearity detection because using $X^\prime X$ instead of $X$ reduces numeric precision. In the vast majority of cases, however, the two methods are equivalent and provide similar results.
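A small numerical sketch of the precision argument (the data is made up): the condition number of $X^\prime X$ is roughly the square of that of $X$, so near-collinear columns become numerically collinear sooner when working with $X^\prime X$.

# Condition numbers: X versus X'X
set.seed(1)
x1 = rnorm(100)
x2 = x1 + rnorm(100, sd = 1e-7)  # x2 is almost, but not exactly, equal to x1
X = cbind(1, x1, x2)

kappa(X)             # condition number of X: large
kappa(crossprod(X))  # roughly its square: borderline for 64-bit precision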

Collinearity is a numerical problem

To nail the fact that collinearity fixing in econometrics software is a numerical problem, we give the following example. Consider the OLS estimation of this simple model:

$$y = \alpha + \beta x + \epsilon$$

where $x$ is a vector equal to 1 for half of the observations and 0 for the others. We generate the data with $\alpha = 0$ and $\beta = 1$. You know from your econometrics courses that the estimated coefficients are invariant to translations as long as there is a constant in the model. Let us check that out!

# We generate the data
n = 1e6
n_half = n / 2
df = data.frame(x = rep(0, n))
df$x[1:n_half] = 1
df$y = df$x + rnorm(n)

# we estimate y on x for various translations of x
all_trans = c(0, 10^(1:5))
all_results = list()
for(i in seq_along(all_trans)){
  trans = all_trans[i]
  all_results[[i]] = feols(y ~ I(x + trans), df)
}

# we display the results
etable(all_results)
#>                            model 1            model 2            model 3
#> Dependent Var.:                  y                  y                  y      
#> 
#> Constant           0.0013 (0.0014) -9.974*** (0.0210) -99.75*** (0.2009)      
#> I(x+0)          0.9975*** (0.0020)
#> I(x+10)                            0.9975*** (0.0020)
#> I(x+100)                                              0.9975*** (0.0020)      
#> I(x+1000)
#> I(x+10000)
#> I(x+1e+05)
#> _______________ __________________ __________________ __________________      
#> S.E. type                      IID                IID                IID      
#> Observations             1,000,000          1,000,000          1,000,000      
#> R2                         0.19936            0.19936            0.19936      
#> Adj. R2                    0.19936            0.19936            0.19936      
#> 
#>                            model 4             model 5              model 6   
#> Dependent Var.:                  y                   y                    y   
#> 
#> Constant         -997.5*** (2.000) -9,974.9*** (19.99) -99,749.2*** (199.9)   
#> I(x+0)
#> I(x+10)
#> I(x+100)
#> I(x+1000)       0.9975*** (0.0020)
#> I(x+10000)                          0.9975*** (0.0020)
#> I(x+1e+05)                                               0.9975*** (0.0020)   
#> _______________ __________________ ___________________ ____________________   
#> S.E. type                      IID                 IID                  IID   
#> Observations             1,000,000           1,000,000            1,000,000   
#> R2                         0.19936             0.19936              0.19936   
#> Adj. R2                    0.19936             0.19936              0.19936   
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

As we can see, the prediction holds: translations do not change the estimate of $\hat\beta$. But is that always true? What happens if we apply a bigger translation?

# we add 1,000,000 to x
feols(y ~ I(x + 1e6), df)
#> The variable 'I(x + 1e+06)' has been removed because of collinearity (see $collin.var).
#> OLS estimation, Dep. Var.: y
#> Observations: 1,000,000
#> Standard-errors: IID
#>             Estimate Std. Error t value  Pr(>|t|)    
#> (Intercept) 0.500031   0.001117 447.653 < 2.2e-16 ***
#> ... 1 variable was removed because of collinearity (I(x + 1e+06))
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 1.11701

Ooopsie, our variable is now removed! This happens with the message: “The variable […] removed because of collinearity”. What happened? The algorithm could not invert $X^\prime X$, so it removed the variable to be able to carry on. This is purely a numerical problem: remember that numbers are stored on 64 bits, leading to at most about 15 digits of precision (so don’t be surprised by the result of 1 + 1e16 == 1e16). If the numbers were stored with 128-bit precision, there would be no such rounding error and no variable removal.
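A back-of-the-envelope check of where the precision goes (the exact tolerance used by the routine is internal, so only the orders of magnitude matter here):

# After the translation, the useful variation in x is a vanishing fraction of
#  the magnitude of the entries that end up in X'X
var(df$x)                         # ~0.25: the variation that identifies beta
var(df$x) / mean((df$x + 1e6)^2)  # ~2.5e-13: its relative size within X'X
.Machine$double.eps               # ~2.2e-16: the 64-bit rounding error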

Now let’s see lm’s algorithm.

lm(y ~ I(x + 1e6), df) |> coef()
#>   (Intercept)  I(x + 1e+06) 
#> -9.974923e+05  9.974923e-01
lm(y ~ I(x + 1e7), df) |> coef()
#> (Intercept) I(x + 1e+07) 
#>    0.500031           NA

Contrary to feols, for a translation of 1,000,000 lm does not remove the variable and gives the right result. However, the behavior becomes similar to that of feols when we apply a translation of $10^7$. A notable difference between the two routines is that lm gives no explicit message: the removed variable simply shows up as an NA coefficient (the removal is also mentioned in lm’s summary).

Bottom line. Collinearity detection and fixing is a numerical problem and, as such, is subject to the usual pitfalls of numerical problems, in particular precision issues. Always be wary when variables are removed: if there is no theoretical reason for removal then there might be a data problem. In general, ensuring that the variables have a reasonable numerical range is a good idea. Do not hesitate to re-scale variables with a large range.
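For instance, a minimal sketch of re-scaling by hand, reusing the translated variable from the example above (the column names x_big and x_scaled are made up for this illustration):

# Re-scaling: same information, but a numerical range far from precision issues
df$x_big = df$x + 1e6
df$x_scaled = (df$x_big - mean(df$x_big)) / sd(df$x_big)

feols(y ~ x_scaled, df)  # estimable again; the slope is the original one times sd(x_big)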

When automatic removal is good

Consider the context defined in the model misspecification section, where we follow the output of a panel of scientists. Here we want to see the effect of age on performance, while removing individual-specific confounders and confounders linked to the institution of the scientist. Formally, we estimate the following model:

$$y_{ijt} = \alpha_i + \beta_j + \gamma \times age_{ijt} + \epsilon_{ijt},$$

with $i$ indexing the scientist, $j$ the institution and $t$ the year.

Our coefficient of interest is $\gamma$ and we include scientist and institution fixed-effects. For this example we use a built-in data set of fixest: base_pub, which corresponds to a small panel following the output of researchers across time (this panel is constructed from Microsoft Academic Graph data). Although feols does have a specific way to handle fixed-effects, for illustration we include the two fixed-effects as dummies using the function i():

data(base_pub, package = "fixest")

## The model:
feols(nb_pub ~ age + i(author_id) + i(affil_id), base_pub)
#> The variables 'affil_id::6902469', 'affil_id::9217761', 'affil_id::27504731',
#> 'affil_id::39965400', 'affil_id::43522216', 'affil_id::47301684' and 45 others have been
#> removed because of collinearity (see $collin.var).
#> OLS estimation, Dep. Var.: nb_pub
#> Observations: 4,024
#> Standard-errors: IID
#>                       Estimate Std. Error   t value   Pr(>|t|)
#> (Intercept)          -4.700489   2.396759 -1.961185 4.9934e-02 *
#> age                   0.047252   0.006213  7.605218 3.6032e-14 ***
#> author_id::90561406  -1.458487   0.902767 -1.615574 1.0627e-01
#> author_id::94862465  -3.390346   1.862776 -1.820050 6.8834e-02 .
#> author_id::168896994  0.473991   2.447235  0.193684 8.4643e-01
#> author_id::217986139 -0.133319   1.734549 -0.076861 9.3874e-01    
#> author_id::226108609  0.179560   2.021085  0.088843 9.2921e-01
#> author_id::231631639  2.799524   3.110143  0.900127 3.6811e-01
#> ... 397 coefficients remaining (display them with summary() or use argument n)
#> ... 51 variables were removed because of collinearity (affil_id::6902469,
#> affil_id::9217761 and 49 others [full set in $collin.var])
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 2.21108   Adj. R2: 0.685792

As the output clearly shows, 51 variables have been removed to fix collinearity. In this case, this is perfectly fine since our variable of interest, $age$, is not included in the collinear set. Many variables are removed because there is a lot of overlap between the affiliations and the researchers: many researchers have only a few different affiliations across their career, and many affiliations have just one researcher (remember that we deal with a small random sample here).

Note that the recommended way to include fixed-effects in feols is by adding the fixed-effects variables after a pipe (|) in the formula. That way the FWL theorem is leveraged and all variables are first residualized on the fixed-effects (see the FWL section). By handling the fixed-effects specifically, the output becomes much clearer and you are sure that the variable $age$ is not in the collinear set (see the next section for why this is important):

feols(nb_pub ~ age | author_id + affil_id, base_pub, vcov = "iid")
#> OLS estimation, Dep. Var.: nb_pub
#> Observations: 4,024
#> Fixed-effects: author_id: 200,  affil_id: 256
#> Standard-errors: IID
#>     Estimate Std. Error t value   Pr(>|t|)
#> age 0.047252   0.006257 7.55144 5.4359e-14 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 2.21108     Adj. R2: 0.681301
#>                 Within R2: 0.015731

Coming back to our first question: Why is automatic variable removal a feature? In this example the scientist and affiliation fixed-effects are just controls and the researcher has no interest in interpreting them. Therefore collinearity is no issue here and it does not matter which variable is removed to fix collinearity.

Automatic variable removal is a critical feature especially for routines that do not handle fixed-effects specifically, like lm. If there was no automatic variable removal in lm, you would not be able to estimate your model even though it is well specified!
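As a sketch of this point, here is roughly the same estimation with lm: the collinear dummies simply show up as NA coefficients, and the coefficient on age is unaffected since age is not in the collinear set.

# lm also drops the collinear dummies, reporting them as NA coefficients
est_lm = lm(nb_pub ~ age + factor(author_id) + factor(affil_id), base_pub)

coef(est_lm)["age"]       # should match the feols estimate above
sum(is.na(coef(est_lm)))  # number of dummies dropped because of collinearity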

When automatic removal bites

Automatic variable removal can be a problem when your model is misspecified and you haven’t noticed it.

We keep using the example on publications. Remember from the section discussing misspecification that the following model is misspecified:

$$y_{it} = \alpha_i + \beta_t + \gamma \times woman_{it} + \theta \times age_{it} + \epsilon_{it},$$

as both $woman$ and $age$ are in the collinear set.

We estimate this model anyway by including the scientist and year fixed-effects as dummy variables:

feols(nb_pub ~ is_woman + age + i(author_id) + i(year), base_pub)
#> The variables 'author_id::2747123765' and 'year::2000' have been removed because of
#> collinearity (see $collin.var).
#> OLS estimation, Dep. Var.: nb_pub
#> Observations: 4,024
#> Standard-errors: IID
#>                       Estimate Std. Error   t value Pr(>|t|)
#> (Intercept)           3.224328   2.203459  1.463303  0.14347
#> is_woman             -0.673406   1.624295 -0.414583  0.67847
#> age                   0.046843   0.045423  1.031271  0.30248
#> author_id::90561406  -1.028373   1.093804 -0.940180  0.34719
#> author_id::94862465  -1.953734   0.985021 -1.983444  0.04739 *
#> author_id::168896994 -1.449938   0.914733 -1.585094  0.11303
#> author_id::217986139 -1.576761   0.923925 -1.706591  0.08798 .
#> author_id::226108609 -0.568410   1.171480 -0.485207  0.62756
#> ... 242 coefficients remaining (display them with summary() or use argument n)
#> ... 2 variables were removed because of collinearity (author_id::2747123765 and
#> year::2000)
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 2.96683   Adj. R2: 0.457524

It just worked! We have coefficients for is_woman and age! Good news, right? We thought our model was misspecified but in fact it’s not: we can estimate the gender gap with fixed-effects! Or is it?

Well, of course this is just a dream. If the model is misspecified in theory, it remains misspecified in practice: the estimates for is_woman and age do exist but cannot be interpreted. To better see why, let’s replicate the previous estimation, cosmetically modifying the fixed-effects:

# same estimation as above
est_num = feols(nb_pub ~ is_woman + age + i(author_id) + i(year), base_pub)
#> The variables 'author_id::2747123765' and 'year::2000' have been removed because of
#> collinearity (see $collin.var).

# we create `author_id_char`: same as `author_id` but in character form
base_pub$author_id_char = as.character(base_pub$author_id)

# replacing `author_id` with `author_id_char`: both variables contain the same information
est_char = feols(nb_pub ~ is_woman + age + i(author_id_char) + i(year), base_pub)
#> The variables 'author_id_char::731914895' and 'year::2000' have been removed because of
#> collinearity (see $collin.var).

etable(est_num, est_char, keep = "woman|age")
#>                         est_num        est_char
#> Dependent Var.:          nb_pub          nb_pub
#> 
#> is_woman        -0.6734 (1.624)   1.729 (3.174)
#> age             0.0468 (0.0454) 0.0468 (0.0454)
#> _______________ _______________ _______________
#> S.E. type                   IID             IID
#> Observations              4,024           4,024
#> R2                      0.49110         0.49110
#> Adj. R2                 0.45752         0.45752
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can see that although the two estimations are identical in principle, the coefficient associated with the variable is_woman varies widely. But the results should not depend on whether you store a categorical variable in character or numeric form!

The difference is due to the fact that the algorithm removes variables using the order given by the user. Storing the variable author_id in numeric or character form leads to a different ordering of the categorical values, and hence a different variable is removed.
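To see the ordering difference concretely, here is a tiny sketch with three of the ids appearing in the outputs above:

# The same ids sort differently as numbers and as text
ids = c(90561406, 731914895, 2747123765)
sort(ids)                # numeric order
sort(as.character(ids))  # lexicographic order: a different ranking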

Shouldn’t there be an error from the software? As seen in the previous section, automatic removal of collinear variables is useful. The fundamental problem is that the routines: i) can’t compute the collinear sets (this is very costly), and ii) don’t know which variables are the ones of interest. Hence they assume that the user is knowledgeable and provides a well specified model, for which collinearity is not a problem.

What can the user do? Here’s a trick. To check whether your variables of interest are in a collinear set, simply place them last:

est_last = feols(nb_pub ~ i(author_id) + i(year) + is_woman + age, base_pub)
#> The variables 'is_woman' and 'age' have been removed because of collinearity (see
#> $collin.var).

Now the two variables of interest are out! The sign of model misspecification couldn’t be clearer!

If you use routines that handle fixed-effects specifically, like feols, try to treat all your categorical variables as fixed-effects. This way, if a variable of interest is collinear with the fixed-effects, it will be explicitly flagged instead of being kept with a meaningless coefficient. Here is the same example with explicit fixed-effects:

feols(nb_pub ~ is_woman + age | author_id + year, base_pub)
#> Error: in feols(nb_pub ~ is_woman + age | author_id + year,...:
#> All variables, 'is_woman' and 'age', are collinear with the fixed effects. Without
#> doubt, your model is misspecified.

The routine detects that the variables ‘is_woman’ and ‘age’ are collinear with the fixed-effects and removes them from the estimation. Since no explanatory variable remains, the estimation makes no sense and the routine stops with an error telling you that your model is misspecified.