Treats a variable as a factor, or interacts a variable with a factor. Values to be dropped from or kept in the factor can be easily set. Note that this function should not be used to interact fixed-effects: instead, use the syntax `fe1^fe2` directly.

`i(factor_var, var, ref, keep, bin, ref2, keep2, bin2, ...)`

- factor_var
A vector (of any type) that will be treated as a factor. You can set references (i.e. exclude values for which to create dummies) with the `ref` argument.

- var
A variable of the same length as `factor_var`. This variable will be interacted with the factor in `factor_var`. It can be numeric or factor-like. To force a numeric variable to be treated as a factor, you can add the `i.` prefix to a variable name. For instance, take a numeric variable `x_num`: `i(x_fact, x_num)` will treat `x_num` as numeric while `i(x_fact, i.x_num)` will treat `x_num` as a factor (it's a shortcut to `as.factor(x_num)`).

- ref
A vector of values to be taken as references from `factor_var`. Can also be a logical: if `TRUE`, then the first value of `factor_var` will be removed. If `ref` is a character vector, partial matching is applied to values; use "@" as the first character to enable regular expression matching. See examples.

- keep
A vector of values to be kept from `factor_var` (all others are dropped). By default they should be values from `factor_var`, and if `keep` is a character vector, partial matching is applied. Use "@" as the first character to enable regular expression matching instead.

- bin
A list of values to be grouped, a vector, a formula, or the special values `"bin::digit"` or `"cut::values"`. To create a new value from old values, use `bin = list("new_value"=old_values)` with `old_values` a vector of existing values. You can use `.()` for `list()`. It accepts regular expressions, but they must start with an `"@"`, like in `bin="@Aug|Dec"`. It accepts one-sided formulas which must contain the variable `x`, e.g. `bin=list("<2" = ~x < 2)`. The names of the list are the new names. If the new name is missing, the first value matched becomes the new name. In the name, adding `"@d"`, with `d` a digit, will relocate the value in position `d`: useful to change the position of factors. Use `"@"` as first item to make subsequent items be located first in the factor. Feeding in a vector is like using a list without name and only a single element. If the vector is numeric, you can use the special value `"bin::digit"` to group every `digit` element. For example, if `x` represents years, using `bin="bin::2"` creates bins of two years. With any data, using `"!bin::digit"` groups every `digit` consecutive values starting from the first value. Using `"!!bin::digit"` is the same but starting from the last value. With numeric vectors you can: a) use `"cut::n"` to cut the vector into `n` equal parts, b) use `"cut::a]b["` to create the following bins: `[min, a]`, `]a, b[`, `[b, max]`. The latter syntax is a sequence of number/quartile (q0 to q4)/percentile (p0 to p100) followed by an open or closed square bracket. You can add custom bin names by adding them in the character vector after `'cut::values'`. See details and examples. Dot square bracket expansion (see `dsb`) is enabled.

- ref2
A vector of values to be dropped from `var`. By default they should be values from `var`, and if `ref2` is a character vector, partial matching is applied. Use "@" as the first character to enable regular expression matching instead.

- keep2
A vector of values to be kept from `var` (all others are dropped). By default they should be values from `var`, and if `keep2` is a character vector, partial matching is applied. Use "@" as the first character to enable regular expression matching instead.

- bin2
A list or vector defining the binning of the second variable. See help for the argument `bin` for details (or look at the help of the function `bin`). You can use `.()` for `list()`.

- ...
Not currently used.
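
As a minimal sketch of how `ref`, `keep` and `ref2` change the set of columns that `i()` creates, the snippet below reuses the toy vectors `x` and `z` from the examples on this page. The object names `m_ref`, `m_keep` and `m_ref2` are only illustrative, and the column counts in the comments are the ones implied by the argument descriptions above.

```r
library(fixest)

# Toy data (same vectors as in the examples of this page)
x = rep(letters[1:4], 3)[1:10]          # factor-like, values "a" to "d"
z = rep(c("e", "f", "g"), c(5, 3, 2))   # second factor-like variable

# ref: "a" is taken as the reference, so no dummy is created for it
m_ref = i(x, ref = "a")

# keep: dummies are created only for the values "a" and "b"
m_keep = i(x, keep = c("a", "b"))

# ref2: interact x with z, dropping the cases for which z equals "e"
m_ref2 = i(x, z, ref2 = "e")

ncol(m_ref)    # 3 columns: b, c, d
ncol(m_keep)   # 2 columns: a, b
colnames(m_ref2)
```

Dropping a value only removes its column: the returned matrices still have one row per element of `x`.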

It returns a matrix whose number of rows is the length of `factor_var`. If there is no interacted variable, or it is interacted with a numeric variable, the number of columns is equal to the number of cases contained in `factor_var` minus the reference(s). If the interacted variable is a factor, the number of columns is the number of combined cases between `factor_var` and `var`.
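
The dimension rules above can be checked directly with `dim()`. This is a small sketch using the toy vectors from the examples on this page; the dimensions in the comments follow from the outputs displayed in those examples.

```r
library(fixest)

x = rep(letters[1:4], 3)[1:10]          # 4 distinct values
y = rep(1:4, c(1, 2, 3, 4))             # numeric variable
z = rep(c("e", "f", "g"), c(5, 3, 2))   # factor-like variable

# No interaction and no reference: one column per value of x
dim(i(x))                  # 10 rows, 4 columns

# Numeric interaction, first value of x used as the reference
dim(i(x, y, ref = TRUE))   # 10 rows, 3 columns

# Factor interaction: one column per (x, z) pair present in the data
dim(i(x, z))               # 10 rows, 9 columns
```

Note that in the last call only the 9 pairs that actually occur in the data get a column, not all 4 x 3 = 12 possible pairs.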

To interact fixed-effects, this function should not be used: instead use directly the syntax `fe1^fe2` in the fixed-effects part of the formula. Please see the details and examples in the help page of `feols`.
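
For illustration, here is a minimal sketch of the `fe1^fe2` syntax, using the `airquality` data and the same `week` variable that is built in the examples below. The name `est_fe` is only illustrative.

```r
library(fixest)

aq = airquality
aq$week = aq$Day %/% 7 + 1

# Month-by-week fixed effects, placed after the pipe
est_fe = feols(Ozone ~ Solar.R | Month^week, aq)

# The combined fixed effects are absorbed, so only the slope is reported
coef(est_fe)
```

With this syntax the combined Month and week cases are absorbed as fixed effects rather than added as one regressor each, which is what `i(Month, i.week)` would do.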

```
library(fixest)
#
# Simple illustration
#
x = rep(letters[1:4], 3)[1:10]
y = rep(1:4, c(1, 2, 3, 4))
# interaction
data.frame(x, y, i(x, y, ref = TRUE))
#> x y b c d
#> 1 a 1 0 0 0
#> 2 b 2 2 0 0
#> 3 c 2 0 2 0
#> 4 d 3 0 0 3
#> 5 a 3 0 0 0
#> 6 b 3 3 0 0
#> 7 c 4 0 4 0
#> 8 d 4 0 0 4
#> 9 a 4 0 0 0
#> 10 b 4 4 0 0
# without interaction
data.frame(x, i(x, "b"))
#> x a c d
#> 1 a 1 0 0
#> 2 b 0 0 0
#> 3 c 0 1 0
#> 4 d 0 0 1
#> 5 a 1 0 0
#> 6 b 0 0 0
#> 7 c 0 1 0
#> 8 d 0 0 1
#> 9 a 1 0 0
#> 10 b 0 0 0
# you can interact factors too
z = rep(c("e", "f", "g"), c(5, 3, 2))
data.frame(x, z, i(x, z))
#> x z a.e a.g b.e b.f b.g c.e c.f d.e d.f
#> 1 a e 1 0 0 0 0 0 0 0 0
#> 2 b e 0 0 1 0 0 0 0 0 0
#> 3 c e 0 0 0 0 0 1 0 0 0
#> 4 d e 0 0 0 0 0 0 0 1 0
#> 5 a e 1 0 0 0 0 0 0 0 0
#> 6 b f 0 0 0 1 0 0 0 0 0
#> 7 c f 0 0 0 0 0 0 1 0 0
#> 8 d f 0 0 0 0 0 0 0 0 1
#> 9 a g 0 1 0 0 0 0 0 0 0
#> 10 b g 0 0 0 0 1 0 0 0 0
# to force a numeric variable to be treated as a factor: use i.
data.frame(x, y, i(x, i.y))
#> x y a.1 a.3 a.4 b.2 b.3 b.4 c.2 c.4 d.3 d.4
#> 1 a 1 1 0 0 0 0 0 0 0 0 0
#> 2 b 2 0 0 0 1 0 0 0 0 0 0
#> 3 c 2 0 0 0 0 0 0 1 0 0 0
#> 4 d 3 0 0 0 0 0 0 0 0 1 0
#> 5 a 3 0 1 0 0 0 0 0 0 0 0
#> 6 b 3 0 0 0 0 1 0 0 0 0 0
#> 7 c 4 0 0 0 0 0 0 0 1 0 0
#> 8 d 4 0 0 0 0 0 0 0 0 0 1
#> 9 a 4 0 0 1 0 0 0 0 0 0 0
#> 10 b 4 0 0 0 0 0 1 0 0 0 0
# Binning
data.frame(x, i(x, bin = list(ab = c("a", "b"))))
#> x ab c d
#> 1 a 1 0 0
#> 2 b 1 0 0
#> 3 c 0 1 0
#> 4 d 0 0 1
#> 5 a 1 0 0
#> 6 b 1 0 0
#> 7 c 0 1 0
#> 8 d 0 0 1
#> 9 a 1 0 0
#> 10 b 1 0 0
# Same as before but using .() for list() and a regular expression
# note that to trigger a regex, you need to use an @ first
data.frame(x, i(x, bin = .(ab = "@a|b")))
#> x ab c d
#> 1 a 1 0 0
#> 2 b 1 0 0
#> 3 c 0 1 0
#> 4 d 0 0 1
#> 5 a 1 0 0
#> 6 b 1 0 0
#> 7 c 0 1 0
#> 8 d 0 0 1
#> 9 a 1 0 0
#> 10 b 1 0 0
#
# In fixest estimations
#
data(base_did)
# We interact the variable 'period' with the variable 'treat'
est_did = feols(y ~ x1 + i(period, treat, 5) | id + period, base_did)
# => plot only interactions with iplot
iplot(est_did)
# Using i() for factors
est_bis = feols(y ~ x1 + i(period, keep = 3:6) + i(period, treat, 5) | id, base_did)
# we plot the second set of variables created with i()
# => we need to use keep (otherwise only the first one is represented)
coefplot(est_bis, keep = "trea")
# => special treatment in etable
etable(est_bis, dict = c("6" = "six"))
#> est_bis
#> Dependent Var.: y
#>
#> x1 0.9720*** (0.0456)
#> period = 3 -1.111* (0.5354)
#> period = 4 0.4034 (0.5423)
#> period = 5 -0.8980 (0.5698)
#> period = six 0.8031 (0.5467)
#> treat x period = 1 -2.252* (1.002)
#> treat x period = 2 -1.523 (0.9927)
#> treat x period = 3 -0.2720 (1.104)
#> treat x period = 4 -1.794 (1.086)
#> treat x period = six 0.7850 (1.026)
#> treat x period = 7 3.650*** (0.9172)
#> treat x period = 8 4.310*** (0.9989)
#> treat x period = 9 5.636*** (1.037)
#> treat x period = 10 6.276*** (1.045)
#> Fixed-Effects: ------------------
#> id Yes
#> ____________________ __________________
#> S.E.: Clustered by: id
#> Observations 1,080
#> R2 0.54466
#> Within R2 0.45396
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#
# Interact two factors
#
# We use the i. prefix to consider week as a factor
data(airquality)
aq = airquality
aq$week = aq$Day %/% 7 + 1
# Interacting Month and week:
res_2F = feols(Ozone ~ Solar.R + i(Month, i.week), aq)
#> NOTE: 42 observations removed because of NA values (LHS: 37, RHS: 7).
# Same but dropping the 5th Month and 1st week
res_2F_bis = feols(Ozone ~ Solar.R + i(Month, i.week, ref = 5, ref2 = 1), aq)
#> NOTE: 42 observations removed because of NA values (LHS: 37, RHS: 7).
etable(res_2F, res_2F_bis)
#> res_2F res_2F_bis
#> Dependent Var.: Ozone Ozone
#>
#> (Intercept) 8.207 (14.16) 18.51* (7.343)
#> Solar.R 0.0963** (0.0314) 0.1007** (0.0324)
#> Month = 5 x week = 2 -11.36 (17.18)
#> Month = 5 x week = 3 -9.660 (16.05)
#> Month = 5 x week = 4 -6.923 (18.28)
#> Month = 5 x week = 5 28.32 (18.10)
#> Month = 6 x week = 2 10.88 (18.13) -0.3936 (14.93)
#> Month = 6 x week = 3 -2.422 (17.22) -13.40 (13.47)
#> Month = 7 x week = 1 31.87. (17.27)
#> Month = 7 x week = 2 34.35* (16.59) 23.00. (12.58)
#> Month = 7 x week = 3 20.17 (16.54) 8.938 (12.47)
#> Month = 7 x week = 4 33.76. (17.26) 22.85. (13.51)
#> Month = 7 x week = 5 31.58. (18.19) 20.19 (15.04)
#> Month = 8 x week = 1 7.218 (19.98)
#> Month = 8 x week = 2 48.12** (17.22) 36.81** (13.56)
#> Month = 8 x week = 3 19.17 (16.62) 8.257 (12.48)
#> Month = 8 x week = 4 36.50* (17.18) 25.35. (13.46)
#> Month = 8 x week = 5 62.00*** (18.12) 50.76*** (14.91)
#> Month = 9 x week = 1 46.47** (16.57)
#> Month = 9 x week = 2 -5.661 (16.12) -17.03 (11.82)
#> Month = 9 x week = 3 -2.978 (16.10) -13.95 (11.65)
#> Month = 9 x week = 4 1.809 (16.73) -8.973 (12.61)
#> Month = 9 x week = 5 -8.373 (19.56) -19.47 (16.95)
#> ____________________ _________________ _________________
#> S.E. type IID IID
#> Observations 111 111
#> R2 0.52636 0.37684
#> Adj. R2 0.40795 0.27844
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#
# Binning
#
data(airquality)
feols(Ozone ~ i(Month, bin = "bin::2"), airquality)
#> NOTE: 37 observations removed because of NA values (LHS: 37).
#> OLS estimation, Dep. Var.: Ozone
#> Observations: 116
#> Standard-errors: IID
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 23.6154 6.19450 3.81231 0.00022469 ***
#> Month::6 27.8703 8.17782 3.40804 0.00090749 ***
#> Month::8 21.3119 7.51740 2.83501 0.00543040 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 31.2 Adj. R2: 0.083194
feols(Ozone ~ i(Month, bin = list(summer = 7:9)), airquality)
#> NOTE: 37 observations removed because of NA values (LHS: 37).
#> OLS estimation, Dep. Var.: Ozone
#> Observations: 116
#> Standard-errors: IID
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 23.61538 6.13010 3.852364 0.00019455 ***
#> Month::6 5.82906 12.08872 0.482190 0.63060377
#> Month::summer 25.86610 7.04559 3.671249 0.00037013 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 30.9 Adj. R2: 0.102158
```