Constructs a `fixest` panel data base out of a data.frame. This allows you to use leads and lags in `fixest` estimations and, if the data.frame is also a `data.table::data.table`, to create new variables from leads and lags.

## Usage

`panel(data, panel.id, time.step = NULL, duplicate.method = c("none", "first"))`

## Arguments

- data
A data.frame.

- panel.id
The panel identifiers. Can be: i) a one-sided formula (e.g. `panel.id = ~id+time`), ii) a character vector of length 2 (e.g. `panel.id=c('id', 'time')`), or iii) a character scalar of two variables separated by a comma (e.g. `panel.id='id,time'`). Note that you can combine variables with `^` only inside formulas (see the dedicated section in `feols`).

- time.step
The method to compute the lags. The default is `NULL` (which means it is set automatically). Can be equal to `"unitary"`, `"consecutive"`, `"within.consecutive"`, or to a number. If `"unitary"`, the greatest common divisor of the gaps between consecutive time periods is used (typically 1 if the time variable represents years). This method applies only to integer (or convertible to integer) variables. If `"consecutive"`, the time variable can be of any type: two successive time periods represent a lag of 1. If `"within.consecutive"`, then **within a given id**, two successive time periods represent a lag of 1. Finally, if the time variable is numeric, you can provide your own numeric time step.

- duplicate.method
If several observations have the same id and time values, then the notion of lag is not defined for them. If `duplicate.method = "none"` (default) and duplicate values are found, an error is raised. You can use `duplicate.method = "first"` so that the first occurrence of identical id/time observations is used as the lag.
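
The effect of `time.step` and `duplicate.method` can be sketched on a small example. The data set `toy`, its duplicate-augmented copy `toy_dup`, and the column name `x_lag` are hypothetical and only serve the illustration; the sketch assumes that both `fixest` and `data.table` are installed, and the behaviour noted in the comments is what the argument descriptions above imply rather than verified output.

```r
library(fixest)
library(data.table)

# Hypothetical toy panel: two ids observed in 2000, 2001 and 2003 (2002 is missing)
toy = data.frame(id   = c(1, 1, 1, 2, 2, 2),
                 year = rep(c(2000, 2001, 2003), 2),
                 x    = c(1, 2, 3, 4, 5, 6))

# "unitary": the step is the greatest common divisor of the gaps (1 here),
# so the lag of a 2003 row looks for 2002, which is absent => NA expected
p_unit = panel(as.data.table(toy), ~id + year, time.step = "unitary")
p_unit[, x_lag := l(x)]

# "consecutive": successive observed periods are one lag apart,
# so the lag of a 2003 row should be the 2001 value
p_cons = panel(as.data.table(toy), ~id + year, time.step = "consecutive")
p_cons[, x_lag := l(x)]

# Duplicate id/time pair: with the default duplicate.method = "none",
# setting the panel is expected to fail
toy_dup = rbind(toy, data.frame(id = 1, year = 2000, x = 99))
res = try(panel(toy_dup, ~id + year), silent = TRUE)
inherits(res, "try-error")

# With "first", the first occurrence of the duplicated pair is used as the lag
p_first = panel(as.data.table(toy_dup), ~id + year, duplicate.method = "first")
p_first[, x_lag := l(x)]
```

Comparing `p_unit$x_lag` and `p_cons$x_lag` side by side shows how the choice of `time.step` changes which observation counts as "one period back" when the time variable has gaps.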

## Value

It returns a data base identical to the one given as input, but with an additional attribute: "panel_info". This attribute contains vectors used to efficiently create lags/leads of the data. When the data is subselected, some bookkeeping is performed on the attribute "panel_info".
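
As a rough illustration of the return value, the sketch below checks that the attribute is attached and that the data itself is left untouched. It assumes the `fixest` package and its bundled `base_did` data set; the variable names `pdat`, `info` and `sub` are illustrative. The internal layout of the attribute is only displayed here and should not be relied upon.

```r
library(fixest)
data(base_did)

pdat = panel(base_did, ~id + period)

# The data itself should be unchanged: same dimensions and column names
identical(dim(pdat), dim(base_did))
identical(names(pdat), names(base_did))

# The panel information is stored as an attribute of the returned object
info = attr(pdat, "panel_info")
is.null(info)
str(info, max.level = 1)   # peek at its structure (internal detail)

# Row subselection goes through the bookkeeping step;
# this checks whether the attribute is still attached afterwards
sub = pdat[pdat$period > 5, ]
is.null(attr(sub, "panel_info"))
```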

## Details

This function allows you to use leads and lags in a `fixest` estimation without having to provide the argument `panel.id`. It also offers more options on how to set the panel (with the additional arguments `time.step` and `duplicate.method`).

When the initial data set is also a `data.table`, not all operations are supported and some may dissolve the `fixest_panel`. This is the case when creating subselections of the initial data with additional attributes (e.g. `pdt[x>0, .(x, y, z)]` would dissolve the `fixest_panel`, meaning that the call would return only a data.table).

If the initial data set is also a `data.table`, then you can create new variables from lags and leads using the functions `l` and `f`. See the examples.

## Examples

```
data(base_did)
# Setting a data set as a panel...
pdat = panel(base_did, ~id+period)
# ...then using the functions l and f
est1 = feols(y~l(x1, 0:1), pdat)
#> NOTE: 108 observations removed because of NA values (RHS: 108).
est2 = feols(f(y)~l(x1, -1:1), pdat)
#> NOTE: 216 observations removed because of NA values (LHS: 108, RHS: 216).
est3 = feols(l(y)~l(x1, 0:3), pdat)
#> NOTE: 324 observations removed because of NA values (LHS: 108, RHS: 324).
etable(est1, est2, est3, order = c("f", "^x"), drop="Int")
#>                               est1               est2               est3
#> Dependent Var.:                  y             f(y,1)             l(y,1)
#>
#> f(x1,1)                            0.9940*** (0.0542)
#> x1              0.9948*** (0.0487)    0.0081 (0.0592)   -0.0534 (0.0545)
#> Constant         2.235*** (0.2032)  2.464*** (0.2233)  2.196*** (0.2110)
#> l(x1,1)            0.0410 (0.0558)    0.0157 (0.0640) 0.9871*** (0.0551)
#> l(x1,2)                                                  0.0220 (0.0580)
#> l(x1,3)                                                  0.0102 (0.0639)
#> _______________ __________________ __________________ __________________
#> S.E.: Clustered             by: id             by: id             by: id
#> Observations                   972                864                756
#> R2                         0.26558            0.25697            0.25875
#> Adj. R2                    0.26406            0.25438            0.25480
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# or using the argument panel.id
feols(f(y)~l(x1, -1:1), base_did, panel.id = ~id+period)
#> NOTE: 216 observations removed because of NA values (LHS: 108, RHS: 216).
#> OLS estimation, Dep. Var.: f(y, 1)
#> Observations: 864
#> Standard-errors: Clustered (id)
#>             Estimate Std. Error   t value  Pr(>|t|)
#> (Intercept) 2.464313   0.223277 11.037009 < 2.2e-16 ***
#> f(x1, 1)    0.994018   0.054216 18.334504 < 2.2e-16 ***
#> x1          0.008072   0.059247  0.136241   0.89189
#> l(x1, 1)    0.015693   0.063958  0.245360   0.80665
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 4.97418 Adj. R2: 0.254377
# You can use panel.id in various ways:
pdat = panel(base_did, ~id+period)
# is identical to:
pdat = panel(base_did, c("id", "period"))
# and also to:
pdat = panel(base_did, "id,period")
# l() and f() can also be used within a data.table:
if(require("data.table")){
    pdat_dt = panel(as.data.table(base_did), ~id+period)
    # Now since pdat_dt is also a data.table
    # you can create lags/leads directly
    pdat_dt[, x1_l1 := l(x1)]
    pdat_dt[, c("x1_l1_fill0", "y_f2") := .(l(x1, fill = 0), f(y, 2))]
}
```