This function shows the means and standard deviations of several variables conditional on whether they are from the treated or the control group. The groups can further be split according to a pre/post variable. Results can be seamlessly exported to Latex.
Usage
did_means(
fml,
base,
treat_var,
post_var,
tex = FALSE,
treat_dict,
dict = getFixest_dict(),
file,
replace = FALSE,
title,
label,
raw = FALSE,
indiv,
treat_first,
prepostnames = c("Before", "After"),
diff.inv = FALSE
)
Arguments
- fml
Either a formula of the type
var1 + ... + varN ~ treat
or
var1 + ... + varN ~ treat | post
or a data.frame/matrix containing all the variables for which the means are to be computed (they must be numeric). Both the treatment and the post variables must contain exactly two values. In a formula, a dot selects all the variables of the data set:
. ~ treat
- base
A data set containing all the variables in the formula fml.
- treat_var
Only if argument fml is not a formula. The vector identifying the treated and the control observations (the vector can be of any type but must contain only two possible values). Must be of the same length as the data.
- post_var
Only if argument fml is not a formula. The vector identifying the periods (pre/post) of the observations (the vector can be of any type but must contain only two possible values). The first value (in the sorted sense) of the vector is taken as the pre period. Must be of the same length as the data.
- tex
Should the result be displayed in Latex? Default is FALSE. Automatically set to TRUE if the table is to be saved in a file using the argument file.
- treat_dict
A named character vector of length two giving the display names of the treated and the control groups. It works as a dictionary from the values of the treatment variable to their labels, e.g. c("1"="Treated", "0" = "Control").
- dict
A named character vector. A dictionary between the variable names and an alias. For instance dict=c("x"="Inflation Rate") would replace the variable name x by “Inflation Rate”.
- file
A file path. If given, the table is written in Latex into this file.
- replace
Default is FALSE, which means that when the table is exported, the existing file is not erased.
- title
Character string giving the Latex title of the table. (Only if exported.)
- label
Character string giving the Latex label of the table. (Only if exported.)
- raw
Logical, default is FALSE. If TRUE, it returns the information without formatting.
- indiv
Either the variable name of individual identifiers, a one sided formula, or a vector. If the data is that of a panel, this can be used to track the number of individuals per group.
- treat_first
Which value of the 'treatment' vector should appear on the left? By default the max value appears first (e.g. if the treatment variable is a 0/1 vector, 1 appears first).
- prepostnames
Only if there is a 'post' variable. The names of the pre and post periods to be displayed in Latex. Default is c("Before", "After").
- diff.inv
Logical, defaults to FALSE. Whether to invert the sign of the difference between the two groups.
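As a sketch of the non-formula interface described for fml, treat_var and post_var, the call below passes a data.frame of numeric variables together with the two indicator vectors. It assumes the base_did data set shipped with the package (with columns y, x1, treat and post); the object names vars and res are only illustrative.

```r
library(fixest)
data(base_did)

# Non-formula interface: the first argument holds only the numeric
# variables to summarise (here y and x1, an illustrative choice).
vars <- base_did[, c("y", "x1")]

# The treatment and period indicators are passed as separate vectors;
# each must have the same length as the data and exactly two values.
res <- did_means(vars, treat_var = base_did$treat, post_var = base_did$post)
res
```

With a formula, the equivalent request would be y + x1 ~ treat | post together with base = base_did.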
Value
It returns a data.frame or a Latex table with the conditional means and statistical differences between the groups.
Details
By default, when the user tries to apply this function to non-numeric variables, an error is raised. The exception is when all the variables are selected with the dot (as in . ~ treat). In this case, non-numeric variables are automatically omitted (with a message).
NAs are removed automatically: if the data contains NAs, an information message is displayed. First, all observations with NAs in the treatment or post variables are removed. Then, if NAs remain in the other variables, they are excluded separately for each variable, and a new message detailing the NA breakdown is displayed.
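To make the reported columns concrete, here is a base R sketch of what a single row of the table summarises: the per-group mean and standard deviation, their difference, and a t-statistic, with NAs dropped for that variable only. The toy data and the unequal-variance (Welch) form of the t-statistic are assumptions made for illustration; this page does not state which formula did_means uses internally.

```r
# Toy data (hypothetical): a numeric outcome with one NA and a 0/1 treatment flag.
df <- data.frame(
  y     = c(1, 2, NA, 4, 5, 7, 9, 6),
  treat = c(0, 0, 0, 0, 1, 1, 1, 1)
)

# Drop NAs for this variable only, mirroring the per-variable NA handling.
ok <- !is.na(df$y)
y  <- df$y[ok]
g  <- df$treat[ok]

m <- tapply(y, g, mean)    # conditional means
s <- tapply(y, g, sd)      # conditional standard deviations
n <- tapply(y, g, length)  # group sizes after NA removal

# Treated minus control, matching the sign convention of the default output.
difference <- m[["1"]] - m[["0"]]

# ASSUMPTION: unequal-variance (Welch) t-statistic, chosen for illustration.
t_stat <- difference / sqrt(s[["1"]]^2 / n[["1"]] + s[["0"]]^2 / n[["0"]])

c(difference = difference, t_stat = t_stat)
```

Setting diff.inv = TRUE corresponds to flipping the sign of both quantities.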
Examples
# Playing around with the DiD data
data(base_did)
# means of treat/control
did_means(y+x1+period~treat, base_did)
#>           vars    cond: 1      cond: 0 Difference t-stat
#> 1            y    3.3 (6)     0.68 (5)       2.64   7.83
#> 2           x1 0.13 (3.1) -0.066 (2.8)      0.199    1.1
#> 3       period  5.5 (2.9)    5.5 (2.9)          0      0
#> 4 Observations        550          530
# same but inverting the difference
did_means(y+x1+period~treat, base_did, diff.inv = TRUE)
#>           vars    cond: 1      cond: 0 Difference t-stat
#> 1            y    3.3 (6)     0.68 (5)      -2.64  -7.83
#> 2           x1 0.13 (3.1) -0.066 (2.8)     -0.199   -1.1
#> 3       period  5.5 (2.9)    5.5 (2.9)          0      0
#> 4 Observations        550          530
# now treat/control, before/after
did_means(y+x1+period~treat|post, base_did)
#>           vars    cond: 1     cond: 0 Difference t-stat     cond: 1     cond: 0
#> 1            y 0.47 (5.1)    0.32 (5)      0.142  0.326   6.2 (5.5)       1 (5)
#> 2           x1 0.17 (3.1) 0.046 (2.9)      0.125  0.487 0.095 (3.1) -0.18 (2.8)
#> 3       period    3 (1.4)     3 (1.4)          0      0     8 (1.4)     8 (1.4)
#> 4 Observations        275         265                           275         265
#>   Difference t-stat
#> 1       5.14   11.4
#> 2      0.272   1.07
#> 3          0      0
#> 4
# same but with a new line giving the number of unique "indiv" for each case
did_means(y+x1+period~treat|post, base_did, indiv = "id")
#>            vars    cond: 1     cond: 0 Difference t-stat     cond: 1
#> 1             y 0.47 (5.1)    0.32 (5)      0.142  0.326   6.2 (5.5)
#> 2            x1 0.17 (3.1) 0.046 (2.9)      0.125  0.487 0.095 (3.1)
#> 3        period    3 (1.4)     3 (1.4)          0      0     8 (1.4)
#> 4  Observations        275         265                           275
#> 5 # Individuals         55          53                            55
#>       cond: 0 Difference t-stat
#> 1       1 (5)       5.14   11.4
#> 2 -0.18 (2.8)      0.272   1.07
#> 3     8 (1.4)          0      0
#> 4         265
#> 5          53
# same but with the treat case "0" coming first
did_means(y+x1+period~treat|post, base_did, indiv = ~id, treat_first = 0)
#>            vars     cond: 0    cond: 1 Difference t-stat     cond: 0
#> 1             y    0.32 (5) 0.47 (5.1)     -0.142 -0.326       1 (5)
#> 2            x1 0.046 (2.9) 0.17 (3.1)     -0.125 -0.487 -0.18 (2.8)
#> 3        period     3 (1.4)    3 (1.4)          0      0     8 (1.4)
#> 4  Observations         265        275                           265
#> 5 # Individuals          53         55                            53
#>       cond: 1 Difference t-stat
#> 1   6.2 (5.5)      -5.14  -11.4
#> 2 0.095 (3.1)     -0.272  -1.07
#> 3     8 (1.4)          0      0
#> 4         275
#> 5          55
# Selecting all the variables with "."
did_means(.~treat|post, base_did, indiv = "id")
#>            vars    cond: 1     cond: 0 Difference t-stat     cond: 1
#> 1             y 0.47 (5.1)    0.32 (5)      0.142  0.326   6.2 (5.5)
#> 2            x1 0.17 (3.1) 0.046 (2.9)      0.125  0.487 0.095 (3.1)
#> 3        period    3 (1.4)     3 (1.4)          0      0     8 (1.4)
#> 4  Observations        275         265                           275
#> 5 # Individuals         55          53                            55
#>       cond: 0 Difference t-stat
#> 1       1 (5)       5.14   11.4
#> 2 -0.18 (2.8)      0.272   1.07
#> 3     8 (1.4)          0      0
#> 4         265
#> 5          53
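None of the examples so far exercises the Latex export. A possible call is sketched below, combining the arguments documented above (treat_dict, dict, prepostnames, title, label, file, replace). The aliases, title, label and the temporary file path are illustrative assumptions, not values prescribed by the package.

```r
library(fixest)
data(base_did)

# Hypothetical output location; any writable path would do.
out_file <- tempfile(fileext = ".tex")

# Providing `file` switches the output to Latex automatically.
did_means(y + x1 ~ treat | post, base_did,
          treat_dict   = c("1" = "Treated", "0" = "Control"),  # group labels
          dict         = c(y = "Outcome", x1 = "Covariate"),   # variable aliases (illustrative)
          prepostnames = c("Pre", "Post"),                     # period labels
          title        = "Descriptive statistics",             # illustrative title
          label        = "tab:did_means",                      # illustrative label
          file         = out_file,
          replace      = TRUE)                                 # overwrite an existing file

# Inspect the generated Latex code.
tex_lines <- readLines(out_file)
head(tex_lines)
```

To preview the Latex code in the console without writing a file, drop file and replace and set tex = TRUE instead.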