
Compute VCOVs robust to spatial correlation, a la Conley (1999).

Usage

vcov_conley(
  x,
  lat = NULL,
  lon = NULL,
  cutoff = NULL,
  pixel = 0,
  distance = "triangular",
  ssc = NULL,
  vcov_fix = TRUE
)

conley(cutoff = NULL, pixel = NULL, distance = NULL)

Arguments

x

A fixest object.

lat

A character scalar or a one sided formula giving the name of the variable representing the latitude. The latitude must lie in [-90, 90], [0, 180] or [-180, 0].

lon

A character scalar or a one sided formula giving the name of the variable representing the longitude. The longitude must be in [-180, 180], [0, 360] or [-360, 0].

cutoff

The distance cutoff, in km. You can express the cutoff in miles by writing the number in character form and adding "mi" as a suffix: cutoff = "100mi" would be 100 miles. If missing, a rule of thumb is used to deduce the cutoff (see Details).

pixel

A non-negative numeric scalar, default is 0. If positive, the coordinates of each observation are pooled into pixel x pixel km squares. Depending on the case, this can greatly improve computational speed at a small cost in precision. Note that if the cutoff was expressed in miles, then pixel will also be in miles.

distance

How to compute the distance between points. It can be equal to "triangular" (default) or "spherical". The latter case corresponds to the great circle distance and is more precise than triangular but is a bit more intensive computationally.
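
To give a rough sense of how the two options can differ, the base R sketch below compares a great-circle (haversine) distance with a flat approximation. The helper names dist_spherical and dist_flat are hypothetical, and the flat formula (an equirectangular approximation) is an assumption chosen for illustration; it is not necessarily the exact computation behind distance = "triangular".

```r
# Hypothetical helpers, for illustration only (not part of fixest).
earth_radius_km = 6371

deg2rad = function(deg) deg * pi / 180

# Great-circle distance (haversine formula), in km.
dist_spherical = function(lat1, lon1, lat2, lon2){
  phi1 = deg2rad(lat1)
  phi2 = deg2rad(lat2)
  dphi = phi2 - phi1
  dlambda = deg2rad(lon2 - lon1)
  a = sin(dphi / 2)^2 + cos(phi1) * cos(phi2) * sin(dlambda / 2)^2
  # pmin() guards against rounding slightly above 1
  2 * earth_radius_km * asin(pmin(1, sqrt(a)))
}

# Flat approximation, in km: Pythagoras on the latitude and longitude
# differences, the latter scaled by the cosine of the mean latitude.
# ASSUMPTION: illustrative formula, possibly different from fixest's.
dist_flat = function(lat1, lon1, lat2, lon2){
  dlat = deg2rad(lat2 - lat1)
  dlon = deg2rad(lon2 - lon1) * cos(deg2rad((lat1 + lat2) / 2))
  earth_radius_km * sqrt(dlat^2 + dlon^2)
}

# Two points one degree apart in both directions
d_sph = dist_spherical(-20, 180, -21, 181)
d_flat = dist_flat(-20, 180, -21, 181)
```

At distances of this order the two values are very close; the gap widens as distances grow or as points get closer to the poles.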

ssc

An object returned by the function ssc. It specifies how to perform the small sample correction.

vcov_fix

Logical scalar, default is TRUE. If the VCOV ends up not being positive definite, whether to "fix" it using an eigenvalue decomposition (a la Cameron, Gelbach & Miller 2011). Since the VCOV should be PSD asymptotically, this might be a sign of a problem with using the asymptotic approximation (e.g. too few units in clusters). If a problem is detected, the function will print a message to inform you.

Value

If the first argument is a fixest object, then a VCOV is returned (i.e. a symmetric matrix).

If the first argument is not a fixest object, then a) implicitly the arguments are shifted to the left (i.e. vcov_conley("lat", "long") is equivalent to vcov_conley(lat = "lat", lon = "long")) and b) a VCOV-request is returned and NOT a VCOV. That VCOV-request can then be used in the argument vcov of various fixest functions (e.g. vcov.fixest or even in the estimation calls).
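
The two return modes can be sketched as follows, assuming the fixest package is installed and using the quakes data, whose coordinate columns are named lat and long. The object names est, V, req and V_req are illustrative only.

```r
library(fixest)
data(quakes)

est = feols(depth ~ mag, quakes)

# First argument is a fixest object: a VCOV matrix is returned
V = vcov_conley(est, lat = "lat", lon = "long", cutoff = 100)

# First argument is not a fixest object: the arguments are shifted to
# the left and a VCOV-request is returned instead of a VCOV
req = vcov_conley("lat", "long", cutoff = 100)

# The request is then passed to the vcov argument of a fixest function
V_req = vcov(est, vcov = req)
```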

Details

This function computes VCOVs that are robust to spatial correlations by assuming a correlation between the units that are at a geographic distance lower than a given cutoff.

The kernel is uniform.
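
As an illustration of what a uniform kernel means here, the base R sketch below builds a sandwich VCOV for OLS in which every pair of observations closer than the cutoff gets a weight of 1, and every other pair a weight of 0. The helper conley_uniform_sketch is hypothetical. It applies no small sample correction and uses an assumed flat distance approximation, so its output is not expected to match vcov_conley exactly.

```r
# Hypothetical helper, for illustration only (not part of fixest).
# Uniform-kernel spatial VCOV for OLS, without small sample correction.
conley_uniform_sketch = function(X, resid, lat, lon, cutoff_km){
  rad = pi / 180

  # Pairwise distances in km.
  # ASSUMPTION: flat (equirectangular) approximation, for illustration.
  dlat = outer(lat, lat, "-") * rad
  dlon = outer(lon, lon, "-") * rad * cos(outer(lat, lat, "+") / 2 * rad)
  D = 6371 * sqrt(dlat^2 + dlon^2)

  # Uniform kernel: weight 1 within the cutoff, 0 beyond
  K = (D <= cutoff_km) * 1

  # Row i of `scores` is x_i * e_i
  scores = X * as.numeric(resid)

  # Meat: sum over pairs (i, j) of K_ij * (x_i e_i) (x_j e_j)'
  meat = t(scores) %*% K %*% scores
  bread = solve(crossprod(X))

  bread %*% meat %*% bread
}

data(quakes)
fit = lm(depth ~ mag, quakes)
V_sketch = conley_uniform_sketch(model.matrix(fit), resid(fit),
                                 quakes$lat, quakes$long, cutoff_km = 100)
```

Because the uniform kernel does not down-weight distant pairs, the resulting matrix is not guaranteed to be positive semi-definite in finite samples, which is the situation the vcov_fix argument addresses.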

If the cutoff is not provided, it is estimated with a rule of thumb. This cutoff ensures that a minimum number of units lie within it and is robust to sub-sampling. The automatic cutoff is only there for convenience: the most appropriate cutoff depends on the application and should be provided by the user.

The function conley does not compute VCOVs directly but is meant to be used in the argument vcov of fixest functions (e.g. in vcov.fixest or even in the estimation calls).

If the cutoff is missing, a rule of thumb is used to deduce a sensible cutoff. The algorithm is as follows:

  • all observations are sorted according to their latitude and their longitude (latitude major)

  • for each observation we take the minimum distance across the three units with the closest latitude

  • we do the same when sorting this time by longitude first and latitude second (longitude major)

  • the cutoff is the sum of the medians of these two distances (lat. major and lon. major)

This cutoff is provided only for convenience but should be an appropriate first guess. With this cutoff, about 50% of units should have at least around 8 neighbors.
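
The steps above can be sketched in base R as follows. This is one possible reading of the algorithm, not fixest's actual code: the helper cutoff_rule_of_thumb_sketch is hypothetical, "the three units with the closest latitude" is interpreted as the three units that follow each observation in the sorted order, and distances use an assumed flat approximation.

```r
# Hypothetical helper, for illustration only (not fixest's actual code).
cutoff_rule_of_thumb_sketch = function(lat, lon, n_neighbors = 3){
  rad = pi / 180

  # Distance in km between observations i and j (vectorized).
  # ASSUMPTION: flat (equirectangular) approximation, for illustration.
  dist_km = function(i, j){
    dlat = (lat[i] - lat[j]) * rad
    dlon = (lon[i] - lon[j]) * rad * cos((lat[i] + lat[j]) / 2 * rad)
    6371 * sqrt(dlat^2 + dlon^2)
  }

  # For a given ordering: for each observation, the minimum distance to
  # the next `n_neighbors` units in that ordering; then the median.
  # ASSUMPTION: "closest" units are taken to be the following ones.
  median_min_dist = function(ord){
    n = length(ord)
    min_dist = rep(Inf, n - 1)
    for(k in seq_len(n_neighbors)){
      if(k >= n) break
      idx = seq_len(n - k)
      d = dist_km(ord[idx], ord[idx + k])
      min_dist[idx] = pmin(min_dist[idx], d)
    }
    median(min_dist)
  }

  ord_lat = order(lat, lon)  # latitude major
  ord_lon = order(lon, lat)  # longitude major

  # Sum of the two medians
  median_min_dist(ord_lat) + median_min_dist(ord_lon)
}

data(quakes)
cutoff_guess = cutoff_rule_of_thumb_sketch(quakes$lat, quakes$long)
```

The value obtained this way is only a rough counterpart to the automatic cutoff and may differ from the one fixest reports.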

References

Conley TG (1999). "GMM Estimation with Cross Sectional Dependence", Journal of Econometrics, 92, 1-45.

Examples


library(fixest)
data(quakes)

# We use conley() in the vcov argument of the estimation
feols(depth ~ mag, quakes, conley(100))
#> OLS estimation, Dep. Var.: depth
#> Observations: 1,000
#> Standard-errors: Conley (100km) 
#>             Estimate Std. Error  t value   Pr(>|t|)    
#> (Intercept)  881.625   108.9005  8.09569 1.6480e-15 ***
#> mag         -123.421    19.2323 -6.41737 2.1389e-10 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 209.6   Adj. R2: 0.052245

# Post estimation
est = feols(depth ~ mag, quakes)
vcov_conley(est, cutoff = 100)
#>             (Intercept)        mag
#> (Intercept)   11859.324 -1955.8155
#> mag           -1955.816   369.8824