Compute local univariate join count — local_jc_uni

The univariate local join count statistic identifies clusters of a rarely occurring binary event. The value of interest should occur in fewer than half of the observations.

Usage

local_jc_uni(
  fx,
  chosen,
  nb,
  wt = st_weights(nb, style = "B"),
  nsim = 499,
  alternative = "two.sided",
  iseed = NULL
)

Arguments

fx: a binary variable, either numeric or logical.
chosen: a scalar character containing the level of fx that should be considered the observed value (1).
nb: a neighbors list object.
wt: default st_weights(nb, style = "B"). A binary weights list as created by st_weights(nb, style = "B").
nsim: the number of conditional permutation simulations
alternative: default "two.sided". One of "two.sided", "less", or "greater".
iseed: default NULL, used to set the seed for possible parallel RNGs

Value

a data.frame with two columns, join_count and p_sim, and a number of rows equal to the length of arguments fx, nb, and wt. In the example output below, these columns appear as BB and `Pr(z != E(BBi))`.

Details

The local join count statistic requires a binary weights list, which can be generated with st_weights(nb, style = "B"). Additionally, ensure that the binary variable of interest is rare, occurring in fewer than half of the observations.
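
To make the role of the binary weights concrete, here is a minimal base R sketch of the statistic itself, assuming the usual definition BB_i = x_i * sum_j(w_ij * x_j) with w_ij = 1 for neighbors. The toy vector x and the neighbor list nb are hypothetical illustration data, and this is not the code path used by local_jc_uni().

```r
# Hypothetical toy data: 7 observations along a line, each one
# neighbouring the observations immediately before and after it.
x  <- c(1, 1, 0, 0, 0, 1, 0)   # 1 = event of interest (3 of 7, so under half)
nb <- list(2L, c(1L, 3L), c(2L, 4L), c(3L, 5L), c(4L, 6L), c(5L, 7L), 6L)

# With binary weights, the local join count at i is the number of
# neighbours that also equal 1, and it is zero whenever x[i] is 0.
local_bb <- vapply(
  seq_along(x),
  function(i) x[i] * sum(x[nb[[i]]]),
  numeric(1)
)
local_bb
```

Only observations where the event occurs and at least one neighbor shares it receive a non-zero count.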

P-values are estimated using a conditional permutation approach. This creates a reference distribution against which the observed statistic is compared. For more, see the GeoDa Glossary. Calls spdep::local_joincount_uni().
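
The conditional permutation idea can be sketched in base R as follows. This is an illustration under stated assumptions, not the implementation in spdep::local_joincount_uni(): the helper perm_p() and the toy data are hypothetical, the test shown is one-sided ("greater"), and the pseudo p-value is taken as (count + 1) / (nsim + 1).

```r
set.seed(1)  # for reproducibility of this sketch only

# Hypothetical toy data (same layout as a line of 7 observations).
x  <- c(1, 1, 0, 0, 0, 1, 0)
nb <- list(2L, c(1L, 3L), c(2L, 4L), c(3L, 5L), c(4L, 6L), c(5L, 7L), 6L)

# Hypothetical helper: pseudo p-value for observation i.
perm_p <- function(i, x, nb, nsim = 499) {
  # Observations without the event get no p-value, mirroring the NA
  # entries shown for FALSE rows in the example output.
  if (x[i] == 0) return(NA_real_)
  k      <- length(nb[[i]])            # number of neighbours of i
  obs    <- x[i] * sum(x[nb[[i]]])     # observed local join count
  others <- x[-i]                      # x[i] is held fixed ("conditional")
  # Reference distribution: draw k values from the remaining observations
  # without replacement and recompute the statistic each time.
  sims <- replicate(nsim, x[i] * sum(sample(others, k)))
  (sum(sims >= obs) + 1) / (nsim + 1)
}

p2 <- perm_p(2, x, nb)
p2
```

Holding x[i] fixed while shuffling only the remaining values is what makes the permutation conditional on location i.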

Examples


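# The example uses both dplyr and tidyr, so check for each before running
if (requireNamespace("dplyr", quietly = TRUE) &&
    requireNamespace("tidyr", quietly = TRUE)) {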

res <- dplyr::transmute(
  guerry,
  top_crime = as.factor(crime_prop > 9000),
  nb = st_contiguity(geometry),
  wt = st_weights(nb, style = "B"),
  jc = local_jc_uni(top_crime, "TRUE", nb, wt))
tidyr::unnest(res, jc)

}
#> Simple feature collection with 85 features and 5 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 47680 ymin: 1703258 xmax: 1031401 ymax: 2677441
#> CRS:           NA
#> # A tibble: 85 × 6
#>    top_crime nb        wt        BB `Pr(z != E(BBi))`                   geometry
#>    <fct>     <nb>      <list> <dbl>             <dbl>             <MULTIPOLYGON>
#>  1 TRUE      <int [4]> <dbl>      1             0.924 (((801150 2092615, 800669…
#>  2 FALSE     <int [6]> <dbl>      0            NA     (((729326 2521619, 729320…
#>  3 FALSE     <int [6]> <dbl>      0            NA     (((710830 2137350, 711746…
#>  4 FALSE     <int [4]> <dbl>      0            NA     (((882701 1920024, 882408…
#>  5 FALSE     <int [3]> <dbl>      0            NA     (((886504 1922890, 885733…
#>  6 TRUE      <int [7]> <dbl>      2             0.944 (((747008 1925789, 746630…
#>  7 FALSE     <int [3]> <dbl>      0            NA     (((818893 2514767, 818614…
#>  8 TRUE      <int [3]> <dbl>      1             0.876 (((509103 1747787, 508820…
#>  9 FALSE     <int [5]> <dbl>      0            NA     (((775400 2345600, 775068…
#> 10 TRUE      <int [5]> <dbl>      2             0.608 (((626230 1810121, 626269…
#> # ℹ 75 more rows