The univariate local join count statistic identifies clusters of a rarely occurring binary outcome. The outcome of interest should occur in fewer than half of the observations.
Usage
local_jc_uni(
  fx,
  chosen,
  nb,
  wt = st_weights(nb, style = "B"),
  nsim = 499,
  alternative = "two.sided",
  iseed = NULL
)
Arguments
- fx
a binary variable, either numeric or logical.
- chosen
a scalar character containing the level of fx that should be considered the observed value (1).
- nb
a neighbors list object.
- wt
default st_weights(nb, style = "B"). A binary weights list as created by st_weights(nb, style = "B").
- nsim
default 499. The number of conditional permutation simulations.
- alternative
default "two.sided". One of "two.sided", "less", or "greater".
- iseed
default NULL, used to set the seed for possible parallel RNGs.
Value
a data.frame with two columns, join_count and p_sim, and number of rows equal to the length of arguments fx, nb, and wt.
Details
The local join count statistic requires a binary weights list, which can be generated with st_weights(nb, style = "B"). Additionally, ensure that the binary variable of interest is rare, occurring in no more than half of observations.
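As a minimal base R sketch of what these two requirements mean in practice, the snippet below checks how often the level of interest occurs and computes a local join count by hand. The toy vector x, the hand-written neighbor list toy_nb, and the helper manual_jc() are hypothetical names introduced for illustration and are not part of sfdep. The calculation assumes the usual definition of the statistic: the value at location i multiplied by the binary-weighted sum of its neighbors' values.

```r
# Toy data: 1 marks the level of interest, 0 everything else (hypothetical values).
x <- c(1, 0, 0, 1, 1, 0, 0, 0)

# The level of interest should occur in fewer than half of the observations.
prevalence <- mean(x == 1)
stopifnot(prevalence < 0.5)

# Hand-written neighbor list: element i holds the indices of i's neighbors.
toy_nb <- list(
  c(2L, 4L), c(1L, 3L), c(2L, 4L), c(1L, 3L, 5L),
  c(4L, 6L), c(5L, 7L), c(6L, 8L), 7L
)

# Hypothetical helper: with binary weights every neighbor counts as 1, so the
# local join count is x[i] multiplied by the number of neighbors that are also 1.
manual_jc <- function(x, nb) {
  vapply(seq_along(x), function(i) x[i] * sum(x[nb[[i]]]), numeric(1))
}

jc <- manual_jc(x, toy_nb)
jc
```

Locations where the value is 0 always get a join count of 0, which is why only the observations at the chosen level carry information about clustering.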
P-values are estimated using a conditional permutation approach. This creates a reference distribution against which the observed statistic is compared. For more, see the GeoDa Glossary.
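The conditional permutation idea can be sketched in base R as follows. This is a simplified illustration, not the code that sfdep or spdep runs: the helper name cond_perm_p(), the toy inputs, and the one-sided pseudo p-value formula (r + 1) / (nsim + 1) are assumptions made for the example. For a single location i, the value at i is held fixed, the remaining values are shuffled, and the statistic is recomputed from as many randomly drawn values as i has neighbors.

```r
# Hypothetical helper: pseudo p-value for the local join count at location i.
cond_perm_p <- function(x, nb, i, nsim = 499, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  observed <- x[i] * sum(x[nb[[i]]])
  others <- x[-i]            # hold x[i] fixed, permute the remaining values
  k <- length(nb[[i]])       # number of neighbors of location i
  simulated <- replicate(nsim, {
    # draw k values without replacement to stand in for i's neighbors
    x[i] * sum(others[sample.int(length(others), k)])
  })
  # share of simulated statistics at least as large as the observed one
  p <- (sum(simulated >= observed) + 1) / (nsim + 1)
  list(observed = observed, simulated = simulated, p_sim = p)
}

# Toy inputs (hypothetical): a binary vector and a hand-written neighbor list.
x <- c(1, 0, 0, 1, 1, 0, 0, 0)
toy_nb <- list(
  c(2L, 4L), c(1L, 3L), c(2L, 4L), c(1L, 3L, 5L),
  c(4L, 6L), c(5L, 7L), c(6L, 8L), 7L
)

res <- cond_perm_p(x, toy_nb, i = 4, nsim = 499, seed = 1)
res$observed
res$p_sim
```

The simulated values form the reference distribution; a small p_sim suggests that the observed join count at that location is unusually high relative to a random arrangement of the other values.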
Calls spdep::local_joincount_uni().
Examples
if (requireNamespace("dplyr", quietly = TRUE) &&
    requireNamespace("tidyr", quietly = TRUE)) {
  res <- dplyr::transmute(
    guerry,
    top_crime = as.factor(crime_prop > 9000),
    nb = st_contiguity(geometry),
    wt = st_weights(nb, style = "B"),
    jc = local_jc_uni(top_crime, "TRUE", nb, wt)
  )
  tidyr::unnest(res, jc)
}
#> Simple feature collection with 85 features and 5 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: 47680 ymin: 1703258 xmax: 1031401 ymax: 2677441
#> CRS: NA
#> # A tibble: 85 × 6
#> top_crime nb wt BB `Pr(z != E(BBi))` geometry
#> <fct> <nb> <list> <dbl> <dbl> <MULTIPOLYGON>
#> 1 TRUE <int [4]> <dbl> 1 0.924 (((801150 2092615, 800669…
#> 2 FALSE <int [6]> <dbl> 0 NA (((729326 2521619, 729320…
#> 3 FALSE <int [6]> <dbl> 0 NA (((710830 2137350, 711746…
#> 4 FALSE <int [4]> <dbl> 0 NA (((882701 1920024, 882408…
#> 5 FALSE <int [3]> <dbl> 0 NA (((886504 1922890, 885733…
#> 6 TRUE <int [7]> <dbl> 2 0.944 (((747008 1925789, 746630…
#> 7 FALSE <int [3]> <dbl> 0 NA (((818893 2514767, 818614…
#> 8 TRUE <int [3]> <dbl> 1 0.876 (((509103 1747787, 508820…
#> 9 FALSE <int [5]> <dbl> 0 NA (((775400 2345600, 775068…
#> 10 TRUE <int [5]> <dbl> 2 0.608 (((626230 1810121, 626269…
#> # ℹ 75 more rows