Skip to contents

While not too far from the initial CRAN release of sfdep, this version consists of two sizable additions relating to spatio-temporal analysis and colocation analysis as well as bug fixes and improvements. Additional functionality is added regarding point-pattern analysis and neighbor / weight list utilities. Given the breadth of the additions, this represents a minor version release.

Colocation Analysis

This release includes functionality to compute colocation quotients. This is extremely exciting as this is the first open source implementation of Colocation Quotients that I am aware of. There are three types of colocation quotient (CLQ) in this release: the global CLQ, pairwise CLQ, and local CLQ.

See the colocation vignette for a more detailed write up.

There are three new functions for calculating CLQs:

Spacetime

And probably most notable, is the introduction of a new spacetime class. This class was created as a byproduct of creating functionality for emerging hot spot analysis. For a more detailed write up see the spacetime vignette.

The new functions that are available are:

df_fp <- system.file("extdata", "bos-ecometric.csv", package = "sfdep")
geo_fp <- system.file("extdata", "bos-ecometric.geojson", package = "sfdep")

# read in data
df <- readr::read_csv(df_fp, col_types = "ccidD")
geo <- sf::read_sf(geo_fp)

# create spacetime object
bos <- spacetime(df, geo, ".region_id", "year")

bos
#> spacetime ────
#> Context:`data`
#> 168 locations `.region_id`
#> 10 time periods `year`
#> ── data context ────────────────────────────────────────────────────────────────
#> # A tibble: 1,680 × 5
#>    .region_id  ecometric  year value time_period
#>  * <chr>       <chr>     <int> <dbl> <date>     
#>  1 25025010405 Guns       2010  0.35 2010-01-01 
#>  2 25025010405 Guns       2011  0.89 2011-01-01 
#>  3 25025010405 Guns       2012  1.2  2012-01-01 
#>  4 25025010405 Guns       2013  0.84 2013-01-01 
#>  5 25025010405 Guns       2014  1.5  2014-01-01 
#>  6 25025010405 Guns       2015  1.15 2015-01-01 
#>  7 25025010405 Guns       2016  1.48 2016-01-01 
#>  8 25025010405 Guns       2017  1.64 2017-01-01 
#>  9 25025010405 Guns       2018  0.49 2018-01-01 
#> 10 25025010405 Guns       2019  0.17 2019-01-01 
#> # … with 1,670 more rows

Spacetime objects have two contexts: data and geometry. Currently the data is activated. We can activate the geometry context like so:

activate(bos, "geometry") 
#> spacetime ────
#> Context:`geometry`
#> 168 locations `.region_id`
#> 10 time periods `year`
#> ── geometry context ────────────────────────────────────────────────────────────
#> Simple feature collection with 168 features and 1 field
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -71.19115 ymin: 42.22788 xmax: -70.99445 ymax: 42.3974
#> Geodetic CRS:  NAD83
#> # A tibble: 168 × 2
#>    .region_id                                                           geometry
#>  * <chr>                                                           <POLYGON [°]>
#>  1 25025010405 ((-71.09009 42.34666, -71.09001 42.34668, -71.08941 42.34685, -7…
#>  2 25025010404 ((-71.09066 42.33977, -71.09103 42.33961, -71.09177 42.33989, -7…
#>  3 25025010801 ((-71.08159 42.3537, -71.08044 42.35402, -71.07995 42.35415, -71…
#>  4 25025010702 ((-71.07066 42.35185, -71.07045 42.35142, -71.07282 42.35075, -7…
#>  5 25025010204 ((-71.10683 42.34875, -71.1052 42.34844, -71.10468 42.34834, -71…
#>  6 25025010802 ((-71.08159 42.3537, -71.08153 42.35358, -71.08145 42.3534, -71.…
#>  7 25025010104 ((-71.08784 42.34746, -71.08805 42.34746, -71.0883 42.34747, -71…
#>  8 25025000703 ((-71.12622 42.35041, -71.12685 42.35009, -71.12748 42.35, -71.1…
#>  9 25025000504 ((-71.14175 42.3404, -71.14194 42.34001, -71.14234 42.34005, -71…
#> 10 25025000704 ((-71.13551 42.34878, -71.13572 42.34904, -71.1358 42.34917, -71…
#> # … with 158 more rows

This is handy because we can find neighbors in our geometry and link them to our data.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

bos_nb <- bos |> 
  activate("geometry") |> 
  mutate(nb = st_contiguity(geometry),
         wt = st_weights(nb))

bos_nb
#> spacetime ────
#> Context:`geometry`
#> 168 locations `.region_id`
#> 10 time periods `year`
#> ── geometry context ────────────────────────────────────────────────────────────
#> Simple feature collection with 168 features and 3 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -71.19115 ymin: 42.22788 xmax: -70.99445 ymax: 42.3974
#> Geodetic CRS:  NAD83
#> # A tibble: 168 × 4
#>    .region_id                                            geometry nb      wt    
#>  * <chr>                                            <POLYGON [°]> <nb>    <list>
#>  1 25025010405 ((-71.09009 42.34666, -71.09001 42.34668, -71.089… <int [<dbl 
#>  2 25025010404 ((-71.09066 42.33977, -71.09103 42.33961, -71.091… <int [<dbl 
#>  3 25025010801 ((-71.08159 42.3537, -71.08044 42.35402, -71.0799… <int [<dbl 
#>  4 25025010702 ((-71.07066 42.35185, -71.07045 42.35142, -71.072… <int [<dbl 
#>  5 25025010204 ((-71.10683 42.34875, -71.1052 42.34844, -71.1046… <int [<dbl 
#>  6 25025010802 ((-71.08159 42.3537, -71.08153 42.35358, -71.0814… <int [<dbl 
#>  7 25025010104 ((-71.08784 42.34746, -71.08805 42.34746, -71.088… <int [<dbl 
#>  8 25025000703 ((-71.12622 42.35041, -71.12685 42.35009, -71.127… <int [<dbl 
#>  9 25025000504 ((-71.14175 42.3404, -71.14194 42.34001, -71.1423… <int [<dbl 
#> 10 25025000704 ((-71.13551 42.34878, -71.13572 42.34904, -71.135… <int [<dbl 
#> # … with 158 more rows

These can be brought over to our data context for further use.

bos_nb <- bos_nb |> 
  set_wts() |> 
  set_nbs()

bos_nb
#> spacetime ────
#> Context:`data`
#> 168 locations `.region_id`
#> 10 time periods `year`
#> ── data context ────────────────────────────────────────────────────────────────
#> # A tibble: 1,680 × 7
#>    .region_id  ecometric  year value time_period wt        nb       
#>    <chr>       <chr>     <int> <dbl> <date>      <list>    <list>   
#>  1 25025010405 Guns       2010  0.35 2010-01-01  <dbl [8]> <int [8]>
#>  2 25025010404 Guns       2010  0    2010-01-01  <dbl [3]> <int [3]>
#>  3 25025010801 Guns       2010  0    2010-01-01  <dbl [4]> <int [4]>
#>  4 25025010702 Guns       2010  0.46 2010-01-01  <dbl [5]> <int [5]>
#>  5 25025010204 Guns       2010  0    2010-01-01  <dbl [3]> <int [3]>
#>  6 25025010802 Guns       2010  0    2010-01-01  <dbl [4]> <int [4]>
#>  7 25025010104 Guns       2010  0    2010-01-01  <dbl [5]> <int [5]>
#>  8 25025000703 Guns       2010  0    2010-01-01  <dbl [3]> <int [3]>
#>  9 25025000504 Guns       2010  0.22 2010-01-01  <dbl [5]> <int [5]>
#> 10 25025000704 Guns       2010  0    2010-01-01  <dbl [4]> <int [4]>
#> # … with 1,670 more rows

Then we can group our dataset and calculate metrics based on different years. But this is only possible for data that are spacetime cubes. Read the spacetime vignette for more. We can check if this object meets the conditions to be a spacetime cube.

is_spacetime_cube(bos)
#> [1] TRUE

Since this is a spacetime cube, it is safe to perform statistics on each time slice.

bos_gs <- bos_nb |> 
  activate("data") |> 
  filter(ecometric == "Guns") |> 
  group_by(year) |> 
  mutate(g = local_g(value, nb, wt))

bos_gs
#> spacetime ────
#> Context:`data`
#> 168 locations `.region_id`
#> 10 time periods `year`
#> ── data context ────────────────────────────────────────────────────────────────
#> # A tibble: 1,680 × 8
#> # Groups:   year [10]
#>    .region_id  ecometric  year value time_period wt        nb        g         
#>  * <chr>       <chr>     <int> <dbl> <date>      <list>    <list>    <localG>  
#>  1 25025010405 Guns       2010  0.35 2010-01-01  <dbl [8]> <int [8]> -0.9870615
#>  2 25025010404 Guns       2010  0    2010-01-01  <dbl [3]> <int [3]> -0.9296040
#>  3 25025010801 Guns       2010  0    2010-01-01  <dbl [4]> <int [4]> -1.0821619
#>  4 25025010702 Guns       2010  0.46 2010-01-01  <dbl [5]> <int [5]> -1.0365653
#>  5 25025010204 Guns       2010  0    2010-01-01  <dbl [3]> <int [3]> -0.8636070
#>  6 25025010802 Guns       2010  0    2010-01-01  <dbl [4]> <int [4]> -1.0821619
#>  7 25025010104 Guns       2010  0    2010-01-01  <dbl [5]> <int [5]> -1.3275156
#>  8 25025000703 Guns       2010  0    2010-01-01  <dbl [3]> <int [3]> -1.0238855
#>  9 25025000504 Guns       2010  0.22 2010-01-01  <dbl [5]> <int [5]> -1.3431602
#> 10 25025000704 Guns       2010  0    2010-01-01  <dbl [4]> <int [4]> -1.2541524
#> # … with 1,670 more rows

Now that the measure has been calculated for each timeslice, we can connect the geometries for each time slice for the purpose of visualization.

Note that the spacetime class’ objective is to avoid geometry duplication. But it is necessary for visualization with ggplot2.

We can cast to an sf object with as_sf().

library(ggplot2)

# cast to sf object. Uses merge under the hood
# so if duplicate columns exist, we can specify suffix names
as_sf(bos_gs, suffixes = c("_geo", "_data")) |>
  ggplot(aes(fill = g)) +
  geom_sf(color = NA) +
  facet_wrap("year", nrow = 2) +
  scale_fill_gradient2(
    high = scales::muted("red"),
    low = scales::muted("blue")
    ) + 
  labs(title = "Gun ecometric hotspots",
       subtitle = "2010 - 2019") + 
  theme_void() +
  theme(legend.position = "bottom")

Note that the holes are due to missing data in the original dataset.

Point Pattern analysis

There were a few additional functions added which pertain to point pattern analysis. Namely regarding finding centers and creating ellipses based on the centers. These functions supplement gaps in other spatial analysis libraries such as spatstat.

library(sf)
#> Linking to GEOS 3.9.1, GDAL 3.2.3, PROJ 7.2.1; sf_use_s2() is TRUE
library(sfdep)

pnts <- st_sample(guerry, 250)

We can calculate the euclidean median center and plot it over all the points.

cent <- euclidean_median(pnts)

plot(pnts)
plot(cent, col = "red", pch = 17, add = TRUE)

Additionally, we can identify the standard deviational ellipse for our point set.

ellip_cent <- std_dev_ellipse(pnts)

ellip <- st_ellipse(ellip_cent, 
           ellip_cent$sx, 
           ellip_cent$sy, 
           ellip_cent$theta) 

plot(ellip)
plot(pnts, add = TRUE)

General