Getting B.C. Open Data with R

Outline

  • Use an area of interest (AOI) to define study area
  • Explore the B.C. Data Catalogue
  • Use the bcdata package to query and download data

B.C. Data Catalogue

The B.C. Government makes a huge amount of open data available, both spatial and non-spatial, and documents it in the B.C. Data Catalogue.

The bcdata package allows you to interact with the catalogue, and download data, directly from within R.

  • bcdc_browse()
    • Open the catalogue in your default browser
  • bcdc_search()
    • Search records in the catalogue
  • bcdc_get_record()
    • Print a catalogue record
  • bcdc_get_data()
    • Get catalogue data
  • bcdc_describe_feature()
    • Describe the structure (columns and types) of a data set
  • bcdc_query_geodata()
    • Get & query catalogue geospatial data available through a Web Service ]

We are going to use bcdata to get some data for our telemetry analysis:

  1. BEC
  2. VRI
  3. Forest Cutblocks
  4. Lakes and wetlands
  5. Streams
  6. Roads

Create an Area of Interest (AOI)

We’ll read in the caribou data created in the previous module, and create an area of interest (a bounding box) around it, so we can use that to spatially subset the covariate data we will be using.

scott <- read_sf("clean_data/scott_herd_subset.gpkg")

scott_bbox <- st_bbox(scott)

## Round to nearest 100m outside the box to align with raster grids
scott_bbox["xmin"] <- floor(scott_bbox["xmin"] / 100) * 100
scott_bbox["ymin"] <- floor(scott_bbox["ymin"] / 100) * 100
scott_bbox["xmax"] <- ceiling(scott_bbox["xmax"] / 100) * 100
scott_bbox["ymax"] <- ceiling(scott_bbox["ymax"] / 100) * 100

scott_aoi <- st_as_sfc(scott_bbox)
write_sf(scott_aoi, "clean_data/scott_aoi.gpkg")

First, let’s open the B.C. Data Catalogue in our browser:

bcdc_browse()

If you search for “BEC”, one of the first hits should be the BEC Map record. If you click on the “Share” ( ) button you will get a url: https://catalogue.data.gov.bc.ca/dataset/f358a53b-ffde-4830-a325-a5a03ff672c3. The last bit of the url (f358a53b-ffde-4830-a325-a5a03ff672c3) is the unique identifier for the record, and we can use that ID to query the dataset with bcdata.

Let’s find out about the record:

bcdc_get_record("f358a53b-ffde-4830-a325-a5a03ff672c3")
B.C. Data Catalogue Record: BEC Map
Name: bec-map (ID: f358a53b-ffde-4830-a325-a5a03ff672c3)
Permalink:
 https://catalogue.data.gov.bc.ca/dataset/f358a53b-ffde-4830-a325-a5a03ff672c3
Licence: Open Government Licence - British Columbia
Description: The current and most detailed version of the approved
 corporate provincial digital Biogeoclimatic Ecosystem Classification
 (BEC) Zone/Subzone/Variant/Phase map (version 12, September 2, 2021).
 Use this version when performing GIS analysis regardless of scale.
 This mapping is deliberately extended across the ocean, lakes,
 glaciers, etc to facilitate intersection with a terrestrial landcover
 layer of your choice
Available Resources (1):
 1. WMS getCapabilities request (wms)
Access the full 'Resources' data frame using:
 bcdc_tidy_resources('f358a53b-ffde-4830-a325-a5a03ff672c3')
Query and filter this data using:
 bcdc_query_geodata('f358a53b-ffde-4830-a325-a5a03ff672c3')

BEC

bcdc_query_geodata() by itself does not download the data - it retrieves the first few records and shows them to us, as well as some helpful information about the data:

bcdc_query_geodata("f358a53b-ffde-4830-a325-a5a03ff672c3")
Querying 'bec-map' record
• Using collect() on this object will return 15666 features and 20 fields
• Accessing this record requires pagination and will make 16 separate
• requests to the WFS. See ?bcdc_options
• At most six rows of the record are printed here
────────────────────────────────────────────────────────────────────────────────
Simple feature collection with 6 features and 20 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 575312.5 ymin: 1553588 xmax: 780612.5 ymax: 1668388
Projected CRS: NAD83 / BC Albers
# A tibble: 6 × 21
  id          FEATURE_CLASS_SKEY ZONE  SUBZONE VARIANT PHASE NATURAL_DISTURBANCE
  <chr>                    <int> <chr> <chr>   <chr>   <chr> <chr>              
1 WHSE_FORES…                435 SWB   uns     <NA>    <NA>  NDT2               
2 WHSE_FORES…                435 SWB   uns     <NA>    <NA>  NDT2               
3 WHSE_FORES…                435 SWB   uns     <NA>    <NA>  NDT2               
4 WHSE_FORES…                435 SWB   uns     <NA>    <NA>  NDT2               
5 WHSE_FORES…                435 SWB   uns     <NA>    <NA>  NDT2               
6 WHSE_FORES…                435 SWB   uns     <NA>    <NA>  NDT2               
# ℹ 14 more variables: MAP_LABEL <chr>, BGC_LABEL <chr>, ZONE_NAME <chr>,
#   SUBZONE_NAME <chr>, VARIANT_NAME <chr>, PHASE_NAME <chr>,
#   NATURAL_DISTURBANCE_NAME <chr>, FEATURE_AREA_SQM <int>,
#   FEATURE_LENGTH_M <int>, FEATURE_AREA <int>, FEATURE_LENGTH <int>,
#   OBJECTID <int>, SE_ANNO_CAD_DATA <chr>, geometry <POLYGON [m]>

You can use dplyr verbs filter() and select() to cut down the amount of data you need to download from the web service. filter() can take logical predicates such as ==, >, %in% etc., as well as geometry predicates such as INTERSECTS(), OVERLAPS(), WITHIN() etc. See this vignette for details.

We can run the query without assigning it to anything, to see what effect the filter and select statements will have on what we are downloading:

bcdc_query_geodata("f358a53b-ffde-4830-a325-a5a03ff672c3") |>
    filter(INTERSECTS(scott_aoi)) |> 
    select(MAP_LABEL)
Querying 'bec-map' record
• Using collect() on this object will return 67 features and 13 fields
• At most six rows of the record are printed here
────────────────────────────────────────────────────────────────────────────────
Simple feature collection with 6 features and 13 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 1120500 ymin: 1203663 xmax: 1209725 ymax: 1265670
Projected CRS: NAD83 / BC Albers
# A tibble: 6 × 14
  id    FEATURE_CLASS_SKEY ZONE  SUBZONE NATURAL_DISTURBANCE MAP_LABEL BGC_LABEL
  <chr>              <int> <chr> <chr>   <chr>               <chr>     <chr>    
1 WHSE…                435 ESSF  wcp     NDT5                ESSFwcp   ESSFwcp  
2 WHSE…                435 ESSF  wk      NDT1                ESSFwk2   ESSFwk 2 
3 WHSE…                435 ESSF  wc      NDT1                ESSFwc3   ESSFwc 3 
4 WHSE…                435 ESSF  mv      NDT2                ESSFmv2   ESSFmv 2 
5 WHSE…                435 ESSF  wcp     NDT5                ESSFwcp   ESSFwcp  
6 WHSE…                435 ESSF  wcp     NDT5                ESSFwcp   ESSFwcp  
# ℹ 7 more variables: ZONE_NAME <chr>, SUBZONE_NAME <chr>,
#   NATURAL_DISTURBANCE_NAME <chr>, FEATURE_AREA <int>, FEATURE_LENGTH <int>,
#   OBJECTID <int>, geometry <POLYGON [m]>

Note that now we will be only downloading 64 features (rows) instead of the ~ 15000 in the full record.

Once your query is complete you call collect() to tell the server to execute the query and send you the data:

bec <- bcdc_query_geodata("f358a53b-ffde-4830-a325-a5a03ff672c3") |>
    filter(INTERSECTS(scott_aoi)) |> 
    select(MAP_LABEL) |> 
    collect()

mapview(bec, zcol = "MAP_LABEL") + mapview(st_as_sf(scott_aoi))

You can see that filtering using the INTERSECTS() function does a good job of only downloading the features that intersect the AOI, but it doesn’t actually clip them to the AOI. We can do that with sf::st_intersection(). Also, there are some columns that are “sticky” - even if you don’t select them in a select() statement before you call collect, they come along anyway. We can do a final select() once the data is downloaded:

bec <- st_intersection(bec, scott_aoi) |> 
  select(MAP_LABEL)

mapview(bec)

And now we can write our BEC data to a file to use later in the analysis.

write_sf(bec, "clean_data/bec.gpkg")

VRI

See a list of VRI codes.

vri <- bcdc_query_geodata("2ebb35d8-c82f-4a17-9c96-612ac3532d55") |> 
  filter(INTERSECTS(scott_aoi)) |> 
  select(PROJ_AGE_CLASS_CD_1, BCLCS_LEVEL_4, CROWN_CLOSURE_CLASS_CD) |>  
  collect() |> 
  st_intersection(scott_aoi)

We want to split this into separate files for different variables:

  • Coniferous-leading stands
  • Stands with age greater than 40 yrs (age class >=3)
  • Crown closure
# Tree coniferous leading - select coniferous leading vri plots
vri_conif <- vri |>  
    mutate(conif = BCLCS_LEVEL_4) |> 
    filter(conif == "TC") |> 
    select(conif)
write_sf(vri_conif, "clean_data/vri_conif.gpkg")
# Age class greater than 40 years
vri_ageclass <- vri |> 
    mutate(age_class = as.numeric(PROJ_AGE_CLASS_CD_1)) |> 
    filter(age_class >= 3) |> 
    select(age_class)
write_sf(vri_ageclass, "clean_data/vri_ageclass.gpkg")
# Crown closure class 
vri_cc <- vri |> 
    mutate(cc_class = as.numeric(CROWN_CLOSURE_CLASS_CD)) |> 
    select(cc_class)
write_sf(vri_cc, "clean_data/vri_cc.gpkg")

Cutblocks

To get the cutblocks, we filter to our AOI, and also choose those blocks that have a harvest year in the last 30 years.

We can get information about the columns in a given data set with bcdc_describe_feature():

bcdc_describe_feature("b1b647a6-f271-42e0-9cd0-89ec24bce9f7")
# A tibble: 13 × 5
   col_name                sticky remote_col_type local_col_type column_comments
   <chr>                   <lgl>  <chr>           <chr>          <chr>          
 1 id                      TRUE   xsd:string      character      <NA>           
 2 VEG_CONSOLIDATED_CUT_B… TRUE   xsd:decimal     numeric        VEG_CONSOLIDAT…
 3 OPENING_ID              FALSE  xsd:decimal     numeric        OPENING_ID is …
 4 HARVEST_YEAR            FALSE  xsd:decimal     numeric        HARVEST_YEAR i…
 5 DISTURBANCE_START_DATE  FALSE  xsd:date        date           DISTURBANCE_ST…
 6 DISTURBANCE_END_DATE    FALSE  xsd:date        date           DISTURBANCE_EN…
 7 DATA_SOURCE             FALSE  xsd:string      character      DATA_SOURCE is…
 8 AREA_HA                 FALSE  xsd:decimal     numeric        AREA_HA is the…
 9 FEATURE_AREA_SQM        FALSE  xsd:decimal     numeric        FEATURE_AREA_S…
10 FEATURE_LENGTH_M        FALSE  xsd:decimal     numeric        FEATURE_LENGTH…
11 SHAPE                   FALSE  gml:GeometryPr… sfc geometry   SHAPE is the c…
12 OBJECTID                TRUE   xsd:decimal     numeric        OBJECTID is a …
13 SE_ANNO_CAD_DATA        FALSE  xsd:hexBinary   numeric        SE_ANNO_CAD_DA…
cutblocks <- bcdc_query_geodata("b1b647a6-f271-42e0-9cd0-89ec24bce9f7") |>
  filter(
    INTERSECTS(scott_aoi), 
    HARVEST_YEAR >= 1993) |>
  select(HARVEST_YEAR) |>
  collect() |> 
  st_intersection(scott_aoi)

mapview(cutblocks, zcol = "HARVEST_YEAR")
write_sf(cutblocks, "clean_data/cutblocks.gpkg")

Water bodies

Searching the catalogue for the BC Freshwater Atlas

We can search the catalogue for data using keywords, with bcdc_search(). Control the number of results returned with n.

bcdc_search("freshwater atlas", n = 20)
List of B.C. Data Catalogue Records
Number of records: 20
Titles:
1: Freshwater Atlas Glaciers (multiple, html, pdf, wms, kml)
 ID: 8f2aee65-9f4c-4f72-b54c-0937dbf3e6f7
 Name: freshwater-atlas-glaciers
2: Freshwater Atlas Watersheds (multiple, html, pdf, wms, kml, fgdb)
 ID: 3ee497c4-57d7-47f8-b030-2e0c03f8462a
 Name: freshwater-atlas-watersheds
3: Freshwater Atlas Obstructions (multiple, html, pdf, wms, kml)
 ID: 64797286-3ca5-4202-9064-a7f790321e9e
 Name: freshwater-atlas-obstructions
4: Freshwater Atlas Rivers (multiple, fgdb, pdf, wms, kml)
 ID: f7dac054-efbf-402f-ab62-6fc4b32a619e
 Name: freshwater-atlas-rivers
5: Freshwater Atlas Lakes (multiple, html, pdf, wms, kml)
 ID: cb1e3aba-d3fe-4de1-a2d4-b8b6650fb1f6
 Name: freshwater-atlas-lakes
6: Freshwater Atlas Coastlines (multiple, html, pdf, wms, kml)
 ID: 87b1d6a7-d4d1-4c25-a879-233becdbffed
 Name: freshwater-atlas-coastlines
7: Freshwater Atlas Wetlands (multiple, html, pdf, wms, kml)
 ID: 93b413d8-1840-4770-9629-641d74bd1cc6
 Name: freshwater-atlas-wetlands
8: Freshwater Atlas Islands (multiple, html, pdf, wms, kml)
 ID: 4483aeea-df26-4b83-a565-934c769e74de
 Name: freshwater-atlas-islands
9: Freshwater Atlas Watershed Boundaries (multiple, html, pdf, wms,
 kml, fgdb)
 ID: ab758580-809d-4e11-bb2c-df02ac5465c9
 Name: freshwater-atlas-watershed-boundaries
10: Freshwater Atlas Manmade Waterbodies (multiple, html, pdf, wms,
 kml)
 ID: 055fd71e-b771-4d47-a863-8a54f91a954c
 Name: freshwater-atlas-manmade-waterbodies
11: Freshwater Atlas Linear Boundaries (multiple, html, pdf, wms, kml,
 fgdb)
 ID: 2af1388e-d5f7-46dc-a6e2-f85415ddbd1c
 Name: freshwater-atlas-linear-boundaries
12: Freshwater Atlas Watershed Groups (multiple, html, pdf, wms, kml)
 ID: 51f20b1a-ab75-42de-809d-bf415a0f9c62
 Name: freshwater-atlas-watershed-groups
13: Freshwater Atlas Named Watersheds (multiple, html, pdf, wms, kml)
 ID: ea63ea04-eab0-4b83-8729-f8a93ac688a1
 Name: freshwater-atlas-named-watersheds
14: Freshwater Atlas Assessment Watersheds (multiple, html, pdf, wms,
 kml)
 ID: 97d8ef37-b8d2-4c3b-b772-6b25c1db13d0
 Name: freshwater-atlas-assessment-watersheds
15: Freshwater Atlas Stream Directions (multiple, html, pdf, wms, kml)
 ID: d7165359-52ef-41d0-b762-c53e3468ff3f
 Name: freshwater-atlas-stream-directions
16: Freshwater Atlas Stream Network (multiple, html, pdf, wms, kml,
 fgdb)
 ID: 92344413-8035-4c08-b996-65a9b3f62fca
 Name: freshwater-atlas-stream-network
17: Freshwater Atlas Edge Type Codes (multiple, html, pdf)
 ID: 509cbf74-7ee7-44d3-a88d-4e088ea67325
 Name: freshwater-atlas-edge-type-codes
18: Freshwater Atlas Waterbody Type Codes (multiple, html, pdf)
 ID: ade4f36a-1fd4-4583-8253-2b2a1bbe34ff
 Name: freshwater-atlas-waterbody-type-codes
19: Freshwater Atlas Watershed Type Codes (multiple, html, pdf)
 ID: f7efa3ea-bf1c-4c4f-bb33-ba841aa076c0
 Name: freshwater-atlas-watershed-type-codes
20: Freshwater Atlas Named Point Features (multiple, html, pdf, wms,
 kml)
 ID: db43a358-273c-4c2e-8a5c-cc28eaaffaa7
 Name: freshwater-atlas-named-point-features

Access a single record by calling `bcdc_get_record(ID)` with the ID
 from the desired record.
lakes <- bcdc_query_geodata("cb1e3aba-d3fe-4de1-a2d4-b8b6650fb1f6") |>
  filter(INTERSECTS(scott_aoi)) |>
  select(id, WATERBODY_TYPE, AREA_HA) |>
  collect()

wetlands <- bcdc_query_geodata("93b413d8-1840-4770-9629-641d74bd1cc6") |>
  filter(INTERSECTS(scott_aoi)) |>
  select(id, WATERBODY_TYPE, AREA_HA) |>
  collect()

# Combine the data sets into one, select only the columns we want, and 
# clip to aoi
water <- bind_rows(lakes, wetlands) |> 
  select(id, WATERBODY_TYPE, AREA_HA) |> 
  st_intersection(scott_aoi)
write_sf(water, "clean_data/water.gpkg")

Stream Index

streams_cols <- bcdc_describe_feature("92344413-8035-4c08-b996-65a9b3f62fca")
print(streams_cols, n = 20)
# A tibble: 28 × 5
   col_name                sticky remote_col_type local_col_type column_comments
   <chr>                   <lgl>  <chr>           <chr>          <chr>          
 1 id                      TRUE   xsd:string      character       <NA>          
 2 LINEAR_FEATURE_ID       TRUE   xsd:decimal     numeric        "A unique nume…
 3 WATERSHED_GROUP_ID      TRUE   xsd:decimal     numeric        "An automatica…
 4 EDGE_TYPE               TRUE   xsd:decimal     numeric        "A 4 digit num…
 5 BLUE_LINE_KEY           FALSE  xsd:decimal     numeric        "Uniquely iden…
 6 WATERSHED_KEY           FALSE  xsd:decimal     numeric        "A key that id…
 7 FWA_WATERSHED_CODE      FALSE  xsd:string      character      "A 143 charact…
 8 LOCAL_WATERSHED_CODE    FALSE  xsd:string      character      "The 143 chara…
 9 WATERSHED_GROUP_CODE    TRUE   xsd:string      character      "The watershed…
10 DOWNSTREAM_ROUTE_MEASU… FALSE  xsd:decimal     numeric        "The distance,…
11 LENGTH_METRE            TRUE   xsd:decimal     numeric        "The length in…
12 FEATURE_SOURCE          TRUE   xsd:string      character      "The source of…
13 GNIS_ID                 FALSE  xsd:decimal     numeric        "The BCGNIS  (…
14 GNIS_NAME               FALSE  xsd:string      character      "The BCGNIS  (…
15 LEFT_RIGHT_TRIBUTARY    FALSE  xsd:string      character      "Describes whi…
16 STREAM_ORDER            TRUE   xsd:decimal     numeric        "The calculate…
17 STREAM_MAGNITUDE        TRUE   xsd:decimal     numeric        "The calculate…
18 WATERBODY_KEY           FALSE  xsd:decimal     numeric        "The waterbody…
19 BLUE_LINE_KEY_50K       FALSE  xsd:decimal     numeric        "The best matc…
20 WATERSHED_CODE_50K      FALSE  xsd:string      character      "The hierarchi…
# ℹ 8 more rows

Let’s use the STREAM_ORDER column to just get streams of order 3 and 4 and still intersect with our AOI.

streams <- bcdc_query_geodata("92344413-8035-4c08-b996-65a9b3f62fca") |>
  filter(
    INTERSECTS(scott_aoi), 
    STREAM_ORDER >= 3
  ) |>
  select(id, STREAM_ORDER) |>
  collect() |> 
  select(id, STREAM_ORDER) |>
  st_zm() |> 
  st_intersection(scott_aoi)

mapview(streams)
write_sf(streams, "clean_data/streams.gpkg")

Roads

roads <- bcdc_query_geodata("bb060417-b6e6-4548-b837-f9060d94743e") |> 
  filter(INTERSECTS(scott_aoi))  |> 
  select(id, ROAD_CLASS, ROAD_SURFACE) |> 
  collect() |> 
  select(ROAD_SURFACE, ROAD_CLASS) |> 
  st_intersection(scott_aoi) |> # clip roads so all inside aoi
  st_cast("MULTILINESTRING")
write_sf(roads, "clean_data/roads.gpkg")