Getting B.C. Open Data with R
Outline
- Use an area of interest (AOI) to define study area
- Explore the B.C. Data Catalogue
- Use the bcdata package to query and download data
B.C. Data Catalogue
The B.C. Government makes a huge amount of open data available, both spatial and non-spatial, and documents it in the B.C. Data Catalogue.
The bcdata package allows you to interact with the catalogue, and download data, directly from within R.
-
bcdc_browse()
- Open the catalogue in your default browser
-
bcdc_search()
- Search records in the catalogue
-
bcdc_get_record()
- Print a catalogue record
-
bcdc_get_data()
- Get catalogue data
-
bcdc_describe_feature()
- Describe the structure (columns and types) of a data set
-
bcdc_query_geodata()
- Get & query catalogue geospatial data available through a Web Service ]
We are going to use bcdata to get some data for our telemetry analysis:
- BEC
- VRI
- Forest Cutblocks
- Lakes and wetlands
- Streams
- Roads
Create an Area of Interest (AOI)
We’ll read in the caribou data created in the previous module, and create an area of interest (a bounding box) around it, so we can use that to spatially subset the covariate data we will be using.
scott <- read_sf("clean_data/scott_herd_subset.gpkg")
scott_bbox <- st_bbox(scott)
## Round to nearest 100m outside the box to align with raster grids
scott_bbox["xmin"] <- floor(scott_bbox["xmin"] / 100) * 100
scott_bbox["ymin"] <- floor(scott_bbox["ymin"] / 100) * 100
scott_bbox["xmax"] <- ceiling(scott_bbox["xmax"] / 100) * 100
scott_bbox["ymax"] <- ceiling(scott_bbox["ymax"] / 100) * 100
scott_aoi <- st_as_sfc(scott_bbox)
write_sf(scott_aoi, "clean_data/scott_aoi.gpkg")
First, let’s open the B.C. Data Catalogue in our browser:
bcdc_browse()
If you search for “BEC”, one of the first hits should be the BEC Map record. If you click on the “Share” ( ) button you will get a url: https://catalogue.data.gov.bc.ca/dataset/f358a53b-ffde-4830-a325-a5a03ff672c3
. The last bit of the url (f358a53b-ffde-4830-a325-a5a03ff672c3
) is the unique identifier for the record, and we can use that ID to query the dataset with bcdata.
Let’s find out about the record:
bcdc_get_record("f358a53b-ffde-4830-a325-a5a03ff672c3")
B.C. Data Catalogue Record: BEC Map
Name: bec-map (ID: f358a53b-ffde-4830-a325-a5a03ff672c3)
Permalink:
https://catalogue.data.gov.bc.ca/dataset/f358a53b-ffde-4830-a325-a5a03ff672c3
Licence: Open Government Licence - British Columbia
Description: The current and most detailed version of the approved
corporate provincial digital Biogeoclimatic Ecosystem Classification
(BEC) Zone/Subzone/Variant/Phase map (version 12, September 2, 2021).
Use this version when performing GIS analysis regardless of scale.
This mapping is deliberately extended across the ocean, lakes,
glaciers, etc to facilitate intersection with a terrestrial landcover
layer of your choice
Available Resources (1):
1. WMS getCapabilities request (wms)
Access the full 'Resources' data frame using:
bcdc_tidy_resources('f358a53b-ffde-4830-a325-a5a03ff672c3')
Query and filter this data using:
bcdc_query_geodata('f358a53b-ffde-4830-a325-a5a03ff672c3')
BEC
bcdc_query_geodata()
by itself does not download the data - it retrieves the first few records and shows them to us, as well as some helpful information about the data:
bcdc_query_geodata("f358a53b-ffde-4830-a325-a5a03ff672c3")
Querying 'bec-map' record
• Using collect() on this object will return 15666 features and 20 fields
• Accessing this record requires pagination and will make 16 separate
• requests to the WFS. See ?bcdc_options
• At most six rows of the record are printed here
────────────────────────────────────────────────────────────────────────────────
Simple feature collection with 6 features and 20 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 575312.5 ymin: 1553588 xmax: 780612.5 ymax: 1668388
Projected CRS: NAD83 / BC Albers
# A tibble: 6 × 21
id FEATURE_CLASS_SKEY ZONE SUBZONE VARIANT PHASE NATURAL_DISTURBANCE
<chr> <int> <chr> <chr> <chr> <chr> <chr>
1 WHSE_FORES… 435 SWB uns <NA> <NA> NDT2
2 WHSE_FORES… 435 SWB uns <NA> <NA> NDT2
3 WHSE_FORES… 435 SWB uns <NA> <NA> NDT2
4 WHSE_FORES… 435 SWB uns <NA> <NA> NDT2
5 WHSE_FORES… 435 SWB uns <NA> <NA> NDT2
6 WHSE_FORES… 435 SWB uns <NA> <NA> NDT2
# ℹ 14 more variables: MAP_LABEL <chr>, BGC_LABEL <chr>, ZONE_NAME <chr>,
# SUBZONE_NAME <chr>, VARIANT_NAME <chr>, PHASE_NAME <chr>,
# NATURAL_DISTURBANCE_NAME <chr>, FEATURE_AREA_SQM <int>,
# FEATURE_LENGTH_M <int>, FEATURE_AREA <int>, FEATURE_LENGTH <int>,
# OBJECTID <int>, SE_ANNO_CAD_DATA <chr>, geometry <POLYGON [m]>
You can use dplyr
verbs filter()
and select()
to cut down the amount of data you need to download from the web service. filter()
can take logical predicates such as ==
, >
, %in%
etc., as well as geometry predicates such as INTERSECTS()
, OVERLAPS()
, WITHIN()
etc. See this vignette for details.
We can run the query without assigning it to anything, to see what effect the filter and select statements will have on what we are downloading:
bcdc_query_geodata("f358a53b-ffde-4830-a325-a5a03ff672c3") |>
filter(INTERSECTS(scott_aoi)) |>
select(MAP_LABEL)
Querying 'bec-map' record
• Using collect() on this object will return 67 features and 13 fields
• At most six rows of the record are printed here
────────────────────────────────────────────────────────────────────────────────
Simple feature collection with 6 features and 13 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 1120500 ymin: 1203663 xmax: 1209725 ymax: 1265670
Projected CRS: NAD83 / BC Albers
# A tibble: 6 × 14
id FEATURE_CLASS_SKEY ZONE SUBZONE NATURAL_DISTURBANCE MAP_LABEL BGC_LABEL
<chr> <int> <chr> <chr> <chr> <chr> <chr>
1 WHSE… 435 ESSF wcp NDT5 ESSFwcp ESSFwcp
2 WHSE… 435 ESSF wk NDT1 ESSFwk2 ESSFwk 2
3 WHSE… 435 ESSF wc NDT1 ESSFwc3 ESSFwc 3
4 WHSE… 435 ESSF mv NDT2 ESSFmv2 ESSFmv 2
5 WHSE… 435 ESSF wcp NDT5 ESSFwcp ESSFwcp
6 WHSE… 435 ESSF wcp NDT5 ESSFwcp ESSFwcp
# ℹ 7 more variables: ZONE_NAME <chr>, SUBZONE_NAME <chr>,
# NATURAL_DISTURBANCE_NAME <chr>, FEATURE_AREA <int>, FEATURE_LENGTH <int>,
# OBJECTID <int>, geometry <POLYGON [m]>
Note that now we will be only downloading 64 features (rows) instead of the ~ 15000 in the full record.
Once your query is complete you call collect()
to tell the server to execute the query and send you the data:
bec <- bcdc_query_geodata("f358a53b-ffde-4830-a325-a5a03ff672c3") |>
filter(INTERSECTS(scott_aoi)) |>
select(MAP_LABEL) |>
collect()
mapview(bec, zcol = "MAP_LABEL") + mapview(st_as_sf(scott_aoi))
You can see that filtering using the INTERSECTS()
function does a good job of only downloading the features that intersect the AOI, but it doesn’t actually clip them to the AOI. We can do that with sf::st_intersection()
. Also, there are some columns that are “sticky” - even if you don’t select them in a select()
statement before you call collect, they come along anyway. We can do a final select()
once the data is downloaded:
bec <- st_intersection(bec, scott_aoi) |>
select(MAP_LABEL)
mapview(bec)
And now we can write our BEC data to a file to use later in the analysis.
write_sf(bec, "clean_data/bec.gpkg")
VRI
See a list of VRI codes.
vri <- bcdc_query_geodata("2ebb35d8-c82f-4a17-9c96-612ac3532d55") |>
filter(INTERSECTS(scott_aoi)) |>
select(PROJ_AGE_CLASS_CD_1, BCLCS_LEVEL_4, CROWN_CLOSURE_CLASS_CD) |>
collect() |>
st_intersection(scott_aoi)
We want to split this into separate files for different variables:
- Coniferous-leading stands
- Stands with age greater than 40 yrs (age class >=3)
- Crown closure
# Tree coniferous leading - select coniferous leading vri plots
vri_conif <- vri |>
mutate(conif = BCLCS_LEVEL_4) |>
filter(conif == "TC") |>
select(conif)
write_sf(vri_conif, "clean_data/vri_conif.gpkg")
# Age class greater than 40 years
vri_ageclass <- vri |>
mutate(age_class = as.numeric(PROJ_AGE_CLASS_CD_1)) |>
filter(age_class >= 3) |>
select(age_class)
write_sf(vri_ageclass, "clean_data/vri_ageclass.gpkg")
# Crown closure class
vri_cc <- vri |>
mutate(cc_class = as.numeric(CROWN_CLOSURE_CLASS_CD)) |>
select(cc_class)
write_sf(vri_cc, "clean_data/vri_cc.gpkg")
Cutblocks
To get the cutblocks, we filter to our AOI, and also choose those blocks that have a harvest year in the last 30 years.
We can get information about the columns in a given data set with bcdc_describe_feature()
:
bcdc_describe_feature("b1b647a6-f271-42e0-9cd0-89ec24bce9f7")
# A tibble: 13 × 5
col_name sticky remote_col_type local_col_type column_comments
<chr> <lgl> <chr> <chr> <chr>
1 id TRUE xsd:string character <NA>
2 VEG_CONSOLIDATED_CUT_B… TRUE xsd:decimal numeric VEG_CONSOLIDAT…
3 OPENING_ID FALSE xsd:decimal numeric OPENING_ID is …
4 HARVEST_YEAR FALSE xsd:decimal numeric HARVEST_YEAR i…
5 DISTURBANCE_START_DATE FALSE xsd:date date DISTURBANCE_ST…
6 DISTURBANCE_END_DATE FALSE xsd:date date DISTURBANCE_EN…
7 DATA_SOURCE FALSE xsd:string character DATA_SOURCE is…
8 AREA_HA FALSE xsd:decimal numeric AREA_HA is the…
9 FEATURE_AREA_SQM FALSE xsd:decimal numeric FEATURE_AREA_S…
10 FEATURE_LENGTH_M FALSE xsd:decimal numeric FEATURE_LENGTH…
11 SHAPE FALSE gml:GeometryPr… sfc geometry SHAPE is the c…
12 OBJECTID TRUE xsd:decimal numeric OBJECTID is a …
13 SE_ANNO_CAD_DATA FALSE xsd:hexBinary numeric SE_ANNO_CAD_DA…
cutblocks <- bcdc_query_geodata("b1b647a6-f271-42e0-9cd0-89ec24bce9f7") |>
filter(
INTERSECTS(scott_aoi),
HARVEST_YEAR >= 1993) |>
select(HARVEST_YEAR) |>
collect() |>
st_intersection(scott_aoi)
mapview(cutblocks, zcol = "HARVEST_YEAR")
write_sf(cutblocks, "clean_data/cutblocks.gpkg")
Water bodies
Searching the catalogue for the BC Freshwater Atlas
We can search the catalogue for data using keywords, with bcdc_search()
. Control the number of results returned with n
.
bcdc_search("freshwater atlas", n = 20)
List of B.C. Data Catalogue Records
Number of records: 20
Titles:
1: Freshwater Atlas Glaciers (multiple, html, pdf, wms, kml)
ID: 8f2aee65-9f4c-4f72-b54c-0937dbf3e6f7
Name: freshwater-atlas-glaciers
2: Freshwater Atlas Watersheds (multiple, html, pdf, wms, kml, fgdb)
ID: 3ee497c4-57d7-47f8-b030-2e0c03f8462a
Name: freshwater-atlas-watersheds
3: Freshwater Atlas Obstructions (multiple, html, pdf, wms, kml)
ID: 64797286-3ca5-4202-9064-a7f790321e9e
Name: freshwater-atlas-obstructions
4: Freshwater Atlas Rivers (multiple, fgdb, pdf, wms, kml)
ID: f7dac054-efbf-402f-ab62-6fc4b32a619e
Name: freshwater-atlas-rivers
5: Freshwater Atlas Lakes (multiple, html, pdf, wms, kml)
ID: cb1e3aba-d3fe-4de1-a2d4-b8b6650fb1f6
Name: freshwater-atlas-lakes
6: Freshwater Atlas Coastlines (multiple, html, pdf, wms, kml)
ID: 87b1d6a7-d4d1-4c25-a879-233becdbffed
Name: freshwater-atlas-coastlines
7: Freshwater Atlas Wetlands (multiple, html, pdf, wms, kml)
ID: 93b413d8-1840-4770-9629-641d74bd1cc6
Name: freshwater-atlas-wetlands
8: Freshwater Atlas Islands (multiple, html, pdf, wms, kml)
ID: 4483aeea-df26-4b83-a565-934c769e74de
Name: freshwater-atlas-islands
9: Freshwater Atlas Watershed Boundaries (multiple, html, pdf, wms,
kml, fgdb)
ID: ab758580-809d-4e11-bb2c-df02ac5465c9
Name: freshwater-atlas-watershed-boundaries
10: Freshwater Atlas Manmade Waterbodies (multiple, html, pdf, wms,
kml)
ID: 055fd71e-b771-4d47-a863-8a54f91a954c
Name: freshwater-atlas-manmade-waterbodies
11: Freshwater Atlas Linear Boundaries (multiple, html, pdf, wms, kml,
fgdb)
ID: 2af1388e-d5f7-46dc-a6e2-f85415ddbd1c
Name: freshwater-atlas-linear-boundaries
12: Freshwater Atlas Watershed Groups (multiple, html, pdf, wms, kml)
ID: 51f20b1a-ab75-42de-809d-bf415a0f9c62
Name: freshwater-atlas-watershed-groups
13: Freshwater Atlas Named Watersheds (multiple, html, pdf, wms, kml)
ID: ea63ea04-eab0-4b83-8729-f8a93ac688a1
Name: freshwater-atlas-named-watersheds
14: Freshwater Atlas Assessment Watersheds (multiple, html, pdf, wms,
kml)
ID: 97d8ef37-b8d2-4c3b-b772-6b25c1db13d0
Name: freshwater-atlas-assessment-watersheds
15: Freshwater Atlas Stream Directions (multiple, html, pdf, wms, kml)
ID: d7165359-52ef-41d0-b762-c53e3468ff3f
Name: freshwater-atlas-stream-directions
16: Freshwater Atlas Stream Network (multiple, html, pdf, wms, kml,
fgdb)
ID: 92344413-8035-4c08-b996-65a9b3f62fca
Name: freshwater-atlas-stream-network
17: Freshwater Atlas Edge Type Codes (multiple, html, pdf)
ID: 509cbf74-7ee7-44d3-a88d-4e088ea67325
Name: freshwater-atlas-edge-type-codes
18: Freshwater Atlas Waterbody Type Codes (multiple, html, pdf)
ID: ade4f36a-1fd4-4583-8253-2b2a1bbe34ff
Name: freshwater-atlas-waterbody-type-codes
19: Freshwater Atlas Watershed Type Codes (multiple, html, pdf)
ID: f7efa3ea-bf1c-4c4f-bb33-ba841aa076c0
Name: freshwater-atlas-watershed-type-codes
20: Freshwater Atlas Named Point Features (multiple, html, pdf, wms,
kml)
ID: db43a358-273c-4c2e-8a5c-cc28eaaffaa7
Name: freshwater-atlas-named-point-features
Access a single record by calling `bcdc_get_record(ID)` with the ID
from the desired record.
lakes <- bcdc_query_geodata("cb1e3aba-d3fe-4de1-a2d4-b8b6650fb1f6") |>
filter(INTERSECTS(scott_aoi)) |>
select(id, WATERBODY_TYPE, AREA_HA) |>
collect()
wetlands <- bcdc_query_geodata("93b413d8-1840-4770-9629-641d74bd1cc6") |>
filter(INTERSECTS(scott_aoi)) |>
select(id, WATERBODY_TYPE, AREA_HA) |>
collect()
# Combine the data sets into one, select only the columns we want, and
# clip to aoi
water <- bind_rows(lakes, wetlands) |>
select(id, WATERBODY_TYPE, AREA_HA) |>
st_intersection(scott_aoi)
write_sf(water, "clean_data/water.gpkg")
Stream Index
streams_cols <- bcdc_describe_feature("92344413-8035-4c08-b996-65a9b3f62fca")
print(streams_cols, n = 20)
# A tibble: 28 × 5
col_name sticky remote_col_type local_col_type column_comments
<chr> <lgl> <chr> <chr> <chr>
1 id TRUE xsd:string character <NA>
2 LINEAR_FEATURE_ID TRUE xsd:decimal numeric "A unique nume…
3 WATERSHED_GROUP_ID TRUE xsd:decimal numeric "An automatica…
4 EDGE_TYPE TRUE xsd:decimal numeric "A 4 digit num…
5 BLUE_LINE_KEY FALSE xsd:decimal numeric "Uniquely iden…
6 WATERSHED_KEY FALSE xsd:decimal numeric "A key that id…
7 FWA_WATERSHED_CODE FALSE xsd:string character "A 143 charact…
8 LOCAL_WATERSHED_CODE FALSE xsd:string character "The 143 chara…
9 WATERSHED_GROUP_CODE TRUE xsd:string character "The watershed…
10 DOWNSTREAM_ROUTE_MEASU… FALSE xsd:decimal numeric "The distance,…
11 LENGTH_METRE TRUE xsd:decimal numeric "The length in…
12 FEATURE_SOURCE TRUE xsd:string character "The source of…
13 GNIS_ID FALSE xsd:decimal numeric "The BCGNIS (…
14 GNIS_NAME FALSE xsd:string character "The BCGNIS (…
15 LEFT_RIGHT_TRIBUTARY FALSE xsd:string character "Describes whi…
16 STREAM_ORDER TRUE xsd:decimal numeric "The calculate…
17 STREAM_MAGNITUDE TRUE xsd:decimal numeric "The calculate…
18 WATERBODY_KEY FALSE xsd:decimal numeric "The waterbody…
19 BLUE_LINE_KEY_50K FALSE xsd:decimal numeric "The best matc…
20 WATERSHED_CODE_50K FALSE xsd:string character "The hierarchi…
# ℹ 8 more rows
Let’s use the STREAM_ORDER
column to just get streams of order 3 and 4 and still intersect with our AOI.
streams <- bcdc_query_geodata("92344413-8035-4c08-b996-65a9b3f62fca") |>
filter(
INTERSECTS(scott_aoi),
STREAM_ORDER >= 3
) |>
select(id, STREAM_ORDER) |>
collect() |>
select(id, STREAM_ORDER) |>
st_zm() |>
st_intersection(scott_aoi)
mapview(streams)
write_sf(streams, "clean_data/streams.gpkg")
Roads
roads <- bcdc_query_geodata("bb060417-b6e6-4548-b837-f9060d94743e") |>
filter(INTERSECTS(scott_aoi)) |>
select(id, ROAD_CLASS, ROAD_SURFACE) |>
collect() |>
select(ROAD_SURFACE, ROAD_CLASS) |>
st_intersection(scott_aoi) |> # clip roads so all inside aoi
st_cast("MULTILINESTRING")
write_sf(roads, "clean_data/roads.gpkg")