--- title: "Geographic Data Collection and Spatial data" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Geographic Data Collection and Spatial data} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, message = FALSE, warning = FALSE, echo = TRUE, comment = "#>" ) ``` `KoboToolbox` allows the collection of spatial data through three questions types: **geopoint**, **geotrace** and **geoshape**. **Geopoint:**
The geopoint question type captures a single geographic coordinate (latitude and longitude) including altitude and accuracy. This is useful for marking locations, such as homes, schools, or water sources. **Geotrace:**
The geotrace question type collects a series of connected geographic coordinates, forming a line. This can be used to map routes, paths, or boundaries. **Geoshape:**
A geoshape question type captures a series of geographic coordinates that form a closed polygon. This is useful for defining areas, such as land parcels, agricultural fields, or protected zones. To utilize these data types, we need to parse them into a GIS friendly format. `robotoolbox` uses [Well-Known Text (WKT)](https://libgeos.org/specifications/wkt/), a standard markup language for representing vector geometry, to represent points (`geopoint`), lines (`geotrace`) and polygons (`geoshape`). WKT is chosen for its wide compatibility with GIS software and spatial analysis packages, making it easier to integrate `KoboToolbox` data with various spatial analysis workflows. ## Spatial data The following form provides a simple demonstration of how `robotoolbox` maps spatial field types. ```{r setup, echo = FALSE} library(robotoolbox) library(dplyr) library(sf) library(mapview) mapview::mapviewOptions( fgb = FALSE, basemaps = c( "Esri.WorldImagery", "Esri.WorldShadedRelief", "OpenTopoMap", "OpenStreetMap"), layers.control.pos = "topright") ``` ```{r asset_list, echo = FALSE} l <- asset_list ``` ### Survey questions |name |type |label | |:-------------------|:--------|:------------------------------| |point |geopoint |Record a location | |point_description |text |Describe the recorded location | |line |geotrace |Record a line | |line_description |text |Describe the recorded line | |polygon |geoshape |Record a polygon | |polygon_description |text |Describe the recorded polygon | The form includes three spatial type columns: `point`, `line` and `polygon`. ### Loading the project The aforementioned form, named `Spatial data`, was uploaded to the server. You can load it from the `asset_list` of assets. ```{r, eval = FALSE} library(robotoolbox) library(dplyr) # Retrieve a list of all assets (projects) from your KoboToolbox server asset_list <- kobo_asset_list() # Filter the asset list to find the specific project and get its unique identifier (uid) uid <- filter(asset_list, name == "Spatial data") |> pull(uid) # Load the specific asset (project) using its uid asset <- kobo_asset(uid) asset ``` ```{r, echo = FALSE} asset <- asset_spatial asset ``` In this code: - `kobo_asset_list()` retrieves a list of all assets (projects) available on your `KoboToolbox` server. - `kobo_asset()` loads a specific asset (project) using its unique identifier (`uid`), allowing you to work with that particular project data and metadata. We have a single submission, where we recorded one location using a `geopoint` question, mapped a portion of a road using a `geotrace` question, and outlined a stadium using a `geoshape` question. ### Extracting the data From the assets, we can proceed to extract the submissions. ```{r, eval = FALSE} df <- kobo_data(asset) glimpse(df) ``` ```{r, echo = FALSE} df <- data_spatial glimpse(df) ``` We can see that we have all of our three columns `point`, `line` and `polygon`. For each of them, we have a corresponding WKT column. The WKT format for a point is simply `POINT (longitude latitude)`. For example, `POINT (-17.446667 14.692778)` represents a location in Dakar, Senegal. ```{r} pull(df, point) pull(df, point_wkt) ``` For `geopoint` types, `robotoolbox` also offers columns for latitude, longitude, altitude, and precision. ```{r} df |> select(starts_with("point_")) ``` The `line` column, derived from the `geotrace` question, has a corresponding `line_wkt` column. The WKT format for a line is `LINESTRING (lon1 lat1, lon2 lat2, ...)`. Each pair of coordinates represents a point along the line. For example, `LINESTRING (-17.4440 14.6937, -17.4502 14.7167)` represents a line connecting two points in Dakar. ```{r} pull(df, line) pull(df, line_wkt) ``` Lastly, `polygon_wkt` is the WKT column derived from the `geoshape` question labeled `polygon`. The WKT format for a polygon is `POLYGON ((lon1 lat1, lon2 lat2, ..., lon1 lat1))`. Note that the first and last coordinate pairs are the same, closing the polygon. For example, `POLYGON ((-17.4440 14.6937, -17.4502 14.7167, -17.4314 14.7145, -17.4440 14.6937))` represents a triangular area in Dakar. ```{r} pull(df, polygon) pull(df, polygon_wkt) ``` Now that we understand how `robotoolbox` stores spatial question types, we can convert these columns into spatial objects suitable for spatial data analysis. ## Geopoint The standard approach to manipulate spatial vector data in `R` involves using the `sf` package. `sf` stands for Simple Features and it extends a `data.frame` by adding a geometry list-column. It creates a spatially enabled `data.frame`. It provides an interface to the popular [GDAL](https://gdal.org/), [GEOS](https://libgeos.org/), [PRØJ](https://proj.org/) and [S2](https://s2geometry.io/) libraries. It can be used to efficiently manipulate and visualize spatial vector data. Creating an `sf` object from a text column that contains [`WKT`](https://libgeos.org/specifications/wkt/) characters is straightforward. The `sf::st_as_sf` function can be used to turn the `data.frame` with a `WKT` column into an `sf` object. ```{r} point_sf <- st_as_sf(data_spatial, wkt = "point_wkt", crs = 4326) mapview(point_sf) ``` In this code, `crs = 4326` specifies the Coordinate Reference System (CRS) for the spatial data. CRS 4326 refers to the WGS84 (World Geodetic System 1984) coordinate system, which is widely used in GPS and web mapping applications. It represents locations on the Earth using latitude and longitude in degrees. This is the standard CRS used by KoboToolbox for storing geographic coordinates. ## Geotrace We can also transform a `data.frame` with a column from a `geotrace` question to an `sf` object with a `LINESTRING` geometry. The `WKT` column is named `line_wkt`. ```{r} line_sf <- st_as_sf(data_spatial, wkt = "line_wkt", crs = 4326) mapview(line_sf) ``` ## Geoshape The column `polygon_wkt` can be used to create an `sf` polygon object. It's a simple closed polygon. ```{r} poly_sf <- st_as_sf(data_spatial, wkt = "polygon_wkt", crs = 4326) mapview(poly_sf) ``` ## Conclusion By combining `robotoolbox` with R spatial analysis tools, researchers and data analysts can efficiently process, analyze, and visualize geographic data collected through `KoboToolbox`, opening up a wide range of possibilities for spatial data analysis in various fields such as humanitarian work, environmental studies, and social sciences. You can learn a lot about the `sf` packages and spatial data analysis with R from the excellent [Geocomputation with R](https://r.geocompx.org/) book and through the extensive `sf` [package documentation](https://r-spatial.github.io/sf/).