1. Overview

Using the VAST Challenge 2022 dataset, we explore and characterise the distinct areas of the city, characterise travel patterns to identify potential bottlenecks or hazards, and examine how these patterns change over time. The analysis was carried out in RStudio, and the main packages used are sf, tmap and tidyverse.

Questions to be addressed are:

Characterising the distinct social areas of the city of Engagement, Ohio USA.
Visualising and analysing the locations of traffic bottlenecks in the city of Engagement, Ohio USA.

2. Data Preparation

2.1 Installing libraries

Before we get started, we need to ensure that the required R packages have been installed. If they are installed, we load them. If they have yet to be installed, we install them and then load them into the R environment.

sf is an R package specially designed to handle geospatial data as simple feature objects.

The code chunk below will do the trick.

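# Packages required for this exercise
packages <- c('sf', 'tmap', 'tidyverse', 'clock',
              'lubridate', 'sftime', 'rmarkdown')
for (p in packages) {
  # Install the package only if it cannot be loaded
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
    # Load the freshly installed package
    library(p, character.only = TRUE)
  }
}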

2.2 Importing wkt data

Well-known text (WKT) is a human readable representation for spatial objects like points, lines, or enclosed areas on a map.
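To make the WKT idea concrete, here is a minimal sketch that parses two hand-written WKT strings with st_as_sfc() of sf package. The coordinates are made up for illustration and are not taken from the challenge data.

```r
library(sf)

# Two hypothetical WKT strings: a point and a closed square polygon
wkt_demo <- c("POINT (350 4595)",
              "POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))")

# Parse the text into a simple feature geometry column (sfc)
geoms_demo <- st_as_sfc(wkt_demo)

# Inspect the geometry type of each feature
geom_types <- as.character(st_geometry_type(geoms_demo))
print(geom_types)

# Area of the square polygon, in squared coordinate units (no CRS is set)
square_area <- as.numeric(st_area(geoms_demo[2]))
print(square_area)
```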

In this section, we import geospatial data in WKT format into R and save the imported data as simple feature objects by using sf package.

In the code chunk below, read_sf() of sf package is used to parse Schools.csv, Apartments.csv, Buildings.csv, Employers.csv, Jobs.csv, Participants.csv, Pubs.csv and Restaurants.csv into R. The files that carry a WKT location column are returned as sf data.frames.

schools <- read_sf("data/Schools.csv", 
                   options = "GEOM_POSSIBLE_NAMES=location")

apartments <- read_sf("data/Apartments.csv", 
                   options = "GEOM_POSSIBLE_NAMES=location")

buildings <- read_sf("data/Buildings.csv", 
                   options = "GEOM_POSSIBLE_NAMES=location")

employers <- read_sf("data/Employers.csv", 
                   options = "GEOM_POSSIBLE_NAMES=location")

jobs <- read_sf("data/Jobs.csv", 
                   options = "GEOM_POSSIBLE_NAMES=location")

participants <- read_sf("data/Participants.csv", 
                   options = "GEOM_POSSIBLE_NAMES=location")

pubs <- read_sf("data/Pubs.csv", 
                   options = "GEOM_POSSIBLE_NAMES=location")

restaurants <- read_sf("data/Restaurants.csv", 
                   options = "GEOM_POSSIBLE_NAMES=location")
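The effect of the GEOM_POSSIBLE_NAMES option can be tried on a tiny file without the challenge data. The sketch below writes a hypothetical two-row CSV (the pubId values and coordinates are invented) to a temporary file and reads it back with the same read_sf() call pattern used above.

```r
library(sf)

# Write a small, made-up CSV whose location column holds WKT points
csv_demo <- tempfile(fileext = ".csv")
writeLines(c("pubId,location",
             "1,POINT (10 20)",
             "2,POINT (30 40)"),
           csv_demo)

# Tell the CSV reader which column holds the geometry
pubs_demo <- read_sf(csv_demo,
                     options = "GEOM_POSSIBLE_NAMES=location")

# The result is an sf data frame with one POINT per row
print(class(pubs_demo))
print(st_geometry_type(pubs_demo))

# A plain CSV carries no coordinate reference system, so the CRS is NA
print(st_crs(pubs_demo))
```

This also explains the "CRS: NA" line in the printed summaries: the coordinates are treated as plain planar values.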

It is always a good practice to examine the imported data frame before further analysis is performed.

Let’s take a look at two of the imported datasets.

print(buildings)

Simple feature collection with 1042 features and 4 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -4762.191 ymin: -30.08359 xmax: 2650 ymax: 7850.037
CRS:           NA
# A tibble: 1,042 x 5
   buildingId                       location buildingType maxOccupancy
   <chr>                           <POLYGON> <chr>        <chr>       
 1 1          ((350.0639 4595.666, 390.0633~ Commercial   ""          
 2 2          ((-1926.973 2725.611, -1948.1~ Residental   "12"        
 3 3          ((685.6846 1552.131, 645.9985~ Commercial   ""          
 4 4          ((-976.7845 4542.382, -1053.2~ Commercial   ""          
 5 5          ((1259.306 3572.727, 1299.255~ Residental   "2"         
 6 6          ((478.8969 1082.484, 473.6596~ Commercial   ""          
 7 7          ((-1920.823 615.7447, -1960.8~ Residental   ""          
 8 8          ((-3302.657 5394.354, -3301.5~ Commercial   ""          
 9 9          ((-600.5789 4429.228, -495.95~ Commercial   ""          
10 10         ((-68.75908 5379.924, -28.782~ Residental   "5"         
# ... with 1,032 more rows, and 1 more variable: units <chr>

print(apartments)

Simple feature collection with 1517 features and 5 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -4616.828 ymin: 22.16098 xmax: 2488.067 ymax: 7829.905
CRS:           NA
# A tibble: 1,517 x 6
   apartmentId rentalCost maxOccupancy numberOfRooms
   <chr>       <chr>      <chr>        <chr>        
 1 1           768.16     2            4            
 2 2           1014.55    2            1            
 3 3           1057.39    4            3            
 4 4           1259.1     4            3            
 5 5           411.5      1            4            
 6 6           859.58     3            2            
 7 7           982.11     3            4            
 8 8           980.05     4            1            
 9 9           433.45     1            3            
10 10          1104.33    3            4            
# ... with 1,507 more rows, and 2 more variables: location <POINT>,
#   buildingId <chr>

2.3 Data Wrangling

logs_selected <- read_rds("data/rds/logs_selected.rds")

print(logs_selected)

Simple feature collection with 244271 features and 13 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -4616.828 ymin: 35.4377 xmax: 2630 ymax: 7836.546
CRS:           NA
# A tibble: 244,271 x 14
   timestamp            currentLocation participantId currentMode
 * <chr>                        <POINT> <chr>         <chr>      
 1 2022-03-01T05:~  (-1613.47 1032.372) 651           Transport  
 2 2022-03-01T05:~  (-4586.943 7246.79) 683           Transport  
 3 2022-03-01T05:~ (-4583.826 7612.941) 728           Transport  
 4 2022-03-01T05:~ (-1318.575 1217.017) 651           Transport  
 5 2022-03-01T05:~ (-4219.829 7379.372) 683           Transport  
 6 2022-03-01T05:~ (-4317.212 7382.659) 728           Transport  
 7 2022-03-01T05:~  (-1171.728 1502.45) 651           Transport  
 8 2022-03-01T05:~  (-4186.85 6980.755) 683           Transport  
 9 2022-03-01T05:~ (-4199.135 7076.865) 728           Transport  
10 2022-03-01T05:~ (-3581.037 7172.023) 619           Transport  
# ... with 244,261 more rows, and 10 more variables:
#   hungerStatus <chr>, sleepStatus <chr>, apartmentId <chr>,
#   availableBalance <chr>, jobId <chr>, financialStatus <chr>,
#   dailyFoodBudget <chr>, weeklyExtraBudget <chr>, Timestamp <dttm>,
#   day <int>
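The steps that produced logs_selected.rds are not shown here. The printed columns suggest that a date-time column (Timestamp) and a day-of-month column (day) were derived from the raw timestamp text. The sketch below shows one way this could be done with lubridate. The sample timestamps and their exact format are assumptions, and logs_demo is a hypothetical stand-in for the real logs.

```r
library(lubridate)

# Hypothetical raw log rows; the timestamp format is an assumption
logs_demo <- data.frame(
  timestamp = c("2022-03-01T05:20:00Z", "2022-03-02T17:45:00Z"),
  currentMode = c("Transport", "Transport")
)

# Parse the ISO 8601 text into a date-time column
logs_demo$Timestamp <- ymd_hms(logs_demo$timestamp)

# Extract the day of the month from the parsed date-time
logs_demo$day <- day(logs_demo$Timestamp)

print(logs_demo)
```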

3. Visualisations and Insights

3.1 Distinct Social Areas

Characterize the distinct social areas of the city of Engagement, Ohio USA.

3.1.1 Building Types Map

buildingType <- tm_shape(buildings)+
tm_polygons(col = "buildingType",
           palette="Accent",
           border.col = "black",
           border.alpha = .5,
           border.lwd = 0.5)+
tm_layout(main.title = "Building Types Map",
          main.title.position = "center",
          main.title.size = 1,
          frame = FALSE)+
tm_compass(size = 2,
           position = c('right', 'top'))

buildingType

Insights

The figure above shows three main commercial areas, one each in the middle, the north and the south of the city, each surrounded by residential areas.
There are two large school zones, one in the north and one in the south, and two small school zones in the west.

3.1.2 Facility Map

label <- c('Restaurant', 'Pub', 'Employer', 'Apartment', 'School')
color <- c('blue', 'green', "red", 'purple', 'yellow')

facilitiesMap <- tm_shape(buildings)+
tm_polygons(col = "grey60",
           size = 1,
           border.col = "black",
           border.lwd = 1) +
tm_shape(pubs) +
  tm_dots(col = "green", size = 0.3, alpha= 0.8) +
tm_shape(restaurants) +
  tm_dots(col = "blue", size = 0.3, alpha= 0.8) +
tm_shape(schools) +
  tm_dots(col = "yellow", size = 0.3, alpha= 0.8)+
tm_shape(employers) +
  tm_dots(col = "red") +
tm_shape(apartments) +
  tm_dots(col = "purple") +
tm_add_legend(title = 'Facilities',
              type = 'symbol',
              border.col = NA,
              labels = label,
              col = color) +
tm_layout(main.title = 'Facilities Map of Engagement City, Ohio USA',
          main.title.size = 1,
          frame = FALSE) +
tm_compass(size = 2,
           position = c('right', 'top'))+
tm_credits('Source: VAST Challenge 2022')

facilitiesMap

Insights

The map above gives more detail on the layout of the city’s facilities. There are more restaurants and pubs in the middle and northwest of the city, so we can deduce that those areas might see more traffic on weekends.
Compared with the southeast corner, there are more apartments in the northwest, which might cause more traffic there on weekdays.

tmap_arrange(buildingType, facilitiesMap, widths = c(1))

Insights

The combined figure shows that the restaurants, pubs and employers are located in the commercial areas, mainly in the middle of the city.
The residential zones are at the edges of the city, and residential density increases towards the north.

3.2 Traffic Situation

Where are the busiest areas in Engagement? Are there traffic bottlenecks that should be addressed?

3.2.1 General Traffic Situation

Computing the hexagons

In the code chunk below, st_make_grid() of sf package is used to create a grid of hexagons covering the extent of the buildings layer.

hex <- st_make_grid(buildings, 
                    cellsize=100, 
                    square=FALSE) %>%
  st_sf() %>%
  rowid_to_column('hex_id')
plot(hex)
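The choice of cellsize controls how fine the grid is. As documented for st_make_grid(), for hexagonal cells cellsize is the distance between opposite edges. The sketch below compares two cell sizes on a hypothetical 1000 x 1000 square rather than the real buildings layer.

```r
library(sf)

# Hypothetical study area: a 1000 x 1000 square with no CRS
area_demo <- st_sfc(st_polygon(list(rbind(
  c(0, 0), c(1000, 0), c(1000, 1000), c(0, 1000), c(0, 0)
))))

# square = FALSE requests hexagonal cells instead of squares
hex_coarse <- st_make_grid(area_demo, cellsize = 200, square = FALSE)
hex_fine   <- st_make_grid(area_demo, cellsize = 100, square = FALSE)

# A smaller cellsize produces more, smaller hexagons
print(length(hex_coarse))
print(length(hex_fine))
```

A finer grid shows more local detail but leaves fewer points in each cell, so the counts become noisier.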

Performing point in polygon count

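In the code chunk below, st_join() of sf package is used to assign each event point to the hexagon that contains it, and count() of dplyr package is then used to count the number of event points in each hexagon.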

points_in_hex <- st_join(logs_selected, 
                        hex, 
                        join=st_within) %>%
  st_set_geometry(NULL) %>%
  count(name='pointCount', hex_id)
head(points_in_hex)

# A tibble: 6 x 2
  hex_id pointCount
   <int>      <int>
1    169         35
2    212         56
3    225         21
4    226         94
5    227         22
6    228         45
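The same join-then-count pattern can be checked on synthetic data. The sketch below uses a hypothetical 300 x 300 area and 50 random points in place of the real hexagons and logs. All object names ending in _demo are illustrative.

```r
library(sf)
library(dplyr)

# Hypothetical study area and hexagon grid with an id per hexagon
area_demo <- st_sfc(st_polygon(list(rbind(
  c(0, 0), c(300, 0), c(300, 300), c(0, 300), c(0, 0)
))))
hex_demo <- st_make_grid(area_demo, cellsize = 100, square = FALSE) %>%
  st_sf() %>%
  mutate(hex_id = row_number())

# 50 random points standing in for the movement log records
set.seed(1)
pts_demo <- st_as_sf(
  data.frame(x = runif(50, 0, 300), y = runif(50, 0, 300)),
  coords = c("x", "y")
)

# Attach the containing hexagon's id to each point, then count per hexagon
counts_demo <- st_join(pts_demo, hex_demo, join = st_within) %>%
  st_drop_geometry() %>%
  count(hex_id, name = "pointCount")

print(counts_demo)
```

Only hexagons that contain at least one point appear in the result, which is why the empty hexagons need to be added back in the next step.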

Performing relational join

In the code chunk below, left_join() of dplyr package is used to perform a left-join by using hex as the target table and points_in_hex as the join table. The join ID is hex_id.

hex_combined <- hex %>%
  left_join(points_in_hex, 
            by = 'hex_id') %>%
  replace(is.na(.), 0)
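The effect of the left join and the zero fill can be seen on a small table without any geometry. In the sketch below, hex_tbl and counts_tbl are hypothetical stand-ins for hex and points_in_hex, and replace_na() of tidyr package is used as a column-specific alternative to replace(is.na(.), 0).

```r
library(dplyr)
library(tidyr)

# Four hypothetical hexagons, only two of which received any points
hex_tbl    <- tibble(hex_id = 1:4)
counts_tbl <- tibble(hex_id = c(2L, 4L), pointCount = c(5L, 3L))

# The left join keeps every hexagon; unmatched ones get NA, then 0
combined_tbl <- hex_tbl %>%
  left_join(counts_tbl, by = "hex_id") %>%
  mutate(pointCount = replace_na(pointCount, 0L))

print(combined_tbl)
```

Naming the column explicitly avoids touching NA values in any other column.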

Plotting the hexagon binning map

In the code chunk below, tmap package is used to create the hexagon binning map.

traffic <- tm_shape(hex_combined %>%
                      filter(pointCount > 0))+
  tm_fill("pointCount",
          n = 8,
          style = "quantile") +
  tm_borders(alpha = 0.1)+
  tm_layout(main.title = 'Traffic of Engagement City, Ohio USA',
            main.title.size = 1,
            frame = FALSE)
traffic

tmap_arrange(facilitiesMap,traffic, widths = c(1))
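The traffic map uses style = "quantile" with n = 8, which aims to place a roughly equal number of hexagons in each colour class. The sketch below illustrates the idea with base R quantile() on made-up, skewed counts. It assumes the classification behaves like ordinary quantile breaks and is not the exact routine tmap runs internally.

```r
# Hypothetical skewed point counts for 80 hexagons
counts_skewed <- (1:80)^2

# Nine break points define eight quantile classes
breaks_demo <- quantile(counts_skewed, probs = seq(0, 1, length.out = 9))

# Assign each hexagon to a class and tabulate the class sizes
class_demo <- cut(counts_skewed, breaks = breaks_demo, include.lowest = TRUE)
class_sizes <- as.integer(table(class_demo))

print(breaks_demo)
print(class_sizes)
```

Because the classes hold similar numbers of hexagons, the class widths are very uneven for skewed counts, so the legend ranges should be read with care.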

Insights

The map above shows that the main routes connecting the north-west to the south, and the west to the east, of the city are likely to see more traffic.
There is also more traffic in places with schools, restaurants and pubs.

4. Conclusion

Well-known text (WKT) is a human readable representation for spatial objects like points, lines, or enclosed areas on a map, and it helps when doing geospatial visualisations.

During this exercise, we learned how to import geospatial data in WKT format into R and save the imported data as simple feature objects by using sf package, how to map geospatial data using tmap package, how to process movement data by using sf and tidyverse packages, and how to visualise movement data by using tmap and ggplot2 packages.

Take-home Exercise 5
