Visualising and analysing social areas and traffic bottlenecks in the city of Engagement, Ohio, USA.
Using the VAST Challenge 2022 dataset, we will explore and characterize the distinct areas of the city, characterize travel patterns to identify potential bottlenecks or hazards, and examine how these patterns change over time. The work was carried out in RStudio, and the main packages used are sf, tmap and tidyverse.
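The questions to be addressed, each taken up in its own section below, are: how to characterize the distinct social areas of the city, and where the busiest areas and traffic bottlenecks are.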
Before we get started, we need to ensure that the required R packages have been installed. If they are already installed, we simply load them. If not, we install them first and then load them into the R environment.
sf is an R package designed specifically to handle geospatial data as simple feature objects.
The code chunk below will do the trick.
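packages = c('sf', 'tmap', 'tidyverse', 'clock',
             'lubridate', 'sftime', 'rmarkdown')
for (p in packages){
  # Install the package only if it cannot be loaded
  if(!require(p, character.only = TRUE)){
    install.packages(p)
  }
  # Attach the package (a no-op if require() already loaded it)
  library(p, character.only = TRUE)
}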
Well-known text (WKT) is a human-readable representation for spatial objects like points, lines, or enclosed areas on a map.
In this section, we import geospatial data in WKT format into R and save the imported data as simple feature objects by using the sf package.
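To make the WKT idea concrete, here is a minimal sketch that turns two WKT strings into sf geometries with st_as_sfc(). The coordinates are made up for illustration and are not taken from the challenge data.

```r
library(sf)

# Two hypothetical WKT strings: one point and one closed polygon
wkt <- c("POINT (350 4595)",
         "POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))")

# Parse the text into a simple feature geometry column (sfc)
geoms <- st_as_sfc(wkt)

# Inspect the geometry types and the area of the polygon
st_geometry_type(geoms)
st_area(geoms[2])
```

Because no coordinate reference system is attached, the area is reported in plain coordinate units, which matches the CRS: NA shown for the city data.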
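In the code chunk below, read_sf() of the sf package is used to parse Schools.csv, Apartments.csv, Buildings.csv, Employers.csv, Jobs.csv, Participants.csv, Pubs.csv and Restaurants.csv into R as sf data frames. The option GEOM_POSSIBLE_NAMES=location tells the reader which column holds the WKT geometry.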
schools <- read_sf("data/Schools.csv",
                   options = "GEOM_POSSIBLE_NAMES=location")
apartments <- read_sf("data/Apartments.csv",
                      options = "GEOM_POSSIBLE_NAMES=location")
buildings <- read_sf("data/Buildings.csv",
                     options = "GEOM_POSSIBLE_NAMES=location")
employers <- read_sf("data/Employers.csv",
                     options = "GEOM_POSSIBLE_NAMES=location")
jobs <- read_sf("data/Jobs.csv",
                options = "GEOM_POSSIBLE_NAMES=location")
participants <- read_sf("data/Participants.csv",
                        options = "GEOM_POSSIBLE_NAMES=location")
pubs <- read_sf("data/Pubs.csv",
                options = "GEOM_POSSIBLE_NAMES=location")
restaurants <- read_sf("data/Restaurants.csv",
                       options = "GEOM_POSSIBLE_NAMES=location")
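To see what the GEOM_POSSIBLE_NAMES option does in isolation, the sketch below writes a tiny CSV with a WKT column to a temporary file and reads it back. The file name toy_pubs.csv and the hourlyCost column are made up for illustration and are not part of the challenge data.

```r
library(sf)

# Hypothetical toy table with a WKT geometry column called "location"
toy <- data.frame(pubId = c(1, 2),
                  location = c("POINT (10 20)", "POINT (30 40)"),
                  hourlyCost = c(8.5, 12.0))

# Write it to a temporary CSV file
path <- file.path(tempdir(), "toy_pubs.csv")
write.csv(toy, path, row.names = FALSE)

# Read it back, telling the CSV reader that "location" holds the geometry
toy_sf <- read_sf(path, options = "GEOM_POSSIBLE_NAMES=location")
print(toy_sf)

# Convert a numeric field explicitly before using it in calculations
toy_sf$hourlyCost <- as.numeric(toy_sf$hourlyCost)
```

Attribute columns read this way typically come back as character, so numeric fields usually need an explicit conversion such as the as.numeric() call above.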
It is always good practice to examine the imported data frames before further analysis is performed.
Let’s take a look at the datasets.
print(buildings)
Simple feature collection with 1042 features and 4 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: -4762.191 ymin: -30.08359 xmax: 2650 ymax: 7850.037
CRS: NA
# A tibble: 1,042 x 5
buildingId location buildingType maxOccupancy
<chr> <POLYGON> <chr> <chr>
1 1 ((350.0639 4595.666, 390.0633~ Commercial ""
2 2 ((-1926.973 2725.611, -1948.1~ Residental "12"
3 3 ((685.6846 1552.131, 645.9985~ Commercial ""
4 4 ((-976.7845 4542.382, -1053.2~ Commercial ""
5 5 ((1259.306 3572.727, 1299.255~ Residental "2"
6 6 ((478.8969 1082.484, 473.6596~ Commercial ""
7 7 ((-1920.823 615.7447, -1960.8~ Residental ""
8 8 ((-3302.657 5394.354, -3301.5~ Commercial ""
9 9 ((-600.5789 4429.228, -495.95~ Commercial ""
10 10 ((-68.75908 5379.924, -28.782~ Residental "5"
# ... with 1,032 more rows, and 1 more variable: units <chr>
print(apartments)
Simple feature collection with 1517 features and 5 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: -4616.828 ymin: 22.16098 xmax: 2488.067 ymax: 7829.905
CRS: NA
# A tibble: 1,517 x 6
apartmentId rentalCost maxOccupancy numberOfRooms
<chr> <chr> <chr> <chr>
1 1 768.16 2 4
2 2 1014.55 2 1
3 3 1057.39 4 3
4 4 1259.1 4 3
5 5 411.5 1 4
6 6 859.58 3 2
7 7 982.11 3 4
8 8 980.05 4 1
9 9 433.45 1 3
10 10 1104.33 3 4
# ... with 1,507 more rows, and 2 more variables: location <POINT>,
# buildingId <chr>
logs_selected <- read_rds("data/rds/logs_selected.rds")
print(logs_selected)
Simple feature collection with 244271 features and 13 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: -4616.828 ymin: 35.4377 xmax: 2630 ymax: 7836.546
CRS: NA
# A tibble: 244,271 x 14
timestamp currentLocation participantId currentMode
* <chr> <POINT> <chr> <chr>
1 2022-03-01T05:~ (-1613.47 1032.372) 651 Transport
2 2022-03-01T05:~ (-4586.943 7246.79) 683 Transport
3 2022-03-01T05:~ (-4583.826 7612.941) 728 Transport
4 2022-03-01T05:~ (-1318.575 1217.017) 651 Transport
5 2022-03-01T05:~ (-4219.829 7379.372) 683 Transport
6 2022-03-01T05:~ (-4317.212 7382.659) 728 Transport
7 2022-03-01T05:~ (-1171.728 1502.45) 651 Transport
8 2022-03-01T05:~ (-4186.85 6980.755) 683 Transport
9 2022-03-01T05:~ (-4199.135 7076.865) 728 Transport
10 2022-03-01T05:~ (-3581.037 7172.023) 619 Transport
# ... with 244,261 more rows, and 10 more variables:
# hungerStatus <chr>, sleepStatus <chr>, apartmentId <chr>,
# availableBalance <chr>, jobId <chr>, financialStatus <chr>,
# dailyFoodBudget <chr>, weeklyExtraBudget <chr>, Timestamp <dttm>,
# day <int>
Characterize the distinct social areas of the city of Engagement, Ohio USA.
buildingType <- tm_shape(buildings) +
  tm_polygons(col = "buildingType",
              palette = "Accent",
              border.col = "black",
              border.alpha = .5,
              border.lwd = 0.5) +
  tm_layout(main.title = "Building Types Map",
            main.title.position = "center",
            main.title.size = 1,
            frame = FALSE) +
  tm_compass(size = 2,
             position = c('right', 'top'))
buildingType
Insights
The figure above shows three main commercial areas, one each in the middle, the north and the south of the city, and each is surrounded by residential areas.
There are two large school zones, one in the north and one in the south, and two small school zones in the west.
label <- c('Restaurant', 'Pub', 'Employer', 'Apartment', 'School')
color <- c('blue', 'green', "red", 'purple', 'yellow')
facilitiesMap <- tm_shape(buildings) +
  tm_polygons(col = "grey60",
              size = 1,
              border.col = "black",
              border.lwd = 1) +
  tm_shape(pubs) +
  tm_dots(col = "green", size = 0.3, alpha = 0.8) +
  tm_shape(restaurants) +
  tm_dots(col = "blue", size = 0.3, alpha = 0.8) +
  tm_shape(schools) +
  tm_dots(col = "yellow", size = 0.3, alpha = 0.8) +
  tm_shape(employers) +
  tm_dots(col = "red") +
  tm_shape(apartments) +
  tm_dots(col = "purple") +
  tm_add_legend(title = 'Facilities',
                type = 'symbol',
                border.col = NA,
                labels = label,
                col = color) +
  tm_layout(main.title = 'Facilities Map of Engagement City, Ohio USA',
            main.title.size = 1,
            frame = FALSE) +
  tm_compass(size = 2,
             position = c('right', 'top')) +
  tm_credits('Source: VAST Challenge 2022')
facilitiesMap
Insights
The map above gives more detail on the layout of the city’s facilities. There are more restaurants and pubs in the middle and northwest of the city, so we can deduce that there might be more traffic in those areas at weekends.
There are more apartments in the northwest than in the southeast corner, which might cause more traffic there on weekdays.
tmap_arrange(buildingType, facilitiesMap, widths = c(1))
Insights
The combined figure shows that the restaurants, pubs and employers are located in the commercial areas, mainly in the middle of the city.
The residential zones lie at the edges of the city, and residential density increases towards the north.
Where are the busiest areas in Engagement? Are there traffic bottlenecks that should be addressed?
In the code chunk below, st_make_grid() of the sf package is used to create a grid of hexagons covering the buildings layer.
hex <- st_make_grid(buildings,
                    cellsize = 100,
                    square = FALSE) %>%
  st_sf() %>%
  rowid_to_column('hex_id')
plot(hex)
In the code chunk below, st_join() of the sf package assigns each event point to the hexagon that contains it, and count() of the dplyr package then tallies the number of event points per hexagon.
points_in_hex <- st_join(logs_selected,
                         hex,
                         join = st_within) %>%
  st_set_geometry(NULL) %>%
  count(name = 'pointCount', hex_id)
head(points_in_hex)
# A tibble: 6 x 2
hex_id pointCount
<int> <int>
1 169 35
2 212 56
3 225 21
4 226 94
5 227 22
6 228 45
Next, left_join() of the dplyr package is used to perform a left join, with hex as the target table and points_in_hex as the join table. The join ID is hex_id, and the result, hex_combined, is what the mapping code uses.
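The join just described can be sketched on toy data as follows. The square study area and the random points are assumptions made for illustration. Only the names hex, points_in_hex, hex_combined, hex_id and pointCount come from this document. Filling empty hexagons with 0 via coalesce() is one possible choice, not necessarily the one used originally.

```r
library(sf)
library(dplyr)

set.seed(1)

# Hypothetical 1000 x 1000 study area standing in for the buildings layer
area <- st_sfc(st_polygon(list(rbind(c(0, 0), c(1000, 0), c(1000, 1000),
                                     c(0, 1000), c(0, 0)))))

# Hexagon grid with an explicit id column
grid <- st_make_grid(area, cellsize = 100, square = FALSE)
hex <- st_sf(hex_id = seq_along(grid), geometry = grid)

# 200 random event points well inside the area
pts <- st_as_sf(data.frame(x = runif(200, 100, 900),
                           y = runif(200, 100, 900)),
                coords = c("x", "y"))

# Assign each point to its hexagon, then count points per hexagon
points_in_hex <- st_join(pts, hex, join = st_within) %>%
  st_set_geometry(NULL) %>%
  count(hex_id, name = "pointCount")

# Left join: keep every hexagon, and set the count to 0 where no point fell
hex_combined <- hex %>%
  left_join(points_in_hex, by = "hex_id") %>%
  mutate(pointCount = coalesce(pointCount, 0L))

head(hex_combined)
```

Using hex as the left table keeps the hexagon geometry in the result, so it can be passed straight to tm_shape().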
In the code chunk below, tmap package is used to create the hexagon binning map.
traffic <- tm_shape(hex_combined %>%
                      filter(pointCount > 0)) +
  tm_fill("pointCount",
          n = 8,
          style = "quantile") +
  tm_borders(alpha = 0.1) +
  tm_layout(main.title = 'Traffic of Engagement City, Ohio USA',
            main.title.size = 1,
            frame = FALSE)
traffic
tmap_arrange(facilitiesMap,traffic, widths = c(1))
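To clarify what n = 8 with style = "quantile" means for the colour classes, the base R sketch below splits a set of counts into eight classes holding roughly equal numbers of hexagons. The counts are toy values for illustration, and the exact breaks tmap computes may differ in detail.

```r
# Toy per-hexagon counts (illustrative values only)
point_count <- c(21, 22, 35, 45, 56, 94, 120, 150,
                 180, 240, 310, 400, 520, 760, 990, 1500)

# Eight quantile classes need nine break points
n_classes <- 8
breaks <- quantile(point_count,
                   probs = seq(0, 1, length.out = n_classes + 1))

# Assign each count to its class and tabulate the class sizes
classes <- cut(point_count, breaks = breaks, include.lowest = TRUE)
table(classes)
```

Quantile classes keep a few very busy hexagons from dominating the colour scale, at the cost of class widths that vary widely.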
Insights
The map above shows that the main routes connecting the northwest to the south and the west to the east of the city are likely to see more traffic.
There is also more traffic in places with schools, restaurants and pubs.
Well-known text (WKT) is a human-readable representation for spatial objects like points, lines, or enclosed areas on a map, and it helps when doing geospatial visualizations.
During this exercise, we learned how to import geospatial data in WKT format into R and save the imported data as simple feature objects by using the sf package, how to map geospatial data using the tmap package, how to process movement data by using the sf and tidyverse packages, and how to visualise movement data by using the tmap and ggplot2 packages.