Introduction

We will practice some visualisation techniques related to geolocation.

Maps

To create a map in ggplot2, we retrieve map data from the maps package and draw it with geom_polygon(). By default, longitude and latitude are drawn on a Cartesian coordinate plane, but we can use the coord_map() function to specify a projection. Note that coord_map() needs the mapproj package to be installed.

Let’s start with drawing a world map.

# load tidyverse package
library(tidyverse)
# load maps package to get the map data
library(maps)

world <- map_data("world")

ggplot(world, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "gray90", colour = "gray30", size = 0.25) +
  theme_void() 

We can draw a specific country as follows:

# get map data for UK
uk <- map_data("world", region = "UK")

ggplot(uk, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "white", colour = "black") +
  coord_map("mercator")
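
To see what a projection does to the coordinates, here is a minimal base R sketch of the spherical Mercator formula, y = ln(tan(pi/4 + lat/2)). The helper name mercator_y is made up for this illustration, and the code is not taken from coord_map(), which delegates the actual projection to the mapproj package.

```r
# Hypothetical helper (not part of ggplot2, maps or mapproj):
# spherical Mercator y-coordinate for a latitude given in degrees.
mercator_y <- function(lat_deg) {
  phi <- lat_deg * pi / 180   # degrees to radians
  log(tan(pi / 4 + phi / 2))
}

# Equal 10-degree steps in latitude, from the equator to 60 degrees north
lat <- seq(0, 60, by = 10)
y <- mercator_y(lat)

# The step between successive y values grows with latitude
data.frame(lat = lat, y = round(y, 3), step = c(NA, round(diff(y), 3)))
```

Because the steps grow as we move north, the same distance on the ground takes up more space on the plot at higher latitudes, which is why the north of the UK looks stretched compared with an unprojected plot.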

Choropleth

For this demo, we will use the USArrests dataset. For details on the dataset, run ?USArrests.

# using the USArrests dataset
crimes <- data.frame(USArrests) %>% 
  mutate(state = tolower(rownames(USArrests)))

# get a map by states
states_map <- map_data("state")

# merge datasets together
crime_map <- states_map %>% 
  left_join(crimes, by = c("region" = "state")) %>% 
  arrange(group, order)

ggplot(data = crime_map, aes(x=long, y=lat, group = group, fill = Assault))+
  geom_polygon(colour = "black")+
  coord_map("polyconic")
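
A left join keeps every polygon row and fills in NA wherever a region has no match in the crime data, so it is worth checking which names do not line up. The sketch below is one possible check; the object names map_regions and crime_states are introduced here for illustration only.

```r
library(ggplot2)  # map_data()
library(maps)     # state polygons used by map_data("state")

# State names as they appear in each source, lower-cased to match
map_regions  <- unique(map_data("state")$region)
crime_states <- tolower(rownames(USArrests))

# Regions drawn on the map that have no row in USArrests (filled with NA)
setdiff(map_regions, crime_states)

# States in USArrests that the "state" map does not draw
setdiff(crime_states, map_regions)
```

The state map covers the contiguous states plus the District of Columbia, so Alaska and Hawaii should appear in the second result, and any region listed in the first result will be drawn in the NA colour (grey by default).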

Let’s try a different colour scheme that is perhaps more intuitive.

# Let's use a colour palette from the RColorBrewer package
library(RColorBrewer)

ggplot(data = crime_map, aes(x=long, y=lat, group = group, fill = Assault))+
  geom_polygon(colour = "black")+
  coord_map("polyconic") +
  scale_fill_gradientn(colors = brewer.pal(8,"Reds")) +
  theme_void() # get rid of background

Proportional Symbol Map

Instead of using colour to encode the value, let’s draw circles and scale their area to encode the values.

# find a point per state to draw circles
summary <- crime_map %>% 
  group_by(region) %>% 
  summarise(mean_long = mean(long),
            mean_lat = mean(lat),
            Assault = first(Assault))

# first draw a background map, then draw the circles on top

ggplot()+
  geom_polygon(data = crime_map, aes(x=long, y=lat, group = group), fill = "gray90", colour = "gray30", size = 0.25)+
  theme_void() +
  geom_point(data = summary, aes(x = mean_long, y = mean_lat, size = Assault), color = "red", alpha = 0.6) +
  scale_size_area(max_size = 10)
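
The reason for scaling area rather than radius can be sketched in base R. The helper area_size below is hypothetical and only mirrors the idea behind scale_size_area(); it is not ggplot2's internal code. The symbol's size (a length) grows with the square root of the value, so its area grows in proportion to the value, and a value of 0 gets size 0.

```r
# Hypothetical helper illustrating area-proportional sizing:
# size = max_size * sqrt(value / largest value)
area_size <- function(value, max_size = 10) {
  max_size * sqrt(value / max(value, na.rm = TRUE))
}

# Sizes for the Assault column used in the map above
assault_size <- area_size(USArrests$Assault)
range(assault_size)

# A value four times larger gets a circle twice as wide,
# which means four times the area.
area_size(c(50, 200), max_size = 10)
```

If the radius were scaled linearly instead, a state with four times the assaults would be drawn with sixteen times the area, which overstates the difference.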

Task 1

Answer

# ?USArrests

crimes <- data.frame(USArrests) %>% 
  mutate(state = tolower(rownames(USArrests)))

# get a map by states
states_map <- map_data("state")

# merge datasets together
crime_map <- states_map %>% 
  left_join(crimes, by = c("region" = "state")) %>% 
  arrange(group, order)


library(RColorBrewer)

# display.brewer.all()

ggplot(data = crime_map, aes(x=long, y=lat, group = group, fill = UrbanPop))+
  geom_polygon(colour = "black")+
  coord_map("polyconic") +
  scale_fill_gradientn(colors = brewer.pal(8,"Greens")) +
  theme_void() # get rid of background

# find a point per state to draw circles
summary <- crime_map %>% 
  group_by(region) %>% 
  summarise(mean_long = mean(long),
            mean_lat = mean(lat),
            UrbanPop = first(UrbanPop))

# first draw a background map, then draw the circles on top

ggplot()+
  geom_polygon(data = crime_map, aes(x=long, y=lat, group = group), fill = "gray90", colour = "gray30", size = 0.25)+
  theme_void() +
  geom_point(data = summary, aes(x = mean_long, y = mean_lat, size = UrbanPop), color = "blue", alpha = 0.6) +
  scale_size_area(max_size = 10)

Task 2

Dataset

  • A cleaned data frame based on a dataset of personal well-being estimates in the UK is provided. The original data and its description can be found here.
  • The provided dataset (wellbeing_uk_avg.RData) includes estimates of life satisfaction, worthwhile, happiness and anxiety at the local authority level.
    • time: 2011-12, 2012-13, 2013-14, 2014-15, 2015-16, 2016-17, 2017-18, 2018-19, 2019-20
    • geography_id: use this key to join to the polygon data
    • geography_name: name of the region
    • measure: anxiety, happiness, life-satisfaction, worthwhile
    • estimate: “average-mean”
    • score: the average score on a scale of 0 to 10. Note that some values are missing.
  • The survey question used to collect the data: “Next I would like to ask you four questions about your feelings on aspects of your life. There are no right or wrong answers. For each of these questions I’d like you to give an answer on a scale of 0 to 10, where 0 is ‘not at all’ and 10 is ‘completely’.”
    • Life satisfaction: Overall, how satisfied are you with your life nowadays?
    • Worthwhile: Overall, to what extent do you feel that the things you do in your life are worthwhile?
    • Happiness: Overall, how happy did you feel yesterday?
    • Anxiety: On a scale where 0 is “not at all anxious” and 10 is “completely anxious”, overall, how anxious did you feel yesterday?
      • Source: Office for National Statistics
  • The polygon data to draw shapes in ggplot2 is provided (mapdata_uk.RData). This data was prepared from a shapefile downloaded from here.
# To get you started, here is how you can load the data. Change the file paths according to your working directory.
load(file = "wellbeing_uk_avg.RData")
load(file = "mapdata_uk.RData")

# merge datasets together
wellbeing_data_map <- mapdata %>% 
  left_join(wellbeing_data_avg, by = c("id" = "geography_id")) %>% 
  arrange(group, order)

# keep only the happiness scores for the 2019-20 period
# (dplyr is already attached via tidyverse, so no extra library() call is needed)
wellbeing_happiness_map <- wellbeing_data_map %>% 
  filter(measure == "happiness", time == "2019-20")


library(RColorBrewer)

# Draw the map, filling each local authority by its happiness score
ggplot()+
  geom_polygon(data = wellbeing_happiness_map, aes(x = long, y = lat, group = group, fill = score), 
               color ="white", size = 0.25)+
  coord_equal() +
  ggtitle("Happiness score UK") +
  scale_fill_gradientn(colors = brewer.pal(8,"Blues")) +
  theme_void()

#display.brewer.all()
# Load the data
load(file = "wellbeing_uk_avg.RData")
load(file = "mapdata_uk.RData")

# merge datasets together
wellbeing_data_map <- mapdata %>% 
  left_join(wellbeing_data_avg, by = c("id" = "geography_id")) %>% 
  arrange(group, order)

# keep only the life-satisfaction scores for the 2019-20 period
# (dplyr is already attached via tidyverse, so no extra library() call is needed)
wellbeing_life_satisfaction_map <- wellbeing_data_map %>% 
  filter(measure == "life-satisfaction", time == "2019-20")


library(RColorBrewer)

# Draw the map
ggplot()+
  geom_polygon(data = wellbeing_life_satisfaction_map, aes(x = long, y = lat, group = group, fill = score), 
               color ="white", size = 0.25)+
  ggtitle("Life satisfaction score UK") +
  coord_equal() +
  scale_fill_gradientn(colors = brewer.pal(8,"Blues")) +
  theme_void()

Insight 1

The variance is low in both measures (mean happiness and mean life satisfaction), as we can see from the small gap between the highest and lowest mean scores on the chart. This makes the maps harder to read, because the colour scale spreads across differences that may be too small to be meaningful.
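
One way to back up this reading is to compute the spread per measure directly. The sketch below uses a small made-up data frame with the same column names as the provided data; the scores are invented for illustration and are not values from the ONS dataset.

```r
library(dplyr)

# Made-up scores for illustration only; NOT taken from the ONS data
toy <- data.frame(
  geography_name = rep(c("Area A", "Area B", "Area C"), times = 2),
  measure = rep(c("happiness", "life-satisfaction"), each = 3),
  time = "2019-20",
  score = c(7.2, 7.5, NA, 7.4, 7.8, 7.6)
)

# Number of missing scores and the gap between extremes, per measure
spread <- toy %>%
  filter(time == "2019-20") %>%
  group_by(measure) %>%
  summarise(
    n_missing = sum(is.na(score)),
    lowest    = min(score, na.rm = TRUE),
    highest   = max(score, na.rm = TRUE),
    range     = highest - lowest
  )

spread
```

Running the same pipeline on wellbeing_data_avg instead of toy would show how narrow the real range is, and how many local authorities have no score for the chosen period.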

Insight 2

In both the happiness and life-satisfaction visualisations, Northern Ireland shows comparatively high mean scores. The colour also becomes lighter, meaning lower scores, around big cities such as London and Manchester. One possible explanation is that city life brings additional pressures that affect how its inhabitants feel.