6 min read

Where they fell: Looking at Meteorites!

One of my childhood dreams has been to up close to a meteorite. Imagine. Being in close proximity to something that has travelled a unimaginably large distance present right in front of you.

While this dataset by the Tidy Tuesday project does not quite let me do that, it still makes for a nice exercise in animation via the gganimate package.

As always, the first thing I do is load in the tidyverse, followed by loading in the data and inspecting it briefly.

library(tidyverse)
meteorites <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-06-11/meteorites.csv")

theme_set(theme_light()) #personal theme preference

meteorites %>% 
  glimpse()
## Observations: 45,716
## Variables: 10
## $ name        <chr> "Aachen", "Aarhus", "Abee", "Acapulco", "Achiras",...
## $ id          <dbl> 1, 2, 6, 10, 370, 379, 390, 392, 398, 417, 423, 42...
## $ name_type   <chr> "Valid", "Valid", "Valid", "Valid", "Valid", "Vali...
## $ class       <chr> "L5", "H6", "EH4", "Acapulcoite", "L6", "EH4", "LL...
## $ mass        <dbl> 21, 720, 107000, 1914, 780, 4239, 910, 30000, 1620...
## $ fall        <chr> "Fell", "Fell", "Fell", "Fell", "Fell", "Fell", "F...
## $ year        <dbl> 1880, 1951, 1952, 1976, 1902, 1919, 1949, 1814, 19...
## $ lat         <dbl> 50.77500, 56.18333, 54.21667, 16.88333, -33.16667,...
## $ long        <dbl> 6.08333, 10.23333, -113.00000, -99.90000, -64.9500...
## $ geolocation <chr> "(50.775, 6.08333)", "(56.18333, 10.23333)", "(54....

We get a few columns that immediately pop out to me, specifically the lat and long columns. This brings to mind the idea of visualizing a world map, which is what we will be proceeding to do.

The year column also gives us the possibility of getting a few insights that related to time, which can be done by the lubridate package.

library(lubridate)
## 
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
## 
##     date
meteorites %>% 
  summary()
##      name                 id         name_type            class          
##  Length:45716       Min.   :    1   Length:45716       Length:45716      
##  Class :character   1st Qu.:12689   Class :character   Class :character  
##  Mode  :character   Median :24262   Mode  :character   Mode  :character  
##                     Mean   :26890                                        
##                     3rd Qu.:40657                                        
##                     Max.   :57458                                        
##                                                                          
##       mass              fall                year           lat        
##  Min.   :       0   Length:45716       Min.   : 860   Min.   :-87.37  
##  1st Qu.:       7   Class :character   1st Qu.:1987   1st Qu.:-76.71  
##  Median :      33   Mode  :character   Median :1998   Median :-71.50  
##  Mean   :   13278                      Mean   :1992   Mean   :-39.12  
##  3rd Qu.:     203                      3rd Qu.:2003   3rd Qu.:  0.00  
##  Max.   :60000000                      Max.   :2101   Max.   : 81.17  
##  NA's   :131                           NA's   :291    NA's   :7315    
##       long         geolocation       
##  Min.   :-165.43   Length:45716      
##  1st Qu.:   0.00   Class :character  
##  Median :  35.67   Mode  :character  
##  Mean   :  61.07                     
##  3rd Qu.: 157.17                     
##  Max.   : 354.47                     
##  NA's   :7315
meteorites %>% 
  mutate(class = as.factor(class)) %>% 
  glimpse()
## Observations: 45,716
## Variables: 10
## $ name        <chr> "Aachen", "Aarhus", "Abee", "Acapulco", "Achiras",...
## $ id          <dbl> 1, 2, 6, 10, 370, 379, 390, 392, 398, 417, 423, 42...
## $ name_type   <chr> "Valid", "Valid", "Valid", "Valid", "Valid", "Vali...
## $ class       <fct> "L5", "H6", "EH4", "Acapulcoite", "L6", "EH4", "LL...
## $ mass        <dbl> 21, 720, 107000, 1914, 780, 4239, 910, 30000, 1620...
## $ fall        <chr> "Fell", "Fell", "Fell", "Fell", "Fell", "Fell", "F...
## $ year        <dbl> 1880, 1951, 1952, 1976, 1902, 1919, 1949, 1814, 19...
## $ lat         <dbl> 50.77500, 56.18333, 54.21667, 16.88333, -33.16667,...
## $ long        <dbl> 6.08333, 10.23333, -113.00000, -99.90000, -64.9500...
## $ geolocation <chr> "(50.775, 6.08333)", "(56.18333, 10.23333)", "(54....
#Inspecting for missing data
mean(is.na(meteorites$lat))
## [1] 0.1600096
mean(is.na(meteorites$long))
## [1] 0.1600096
mean(is.na(meteorites$mass))
## [1] 0.002865518
meteorites %>% 
  count(name, sort = TRUE)
## # A tibble: 45,716 x 2
##    name                n
##    <chr>           <int>
##  1 Aachen              1
##  2 Aarhus              1
##  3 Abajo               1
##  4 Abar al' Uj 001     1
##  5 Abbott              1
##  6 Abee                1
##  7 Abernathy           1
##  8 Abo                 1
##  9 Abu Moharek         1
## 10 Acapulco            1
## # ... with 45,706 more rows
#Years with the most recorded impacts
meteorites %>% 
  count(year, sort = TRUE)
## # A tibble: 266 x 2
##     year     n
##    <dbl> <int>
##  1  2003  3323
##  2  1979  3046
##  3  1998  2697
##  4  2006  2456
##  5  1988  2296
##  6  2002  2078
##  7  2004  1940
##  8  2000  1792
##  9  1997  1696
## 10  1999  1691
## # ... with 256 more rows
#Extracting the century and decade from the year variable.
meteoritesProcessed <- meteorites %>% 
  mutate(century = year%/%100,
         decade = ((year%%100)%/%10)*10,
         class = fct_lump(class, 10))  #Keeping the top 10 most common classes of meteorites.

#Counting the lumped classes
meteorites %>% 
  mutate(class = fct_lump(class, 10)) %>% 
  count(class, sort = TRUE)
## # A tibble: 11 x 2
##    class     n
##    <fct> <int>
##  1 Other  9734
##  2 L6     8339
##  3 H5     7164
##  4 L5     4817
##  5 H6     4529
##  6 H4     4222
##  7 LL5    2766
##  8 LL6    2045
##  9 L4     1256
## 10 H4/5    428
## 11 CM2     416

This brings us to the exciting stuff. Before we get into it, there are a few concepts I’d like to highlight here:

  • The borders() function is a quick and easy way to get political borders onto my ggplot object.
  • The transition_states() function in the gganimate package allows us to define the various states of our animation. Basically, this is the function that allows us to link various visualizations depicting different values of the same variable into one animation.
  • The enter_fade() function does exactly what it sounds like: it decides the entry animation of a new frame of our animation.
  • The ease_aes() function allows tweening of the frames in our animation. It interpolates individual frames based on the functions we pass as parameters to it.

With those concepts out in the open, we can create our animation now!

library(gganimate)
library(RColorBrewer)

#Storing the number of classes
numCols <- meteoritesProcessed %>%
            distinct(class) %>%
            nrow()

#Creating a custom colour palette with more colours than the default 8
custColors <- colorRampPalette(brewer.pal(8,"Dark2"))(numCols)


meteoritesProcessed %>% 
  group_by(century) %>% 
  filter(n()>10) %>%  #removing infrequent groups
  ungroup() %>% 
  filter(!is.na(mass), #removing missing data
         !is.na(century),
         !is.na(class)) %>% 
  ggplot(aes(long, lat)) +
  borders() + #adding world map borders to our plot
  geom_point(aes(color = class, size = mass, group = century), alpha = 0.1) +
  transition_states(century) +  #unique frame for each century
  enter_fade() +  #default fade in animation
  ease_aes('exponential-in') +  #using an exponential function to tween our frames
  scale_color_manual(values = custColors) +
  scale_size_continuous(range = c(2,12)) +
  coord_map() +
  expand_limits(x = c(-200,200)) +
  labs(x = "",
       y = "",
       title = "Meteorite impacts over the centuries",
       subtitle = paste("{closest_state}","th", "century")) +  #displaying the current state of the animation
  theme(legend.position = "bottom",
        legend.direction = "horizontal",
        legend.box = "vertical",
        plot.title = element_text(size = 14, face = "bold")) +
  labs(colour = "Class",
       size = "Mass") +
  guides(colour = guide_legend(override.aes = list(alpha = 1)))  #default behaviour makes the legend transparent, this overrides it

anim_save("meteoritesGif.gif")

And we’re done! This was a quick blog post I wrote up based off the visualization I made a while back. I’ve always enjoyed plotting maps, and this might be one of my favorite recent visualizations I’ve done!