5.3 Collect tweets

The first order of business is to collect the tweets. Ideally, you would collect tweets about a topic that is trending on the day you are reading this text. Currently,[28] for me it is “Liverpool”, referring to the football (soccer) team in the UK. Liverpool beat Barcelona to qualify for the Champions League final.

Load the necessary libraries:

library(rtweet)  # Twitter package
library(dplyr)
library(ggplot2)
library(sf)   # For making maps
library(usmap)
library(reshape2)

# Packages for text analysis and wordcloud
library(tm)
library(syuzhet)
library(tidytext)
library(ggwordcloud)
library(SnowballC)
library(wordcloud)
library(RColorBrewer)
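If any of these packages is not installed, library() stops with an error. A minimal sketch for checking up front, assuming the same package list as above; the names `pkgs`, `is_installed`, and `missing_pkgs` are illustrative and not part of any package:

```r
# Packages used in this chapter (same list as the library() calls above)
pkgs <- c("rtweet", "dplyr", "ggplot2", "sf", "usmap", "reshape2",
          "tm", "syuzhet", "tidytext", "ggwordcloud",
          "SnowballC", "wordcloud", "RColorBrewer")

# requireNamespace() returns FALSE instead of stopping when a package is absent
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Install these first: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All packages are installed.")
}
```

Anything reported as missing can then be installed with install.packages() before you run the library() calls.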

Next, load your Twitter token into the environment. The file is read from wherever you saved your token; the call below uses here::here(), which resolves the path relative to your project's root directory. I am assuming that your token is saved under the name “twitter_token”. Again, note that there is no file extension because you saved it without one.

load(here::here("twitter_token"))

After this, you should see your Twitter token in the Global Environment, in the top-right pane of RStudio.
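To see why the token shows up under its own name, it may help to try load() on a stand-in object. The sketch below uses a made-up list in place of a real token and a temporary file instead of your project folder; both are assumptions for illustration only:

```r
# A dummy object standing in for the real token (hypothetical value)
twitter_token <- list(app = "my_app")

# Save it to a temporary file with no extension, then remove it from the session
token_path <- file.path(tempdir(), "twitter_token")
save(twitter_token, file = token_path)
rm(twitter_token)

# load() restores the object under the name it was saved with
# and returns that name as a character vector
restored <- load(token_path)
print(restored)
print(exists("twitter_token"))
```

The object comes back as twitter_token because that was its name when save() was called; the file name itself plays no role in naming it.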

5.3.1 Get the tweets mentioning “Liverpool”

We will use the search_tweets() function to search for and download tweets. Twitter’s rate limit caps downloads at 18,000 tweets every 15 minutes. If you want more than that, you will have to wait for the limit to reset.[29] We will download 18,000 English-language tweets sent from the USA. You can easily change these settings as you want.

lp <- search_tweets(q = "Liverpool",
                    lang = "en",
                    geocode = lookup_coords("usa"),
                    n = 18000,
                    include_rts = FALSE  # exclude retweets
                    )
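The rate-limit arithmetic can be sketched as a small helper. This is only a rough estimate, assuming the limit of 18,000 tweets per 15-minute window quoted above; `wait_minutes()` is a hypothetical function written for this illustration, not part of rtweet:

```r
# Minutes spent waiting on the rate limit for a request of n tweets,
# assuming 18,000 tweets per 15-minute window (the limit quoted above)
wait_minutes <- function(n, per_window = 18000, window_minutes = 15) {
  windows_needed <- ceiling(n / per_window)
  # The first window needs no wait; each later one costs a full pause
  max(windows_needed - 1, 0) * window_minutes
}

wait_minutes(18000)  # 0
wait_minutes(20000)  # 15
wait_minutes(50000)  # 30
```

The estimate covers only the enforced pauses, not the download time itself.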

The search and download usually take only a couple of minutes. Twitter’s API returns the data in JSON format, but rtweet converts it into a data frame.
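To get a feel for what converting JSON into a data frame means, here is a small sketch using the jsonlite package. The two records are invented, and the code is not how rtweet parses responses internally; it only illustrates the general idea on a toy input:

```r
library(jsonlite)

# Two made-up records shaped loosely like tweet JSON (field values are invented)
raw_json <- '[
  {"screen_name": "fan_a", "text": "Liverpool are through!"},
  {"screen_name": "fan_b", "text": "What a comeback at Anfield"}
]'

# fromJSON() simplifies an array of flat objects into a data.frame
tweets_df <- fromJSON(raw_json)

str(tweets_df)
nrow(tweets_df)  # 2
```

Each JSON object becomes a row and each field becomes a column, which is the shape you work with in the rest of the chapter.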


  28. May 7, 2019

  29. For instance, if you request 20,000 tweets, the first 18,000 tweets are downloaded and then the rate limit sets in. After a wait of about 15 minutes, the remaining 2,000 tweets are downloaded. In rtweet, this automatic wait-and-resume requires setting retryonratelimit = TRUE in search_tweets().