6.5 Net Sentiment Score (NSS)
In this step, we aggregate the sentiment at the airline level so that we have just one observation per airline. Note, however, that airlines_sent does not have any column identifying the airline, because we used only the text column from that data set. In the code below, we first add back the relevant variables using cbind(). The variables of interest are airline, favorite_count, and retweet_count. We retain favorite_count and retweet_count because they can be used as weights.
I have commented each block in the code below, and most of them are self-explanatory. The last block, where we calculate the net sentiment scores (NSS), needs some explanation. NSS is similar to the (in)famous Net Promoter Score (NPS).39 The idea is to take the difference between the positive and negative sentiment scores and divide it by the total number of tweets (or by the sum of weights for the weighted metric). The general NSS formula for our case is as follows:
\[\small NSS = \frac{\sum_i w_i \, PS_i - \sum_i w_i \, NS_i}{\sum_i w_i} \]
where \(\small w_i\) is the weight assigned to tweet \(\small i\) (i.e., its number of favorites or retweets), \(\small PS_i\) is the positive sentiment score of that tweet, and \(\small NS_i\) is its negative sentiment score. For raw NSS, where we do not weight by the number of favorites or retweets, \(\small w_i = 1 \quad \forall i\), so the denominator is simply the number of tweets.
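To make the formula concrete, here is a minimal sketch in base R that computes the raw and the favorite-weighted NSS for three tweets. The data frame tweets, its values, and the helper nss() are invented for illustration and are not part of the airline data set.

```r
# Toy data: three hypothetical tweets (values invented for illustration)
tweets <- data.frame(
  positive       = c(2, 0, 1),   # positive sentiment score PS_i
  negative       = c(0, 3, 1),   # negative sentiment score NS_i
  favorite_count = c(10, 1, 0)   # weight w_i for the weighted variant
)

# Hypothetical helper: weighted difference of positive and negative
# scores divided by the sum of weights. With the default w = 1 for
# every tweet, the denominator is the number of tweets.
nss <- function(pos, neg, w = rep(1, length(pos))) {
  sum(w * pos - w * neg) / sum(w)
}

nss_raw <- nss(tweets$positive, tweets$negative)
# (3 - 4) / 3 = -1/3

nss_fav <- nss(tweets$positive, tweets$negative, tweets$favorite_count)
# (20 - 3) / 11 = 17/11
```

Note how the single heavily favorited positive tweet flips the sign of the score once weights are applied. Also note that a tweet with zero favorites drops out of the weighted metric entirely, and that the weighted score is undefined when all weights are zero.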
There is no a priori reason to believe that NSS will be strongly correlated with ACSI. However, it was in my blog post, and here we assess whether that relationship still holds.
# Add back the identifying and weighting variables.
# cbind() matches rows by position, so airlines_df and airlines_sent
# must have the same number of rows in the same order.
airlines_final <- cbind(
  airlines_df %>% select(airline, favorite_count, retweet_count),
  airlines_sent %>% select(negative, positive)
) %>%
  # Create new "weighted" variables
  mutate(negative_fav = negative * favorite_count,
         positive_fav = positive * favorite_count,
         negative_rt  = negative * retweet_count,
         positive_rt  = positive * retweet_count) %>%
  # Get the sum of these variables for each airline
  group_by(airline) %>%
  summarise(neg_sum     = sum(negative),
            neg_fav_sum = sum(negative_fav),
            neg_rt_sum  = sum(negative_rt),
            pos_sum     = sum(positive),
            pos_fav_sum = sum(positive_fav),
            pos_rt_sum  = sum(positive_rt),
            fav_sum     = sum(favorite_count),
            rt_sum      = sum(retweet_count),
            tot_obs     = n()) %>%
  ungroup() %>%
  # Calculate sentiment metrics (raw, favorite-weighted, retweet-weighted)
  mutate(nss     = (pos_sum - neg_sum) / tot_obs,
         nss_fav = (pos_fav_sum - neg_fav_sum) / fav_sum,
         nss_rt  = (pos_rt_sum - neg_rt_sum) / rt_sum) %>%
  # Add the column of customer satisfaction.
  # The values are assigned by position, so they must follow the row
  # order of the summarised data (sorted by airline).
  mutate(acsi = c(80, 71, 73, 75, 64, 79, 79, 63, 70))
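The acsi values above are attached by position, which works only as long as the vector is listed in the same order as the summarised rows. A possibly more robust alternative is to keep the scores in a small lookup table and join on the airline name. The sketch below uses invented airline names and values; nss_by_airline and acsi_lookup are hypothetical objects, not ones defined in this chapter.

```r
library(dplyr)

# Hypothetical stand-in for the per-airline summary
# (airline names and NSS values are invented)
nss_by_airline <- data.frame(
  airline = c("Airline A", "Airline B", "Airline C"),
  nss     = c(0.12, -0.05, 0.30)
)

# Hypothetical lookup table, deliberately listed in a different order
acsi_lookup <- data.frame(
  airline = c("Airline C", "Airline A", "Airline B"),
  acsi    = c(79, 80, 71)
)

# Match scores by airline name rather than by row position
joined <- nss_by_airline %>%
  left_join(acsi_lookup, by = "airline")
```

With a join, a misspelled or missing airline shows up as an NA in acsi instead of silently shifting every score by one row.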