COVID-19 vaccine tweet sentiment analysis with fastai - part 2
This is part two of a two-part NLP series where we carry out sentiment analysis on COVID-19 vaccine tweets. In this part, we visualise changes in tweet sentiment over time for each vaccine, investigate the relationship between sentiment and vaccination progress in different countries and look at the most common words in positive, neutral and negative tweets.
- Analysing overall sentiment
- Timeline analysis for each vaccine
- Further analysis using 'smarter' word clouds
- Conclusion
In part 1, we trained a sentiment classification model and used it to predict the sentiment of tweets about COVID-19 vaccines. Our focus in this part will be to analyse the results from our model.
First, let's load in the data from part 1 and plot the frequency of each sentiment.
import numpy as np
import pandas as pd
import plotly.express as px

vax_tweets = pd.read_csv('https://raw.githubusercontent.com/twhelan22/blog/master/data/vax_tweets_inc_sentiment.csv', index_col=0, parse_dates=['date'])
# Plot sentiment value counts
vax_tweets['sentiment'].value_counts(normalize=True).plot.bar(title='COVID-19 vaccine tweet sentiment');
We can see that the predominant sentiment is neutral, with more positive tweets than negative. It's encouraging that negative sentiment isn't higher! We can also visualise how sentiment changes over time:
# Get counts of number of tweets by sentiment for each date
timeline = vax_tweets.groupby(['date', 'sentiment']).agg(**{'tweets': ('id', 'count')}).reset_index().dropna()
# Plot results
fig = px.line(timeline, x='date', y='tweets', color='sentiment', category_orders={'sentiment': ['neutral', 'negative', 'positive']},
title='Timeline showing sentiment of tweets about COVID-19 vaccines')
fig.show()
There was a big spike in the number of tweets on March 1st 2021, so let's investigate further. A lot of the tweets appear to be from users in India:
spike = vax_tweets[vax_tweets['date'].astype(str)=='2021-03-01']
spike['user_location'].value_counts(ascending=False).head(10)
# Look at tweets from the users with the most followers
spike = spike.sort_values('user_followers', ascending=False)
spike['orig_text'].head()
It looks like Indian Prime Minister Narendra Modi received the first dose of the Indian-developed Covaxin on March 1st. No wonder there were lots of tweets! To dig deeper, let's plot timelines for each vaccine individually.
all_vax = ['covaxin', 'sinopharm', 'sinovac', 'moderna', 'pfizer', 'biontech', 'oxford', 'astrazeneca', 'sputnik']
# Function to filter the data to a single vaccine
# Note: a lot of the tweets seem to contain hashtags for multiple vaccines even though
# they are specifically referring to one vaccine - not very helpful! (We quantify this below.)
def filtered_df(df, vax):
    df = df.dropna()
    # Keep tweets that mention any of the requested vaccine names...
    df_filt = pd.concat([df[df['orig_text'].str.lower().str.contains(o)] for o in vax])
    # ...then drop any that also mention another vaccine
    other_vax = list(set(all_vax) - set(vax))
    for o in other_vax:
        df_filt = df_filt[~df_filt['orig_text'].str.lower().str.contains(o)]
    df_filt = df_filt.drop_duplicates()
    return df_filt
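Since many tweets tag several vaccines at once, here is a quick sanity check of how common that is (a rough sketch, using the orig_text column from part 1):
# Share of tweets whose text mentions two or more different vaccine names
text_lower = vax_tweets['orig_text'].astype(str).str.lower()
n_mentions = sum(text_lower.str.contains(v).astype(int) for v in all_vax)
print(f"{(n_mentions > 1).mean():.1%} of tweets mention more than one vaccine")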
# Function to plot the timeline
def plot_timeline(df, title):
    title_str = 'Timeline showing sentiment of tweets about the ' + title + ' vaccine'
    timeline = df.groupby(['date', 'sentiment']).agg(**{'tweets': ('id', 'count')}).reset_index()
    fig = px.line(timeline, x='date', y='tweets', color='sentiment', category_orders={'sentiment': ['neutral', 'negative', 'positive']}, title=title_str)
    fig.show()
covaxin = filtered_df(vax_tweets, ['covaxin'])
plot_timeline(covaxin, title='Covaxin')
# Function to filter the data to a single date and print tweets from users with the most followers
def date_filter(df, date):
    return df[df['date'].astype(str) == date].sort_values('user_followers', ascending=False)[['date', 'orig_text']]

def date_printer(df, dates, num=10):
    for date in dates:
        display(date_filter(df, date).head(num))
date_printer(covaxin, ['2021-03-01', '2021-03-03'])
Modi wasn't the only person to make news on March 1st; India's External Affairs Minister and a 100-year-old Hyderabad resident also received their first dose of Covaxin. On March 3rd, phase 3 trial results for Covaxin were published, showing 81% efficacy. It makes sense for there to be a spike in the number of neutral and positive tweets about Covaxin on those dates!
sinovac = filtered_df(vax_tweets, ['sinovac'])
plot_timeline(sinovac, title='Sinovac')
Some notable dates:
date_printer(sinovac, ['2021-02-22', '2021-02-28', '2021-03-01', '2021-03-03', '2021-03-08'], 3)
These tweets are about countries starting their vaccination programme or receiving a new shipment of vaccines. Let's use the 'COVID-19 World Vaccination Progress' dataset to plot daily vaccinations for the mentioned countries:
vax_progress = pd.read_csv('https://raw.githubusercontent.com/twhelan22/blog/master/data/country_vaccinations.csv', index_col=0, parse_dates=['date'])
countries = ['Brazil', 'Thailand', 'Hong Kong', 'Colombia', 'Mexico', 'Philippines', 'Indonesia']
fig = px.line(vax_progress[vax_progress['country'].isin(countries)], x='date', y='daily_vaccinations_per_million', color='country',
title='Daily vaccinations per million (all vaccines) in selected countries')
fig.show()
We can see that daily vaccinations per million increased significantly in Colombia and Mexico after they received new shipments of vaccines. Daily vaccinations are also increasing rapidly in Hong Kong after Carrie Lam received the vaccine on February 22nd; however, progress has been slower in Thailand and the Philippines so far.
sinopharm = filtered_df(vax_tweets, ['sinopharm'])
plot_timeline(sinopharm, title='Sinopharm')
As with Sinovac, most of the Sinopharm tweets appear to be positive news regarding countries receiving a shipment of the vaccine:
date_printer(sinopharm, ['2021-02-18', '2021-02-24', '2021-03-02'], 3)
countries = ['Senegal', 'Nepal', 'Hungary', 'Bolivia', 'Lebanon']
fig = px.line(vax_progress[vax_progress['country'].isin(countries)], x='date', y='daily_vaccinations_per_million', color='country',
title='Daily vaccinations per million (all vaccines) in selected countries')
fig.show()
We can see that Hungary ramped up their vaccination programme after the news on February 18th that they would become the first EU country to start administering Sinopharm. In addition, Senegal started vaccinating shortly after positive tweets confirmed that they had received a shipment of Sinopharm vaccines. Unfortunately there is no data for Iraq, but they also started their programme just hours after receiving a donation of vaccines from China.
moderna = filtered_df(vax_tweets, ['moderna'])
plot_timeline(moderna, title='Moderna')
Some notable dates:
date_printer(moderna, ['2021-02-17', '2021-03-05', '2021-03-11'], 3)
On March 2nd, Dolly Parton received her dose of the vaccine she helped fund, which explains the initial increase in positive tweets prior to the news about Moderna's collaboration with IBM. By looking at the vaccination progress data, we can see that the median daily vaccinations per million in EU countries started to pull further ahead of the rest of the world after the news that they would purchase up to 300m extra Moderna vaccines:
countries = ['Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'Czechia', 'Denmark',
'Estonia', 'Finland', 'France', 'Germany', 'Greece', 'Hungary', 'Ireland', 'Italy',
'Latvia', 'Lithuania', 'Luxembourg', 'Malta', 'Netherlands', 'Poland', 'Portugal',
'Romania', 'Slovakia', 'Slovenia', 'Spain','Sweden']
eu = vax_progress[vax_progress['country'].isin(countries)].groupby('date')['daily_vaccinations_per_million'].median().reset_index()
eu['region'] = 'EU'
row = vax_progress[~vax_progress['country'].isin(countries)].groupby('date')['daily_vaccinations_per_million'].median().reset_index()
row['region'] = 'Rest of world'
fig = px.line(pd.concat([eu, row]), x='date', y='daily_vaccinations_per_million', color='region',
              title='Median daily vaccinations per million (all vaccines) in EU countries vs the rest of the world')
fig.add_vline(x='2021-02-17', line_width=3, line_dash='dash', line_color='#00cc96')
fig.add_annotation(x='2021-02-17', y=2120,
text="EU makes a deal to purchase up to 300m extra Moderna vaccines",
showarrow=True,
arrowhead=5, ax=-220, ay=-30)
fig.show()
sputnikv = filtered_df(vax_tweets, ['sputnik'])
plot_timeline(sputnikv, title='Sputnik V')
Some notable dates:
date_printer(sputnikv, ['2021-03-04', '2021-03-05', '2021-03-10', '2021-03-11', '2021-03-15'], 3)
We can see spikes in positive sentiment after various countries agreed to produce the Sputnik V vaccine, and on March 11th after ABC News reported that it was the safest vaccine.
pfizer = filtered_df(vax_tweets, ['pfizer', 'biontech'])
plot_timeline(pfizer, title='Pfizer/BioNTech')
There is a lot to unpack here, so to make things easier let's annotate some of the key dates:
timeline = pfizer.groupby(['date', 'sentiment']).agg(**{'tweets': ('id', 'count')}).reset_index()
fig = px.line(timeline, x='date', y='tweets', color='sentiment', category_orders={'sentiment': ['neutral', 'negative', 'positive']},
title='Timeline showing sentiment of tweets about the Pfizer/BioNTech vaccine')
fig.add_annotation(x='2020-12-14', y=timeline[(timeline['date']=='2020-12-14')&(timeline['sentiment']=='positive')]['tweets'].values[0],
text="USA and UK start vaccinating",
showarrow=True,
arrowhead=3, ax=55, ay=-210)
fig.add_annotation(x='2020-12-22', y=timeline[(timeline['date']=='2020-12-22')&(timeline['sentiment']=='positive')]['tweets'].values[0],
text="Joe Biden receives first dose",
arrowhead=3, ax=10, ay=-100)
fig.add_annotation(x='2021-01-08', y=timeline[(timeline['date']=='2021-01-08')&(timeline['sentiment']=='positive')]['tweets'].values[0],
text="Vaccine shown to resist new variant",
showarrow=True, align='left',
arrowhead=3, ax=0, ay=-45)
fig.add_annotation(x='2021-01-16', y=timeline[(timeline['date']=='2021-01-16')&(timeline['sentiment']=='negative')]['tweets'].values[0],
text="23 elderly Norwegians die after vaccine dose",
showarrow=True, align='left',
arrowhead=3, ax=15, ay=-180)
fig.add_annotation(x='2021-02-19', y=timeline[(timeline['date']=='2021-02-19')&(timeline['sentiment']=='positive')]['tweets'].values[0],
text="Israeli study shows 85% efficacy after one dose",
showarrow=True, align='left',
arrowhead=3, ax=-30, ay=-180)
fig.add_annotation(x='2021-02-25', y=timeline[(timeline['date']=='2021-02-25')&(timeline['sentiment']=='positive')]['tweets'].values[0],
text="Israeli study shows 94% efficacy after two doses",
showarrow=True, align='left',
arrowhead=3, ax=-20, ay=-130)
fig.show()
oxford = filtered_df(vax_tweets, ['oxford', 'astrazeneca'])
plot_timeline(oxford, title='Oxford/AstraZeneca')
Interestingly, there are small positive spikes on February 19th and March 6th, with people tweeting after receiving the vaccine:
date_printer(oxford, ['2021-02-19', '2021-03-06'], 5)
However, negative sentiment has been increasing since numerous countries suspended the use of the vaccine over safety concerns. We can see that vaccination progress in these countries has slowed significantly over the past few days as a result:
# At the time of writing, these countries have completely suspended the use of the vaccine
# Note that several other countries continued mostly as normal but suspended the use of one batch of Oxford/AstraZeneca vaccines
countries = ['Germany', 'France', 'Spain', 'Italy', 'Netherlands', 'Ireland', 'Denmark', 'Norway', 'Bulgaria', 'Iceland', 'Thailand']
ox_prog = vax_progress[vax_progress['country'].isin(countries)].groupby('date')['daily_vaccinations_per_million'].median().reset_index()
ox_prog['Use of Oxford/AstraZeneca'] = 'Suspended'
# Compare against countries that use Oxford/AstraZeneca but have not suspended it
other_prog = vax_progress[vax_progress['vaccines'].str.contains('Oxford/AstraZeneca', na=False)]
other_prog = other_prog[~other_prog['country'].isin(countries)].groupby('date')['daily_vaccinations_per_million'].median().reset_index()
other_prog['Use of Oxford/AstraZeneca'] = 'Ongoing'
fig = px.line(pd.concat([ox_prog, other_prog]), x='date', y='daily_vaccinations_per_million', color='Use of Oxford/AstraZeneca',
              title="Median daily vaccinations per million (all vaccines) in countries that have completely suspended the use of the"
                    "<br>Oxford/AstraZeneca vaccine vs countries that continue to use it")
fig.add_vrect(x0="2021-03-11", x1="2021-03-15",
annotation_text="vaccine<br>suspended", annotation_position="bottom right",
fillcolor="limegreen", opacity=0.25, line_width=0)
fig.show()
Overall sentiment towards the Oxford/AstraZeneca vaccine is therefore significantly more negative than average:
# Get z-scores of sentiment for each vaccine
vax_names = {'Covaxin': covaxin, 'Sinovac': sinovac, 'Sinopharm': sinopharm,
             'Moderna': moderna, 'Oxford/AstraZeneca': oxford, 'Pfizer/BioNTech': pfizer}
rows = []
for k, v in vax_names.items():
    senti = v['sentiment'].value_counts(normalize=True)
    senti['vaccine'] = k
    rows.append(senti)
sentiment_zscores = pd.DataFrame(rows)
for col in ['negative', 'neutral', 'positive']:
    sentiment_zscores[col + '_zscore'] = (sentiment_zscores[col] - sentiment_zscores[col].mean()) / sentiment_zscores[col].std(ddof=0)
sentiment_zscores.set_index('vaccine', inplace=True)
# Plot the results
ax = sentiment_zscores.sort_values('negative_zscore')['negative_zscore'].plot.barh(title='Z scores of negative sentiment')
ax.set_ylabel('Vaccine')
ax.set_xlabel('Z score');
Further analysis using 'smarter' word clouds
The final thing we will do is to generate word clouds to see which words are indicative of each sentiment. The code below is from this notebook, which contains a more detailed explanation of the methodology used to generate 'smarter' word clouds. Please go and upvote the original notebook if you find this part useful!
!pip install -q wordninja
!pip install -q pyspellchecker
from wordcloud import WordCloud, ImageColorGenerator
import wordninja
from spellchecker import SpellChecker
from collections import Counter
import matplotlib.pyplot as plt
import re
import math
import random
import nltk
nltk.download('wordnet')
nltk.download('stopwords')
from nltk.stem import WordNetLemmatizer
from nltk.corpus import stopwords
stop_words = set(stopwords.words('english'))
stop_words.add("amp")
# FUNCTIONS REQUIRED
def flatten_list(l):
    return [x for y in l for x in y]

def is_acceptable(word: str):
    return word not in stop_words and len(word) > 2

# Color coding our wordclouds
def red_color_func(word, font_size, position, orientation, random_state=None, **kwargs):
    return f"hsl(0, 100%, {random.randint(25, 75)}%)"

def green_color_func(word, font_size, position, orientation, random_state=None, **kwargs):
    return f"hsl({random.randint(90, 150)}, 100%, 30%)"

def yellow_color_func(word, font_size, position, orientation, random_state=None, **kwargs):
    return f"hsl(42, 100%, {random.randint(25, 50)}%)"
# Reusable function to generate word clouds
def generate_word_clouds(neg_doc, neu_doc, pos_doc):
    # Display the generated image:
    fig, axes = plt.subplots(1, 3, figsize=(20, 10))
    wordcloud_neg = WordCloud(max_font_size=50, max_words=100, background_color="white").generate(" ".join(neg_doc))
    axes[0].imshow(wordcloud_neg.recolor(color_func=red_color_func, random_state=3), interpolation='bilinear')
    axes[0].set_title("Negative Words")
    axes[0].axis("off")
    wordcloud_neu = WordCloud(max_font_size=50, max_words=100, background_color="white").generate(" ".join(neu_doc))
    axes[1].imshow(wordcloud_neu.recolor(color_func=yellow_color_func, random_state=3), interpolation='bilinear')
    axes[1].set_title("Neutral Words")
    axes[1].axis("off")
    wordcloud_pos = WordCloud(max_font_size=50, max_words=100, background_color="white").generate(" ".join(pos_doc))
    axes[2].imshow(wordcloud_pos.recolor(color_func=green_color_func, random_state=3), interpolation='bilinear')
    axes[2].set_title("Positive Words")
    axes[2].axis("off")
    plt.tight_layout()
    plt.show()
def get_top_percent_words(doc, percent):
    # Returns a list of the "top-n" most frequent words in a list
    top_n = int(percent * len(set(doc)))
    counter = Counter(doc).most_common(top_n)
    top_n_words = [x[0] for x in counter]
    return top_n_words
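# Quick check with toy data (hypothetical words): with five unique words,
# percent=0.4 keeps the two most frequent, e.g.
# get_top_percent_words(['jab', 'jab', 'jab', 'dose', 'dose', 'arm', 'sore', 'ouch'], 0.4)
# -> ['jab', 'dose']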
def clean_document(doc):
    spell = SpellChecker()
    lemmatizer = WordNetLemmatizer()
    # Lemmatize words (needed for calculating frequencies correctly)
    doc = [lemmatizer.lemmatize(x) for x in doc]
    # Get the top 10% of all words. This may include "misspelled" words
    top_n_words = get_top_percent_words(doc, 0.1)
    # Get a list of misspelled words
    misspelled = spell.unknown(doc)
    # Accept the correctly spelled words and top_n words
    clean_words = [x for x in doc if x not in misspelled or x in top_n_words]
    # Try to split the misspelled words to generate good words (ex. "lifeisstrange" -> ["life", "is", "strange"])
    words_to_split = [x for x in doc if x in misspelled and x not in top_n_words]
    split_words = flatten_list([wordninja.split(x) for x in words_to_split])
    # Some splits may be nonsensical, so reject them ("llouis" -> ['ll', 'ou', 'is'])
    clean_words.extend(spell.known(split_words))
    return clean_words
def get_log_likelihood(doc1, doc2):
    doc1_counts = Counter(doc1)
    doc1_freq = {x: doc1_counts[x] / len(doc1) for x in doc1_counts}
    doc2_counts = Counter(doc2)
    doc2_freq = {x: doc2_counts[x] / len(doc2) for x in doc2_counts}
    doc_ratios = {
        # 1 is added to each frequency to smooth the ratio
        x: math.log((doc1_freq[x] + 1) / (doc2_freq[x] + 1))
        for x in doc1_freq if x in doc2_freq
    }
    top_ratios = Counter(doc_ratios).most_common()
    top_percent = int(0.1 * len(top_ratios))
    return top_ratios[:top_percent]
# Function to generate a document based on likelihood values for words
def get_scaled_list(log_list):
    counts = [int(x[1] * 100000) for x in log_list]
    words = [x[0] for x in log_list]
    cloud = []
    for i, word in enumerate(words):
        cloud.extend([word] * counts[i])
    # Shuffle to make it more "real"
    random.shuffle(cloud)
    return cloud
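To see what this produces, here's a quick illustration with made-up log-likelihood values (the words and scores below are hypothetical):
# Each word is repeated in proportion to its score, e.g. int(0.002 * 100000) = 200
Counter(get_scaled_list([('jab', 0.002), ('dose', 0.001)]))
# Counter({'jab': 200, 'dose': 100})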
# Convert string to a list of words
vax_tweets['words'] = vax_tweets.text.astype(str).apply(lambda x: re.findall(r'\w+', x))
def get_smart_clouds(df):
    neg_doc = flatten_list(df[df['sentiment'] == 'negative']['words'])
    neg_doc = [x for x in neg_doc if is_acceptable(x)]
    pos_doc = flatten_list(df[df['sentiment'] == 'positive']['words'])
    pos_doc = [x for x in pos_doc if is_acceptable(x)]
    neu_doc = flatten_list(df[df['sentiment'] == 'neutral']['words'])
    neu_doc = [x for x in neu_doc if is_acceptable(x)]
    # Clean all the documents
    neg_doc_clean = clean_document(neg_doc)
    neu_doc_clean = clean_document(neu_doc)
    pos_doc_clean = clean_document(pos_doc)
    # Combine classes B and C to compare against A (ex. "positive" vs "non-positive")
    top_neg_words = get_log_likelihood(neg_doc_clean, flatten_list([pos_doc_clean, neu_doc_clean]))
    top_neu_words = get_log_likelihood(neu_doc_clean, flatten_list([pos_doc_clean, neg_doc_clean]))
    top_pos_words = get_log_likelihood(pos_doc_clean, flatten_list([neu_doc_clean, neg_doc_clean]))
    # Generate a synthetic corpus using our log-likelihood values
    neg_doc_final = get_scaled_list(top_neg_words)
    neu_doc_final = get_scaled_list(top_neu_words)
    pos_doc_final = get_scaled_list(top_pos_words)
    # Visualise our synthetic corpus
    generate_word_clouds(neg_doc_final, neu_doc_final, pos_doc_final)
get_smart_clouds(vax_tweets)
This looks pretty good! The positive tweets appear to be from people who have just received their first vaccine or are grateful for the job scientists and healthcare workers are doing, whereas the negative tweets seem to be from people who have suffered adverse reactions to the vaccine. The neutral tweets read more like news, which could explain why neutral is the most prevalent sentiment; in fact, the vast majority of tweets contain URLs:
vax_tweets['has_url'] = np.where(vax_tweets['orig_text'].str.contains('http'), 'yes', 'no')
vax_tweets['has_url'].value_counts(normalize=True).plot.bar(title='Does the tweet contain a url?');
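To dig into that link, we can also cross-tabulate sentiment against the has_url flag we just created (a quick sketch using pandas' crosstab):
# Proportion of each sentiment among tweets with and without a URL
pd.crosstab(vax_tweets['has_url'], vax_tweets['sentiment'], normalize='index')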
Interestingly, Canada shows up in the negative word cloud, as well as a couple of Canadian cities. Looking at a 'naive' word cloud for tweets containing 'Canada' shows us that this appears to be a political/economic issue:
def get_cloud(df, string, c_func):
    string_l = string.lower()
    df[string_l] = np.where(df['text'].str.lower().str.contains(string_l), 1, 0)
    cloud_df = df.copy()[df[string_l] == 1]
    doc = flatten_list(cloud_df['words'])
    doc = [x for x in doc if is_acceptable(x)]
    doc = clean_document(doc)
    fig, axes = plt.subplots(figsize=(9, 5))
    wordcloud = WordCloud(max_font_size=50, max_words=100, background_color="white").generate(" ".join(doc))
    axes.imshow(wordcloud.recolor(color_func=c_func, random_state=3), interpolation='bilinear')
    axes.set_title("Naive word cloud for tweets containing '%s'" % string)
    axes.axis("off")
    plt.show()
get_cloud(vax_tweets, 'Canada', red_color_func)
At the time of writing, Canada's vaccination progress has been slower than that of other developed nations, and people are predicting that this might have an impact on Canada's economic recovery:
countries = ['Canada', 'United Kingdom', 'United States', 'Chile', 'Singapore', 'Israel', 'Australia']
selected = vax_progress[vax_progress['country'].isin(countries)]
eu['country'] = 'EU median'
fig = px.line(pd.concat([selected, eu]), x='date', y='daily_vaccinations_per_million', color='country',
              title='Daily vaccinations per million (all vaccines) in Canada vs selected other developed nations')
fig.show()
Conclusion
We were able to gain some interesting insights here, so hopefully you found this useful! That said, there is still a lot left to explore, especially since vaccinations are ongoing and the dataset is still being updated at the time of writing (thanks once again to Gabriel Preda for providing the data).
If you made it this far, I encourage you to give this task a go yourself and see what you can find out! A couple of suggestions:
- Try to improve the accuracy of the fastai models we created in part 1.
- Instead of looking at each vaccine individually, investigate each vaccination scheme (most countries are using more than one vaccine).
- Dig deeper into the sentiment in a specific country and how that relates to vaccination progress (a rough starting point is sketched after this list). You could even analyse a large dataset of all COVID-19 tweets, not just vaccine-specific ones!
- Investigate adverse reactions to the vaccines and how they are reflected in tweet sentiment. For instance, is blood clotting really a concern for patients who have received the Oxford/AstraZeneca vaccine?
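For the country-specific suggestion, a rough starting point might look like this (a sketch: user_location is free text, so a simple substring match will miss users who don't name their country):
# Filter tweets to users whose profile location mentions Canada
canada = vax_tweets[vax_tweets['user_location'].astype(str).str.contains('canada', case=False)]
canada['sentiment'].value_counts(normalize=True)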
Thanks for reading!
1. Cover image via https://www.mamamia.com.au/covid-19-vaccine-latest-update/