I. Introduction
Beginning in December 2019, an outbreak of a novel coronavirus disease (COVID-19) was first identified in China (World Health Organization, 2020). The World Health Organization officially declared COVID-19 a pandemic in March 2020, following its rapid global spread. By the start of the current study in February 2021, cumulative confirmed infections had exceeded 110 million and the death toll had passed 2.5 million worldwide (World Health Organization, 2021), making it the worst public health crisis of the recent decade. Governments, medical professionals, and pharmaceutical companies have endeavored to develop vaccines that produce immunity to COVID-19. The goal of vaccination is to achieve herd immunity, which would hopefully end the pandemic. Acceptance of vaccines is important because achieving herd immunity depends on the proportion of the population that is vaccinated (Fontanet and Cauchemez, 2020). However, acceptance of the COVID-19 vaccine has been reported to vary globally and to be unpromising in certain areas of the world (Sallam, 2021; Malik et al., 2020). Thus, it is necessary to monitor and understand the sentiments and opinions of the general public in order to build confidence in vaccination and to identify sources of skepticism that reduce public confidence (de Figueiredo et al., 2020; Lazarus et al., 2021; Bloom et al., 2020). Investigating social sharing in the general population is also essential, as social interactions, especially the dissemination of information, influence public perceptions of topics such as epidemics (Funk et al., 2009).
Social media allows people to share their daily happenings, feelings, and thoughts about events within their communities, providing massive textual data for potential sentiment analysis. With a publicly available application programming interface (API) that enables convenient data gathering, Twitter is one of the most widely used and representative social media platforms and is commonly employed as a data source for text mining and analysis. Owing to the social distancing measures adopted to control the spread of the disease, social media usage became even more prevalent, playing a critical role in keeping people connected and informed during the COVID-19 pandemic (Nabity-Grover et al., 2020; Mehla et al., 2021). This resulted in an immense volume of textual data available for text mining and analysis. Traditional surveys typically have small sample sizes, closed questions, and limited spatiotemporal granularity. In comparison, analysis of social media data offers an overview of the sentiments and opinions of larger communities and of how they change over time. Because social media data are mostly unstructured, natural language processing and machine learning algorithms can be employed to mine sentiments and topics from the texts.
The main objective of this study was to discover sentiments and identify public opinions related to the COVID-19 vaccine from social media data. To this end, tweets were acquired from Twitter and analyzed using natural language processing and machine learning algorithms, including sentiment analysis and topic analysis. Temporal changes were also examined to understand people's views on COVID-19 vaccines over the course of the pandemic and of vaccine development. The secondary objective was to evaluate the different sentiment and topic analysis methods applied. This study aimed to contribute to the literature by
1) expanding the understanding of public sentiments and emotions regarding vaccination, a crucial sub-topic of COVID-19,
2) identifying topics that merit the attention of public health professionals and stakeholders for critical decision making, and
3) demonstrating and comparing multiple sentiment and topic analysis methodologies on vast social media textual data.
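As a purely illustrative aid, the idea of mining sentiment from tweets and tracking it over time can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the method used in this study: the word lists `POSITIVE` and `NEGATIVE`, the function names, and the sample tweets are all hypothetical, and a simple lexicon count stands in for the sentiment analysis methods evaluated later.

```python
from collections import defaultdict
from datetime import date
import re

# Hypothetical toy lexicons (assumption for illustration only); a real
# analysis would rely on an established lexicon or a trained model.
POSITIVE = {"safe", "effective", "hope", "grateful", "relief"}
NEGATIVE = {"unsafe", "scared", "worried", "risk", "refuse"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Return (positive hits - negative hits) / token count, or 0.0 if empty."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def label(score):
    """Map a numeric score to a coarse sentiment class."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def monthly_mean_sentiment(tweets):
    """Average the scores of (date, text) pairs per calendar month."""
    buckets = defaultdict(list)
    for posted_on, text in tweets:
        buckets[posted_on.strftime("%Y-%m")].append(sentiment_score(text))
    return {month: sum(s) / len(s) for month, s in sorted(buckets.items())}


# Hypothetical sample data, not drawn from the study's corpus.
SAMPLE_TWEETS = [
    (date(2020, 12, 14), "So grateful the vaccine is safe and effective"),
    (date(2020, 12, 20), "Still worried about the risk"),
    (date(2021, 1, 5), "Got my first dose today"),
]

if __name__ == "__main__":
    for posted_on, text in SAMPLE_TWEETS:
        print(posted_on, label(sentiment_score(text)), text)
    print(monthly_mean_sentiment(SAMPLE_TWEETS))
```

Grouping scores by month is one simple way to expose temporal changes in sentiment; finer or coarser time windows could be substituted depending on the volume of data available.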