An Improved Sentiment Analysis Approach to Detect Radical Content on Twitter

Kamel Ahsene Djaballah, Kamel Boukhalfa, Omar Boussaid, Yassine Ramdane
DOI: 10.4018/IJITWE.2021100103

Abstract

Social networks are used by terrorist groups and people who support them to propagate their ideas, ideologies, or doctrines and share their views on terrorism. To analyze tweets related to terrorism, several studies have been proposed in the literature. Some works rely on data mining algorithms; others use lexicon-based or machine learning sentiment analysis. Some recent works adopt methods that combine multiple techniques. This paper proposes an improved approach for sentiment analysis of radical content related to terrorist activity on Twitter. Unlike other solutions, the proposed approach focuses on using a dictionary of weighted terms, the Word2vec method, and trigrams, with a classification based on fuzzy logic. To evaluate this approach, the authors conducted experiments with 600 manually annotated tweets and 200,000 automatically collected tweets in English and Arabic. The experimental results revealed that the new technique provides a precision of 75% to 78% for radicality detection and 61% to 64% for detecting radicality degrees.

Introduction

Social networks have become an effective communication channel for interaction, sharing opinions, and spreading ideas, giving birth to the interactive Web 2.0. According to Patard (2020), there are more than 4.5 billion Internet users, including 3.8 billion users of social networks such as Facebook and Twitter. Unfortunately, the emergence of these social networks has initiated a new era of terrorism. These networks are used as a platform for inciting terrorist acts, recruiting, and more. Indeed, terrorist groups like the Islamic State of Iraq and Syria (ISIS) spread propaganda online using various social media platforms such as Twitter and YouTube (Ferrara et al., 2016). Twitter, where about 500 million tweets are published per day, is considered one of the platforms most used by terrorists (Reuter et al., 2017). Terrorists and their supporters express sentiments in their tweets when sharing opinions and comments. Therefore, analyzing this radical content is crucial to detect extremist discourse and limit its reach and dissemination (Nouh et al., 2019).

To analyze activities related to terrorism, several works have been proposed. In some of these works, the authors have used sentiment analysis that is lexicon-based (Azizan and Aziz, 2017; Mansour, 2018) or based on machine learning (Ashcroft et al., 2015; Ferrara et al., 2016; Conde-Cespedes et al., 2018; Abrar et al., 2019). Others have used deep learning (Becker et al., 2019; Ahmad et al., 2019). Two works have adopted data mining techniques (Ali, 2016; Nouh et al., 2019).

Other approaches combine multiple techniques. Asif et al. (2020) and Ahmed and Qadoos (2018) used a lexicon-based method together with machine learning; Sanchez-Rebollo et al. (2019) combined sentiment analysis and fuzzy clustering; and Rattrout and Ateeq (2019) used dictionary terms with the principle of fuzzy logic.

It should be noted that, despite the diversity of existing works on the analysis of terrorist activities on Twitter, very few of them have used Word2vec to obtain: (1) terms similar to dictionary terms, starting from the intersection of the words existing in tweets and the existing dictionary terms; and (2) similar names of terrorist organizations/personalities, starting from the intersection of the words existing in tweets and the search keywords (names of terrorist organizations/personalities). Moreover, although some works have adopted Word2vec, as the proposed approach does (Conde-Cespedes et al., 2018; Nouh et al., 2019), they have used this technique to determine similar words in tweet analysis.
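The enrichment idea in point (1) can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny hand-written vectors stand in for a trained Word2vec model, and the function name `enrich_terms`, the cosine threshold of 0.9, and the example words are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def enrich_terms(dictionary_terms, tweet_words, vectors, threshold=0.9):
    """Hypothetical helper: expand a dictionary with embedding neighbours.

    Seeds are the words present both in the tweets and in the dictionary
    (the intersection described in the text). Every word whose vector is
    close enough to a seed is added to the dictionary.
    """
    dictionary_terms = set(dictionary_terms)
    seeds = dictionary_terms & set(tweet_words) & set(vectors)
    added = set()
    for seed in seeds:
        for word, vec in vectors.items():
            if word not in dictionary_terms and cosine(vectors[seed], vec) >= threshold:
                added.add(word)
    return dictionary_terms | added

# Toy 3-dimensional vectors (assumed values, not output of a real model).
TOY_VECTORS = {
    "attack": [0.9, 0.1, 0.0],
    "assault": [0.85, 0.15, 0.05],
    "weather": [0.0, 0.2, 0.9],
    "threat": [0.1, 0.9, 0.1],
}

enriched = enrich_terms(
    {"attack", "threat"},
    ["an", "attack", "was", "reported"],
    TOY_VECTORS,
)
print(sorted(enriched))
```

With a real model, the inner loop would typically be replaced by a nearest-neighbour query on the trained embeddings. The same pattern would apply to point (2), with search keywords in place of dictionary terms.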

In another vein, the sentiment analysis approach adopted in this paper differs from the approaches in the literature that use dictionary terms (e.g., Asif et al., 2020; Rattrout and Ateeq, 2019). The proposed approach takes into consideration trigrams formed by weighted dictionary terms and names of terrorist organizations/personalities, which brings it closer to reality. In addition, two languages are handled, English and Arabic, and the authors take negation into account when calculating the scores of the trigrams contained in each tweet.
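As a rough illustration of trigram scoring with negation, the sketch below scores each three-token window of a tweet. The preview does not give the authors' scoring formula, so everything here is assumed: the term weights, the negation list, the placeholder entity name `groupx`, the rule that a trigram counts only if it contains an entity name, and the sign flip applied when a negation word appears in the window.

```python
# Assumed negation words; the authors' actual list is not given in the text.
NEGATIONS = {"not", "never", "no"}

def trigrams(tokens):
    """Return all consecutive three-token windows."""
    return [tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)]

def trigram_score(trigram, weights, entities):
    """Hypothetical rule: sum the term weights, flip the sign on negation.

    A trigram that mentions no entity name contributes nothing.
    """
    if not any(tok in entities for tok in trigram):
        return 0.0
    score = sum(weights.get(tok, 0.0) for tok in trigram)
    if any(tok in NEGATIONS for tok in trigram):
        score = -score
    return score

def tweet_score(text, weights, entities):
    """Sum the scores of every trigram in a whitespace-tokenized tweet."""
    tokens = text.lower().split()
    return sum(trigram_score(t, weights, entities) for t in trigrams(tokens))

# Illustrative weighted dictionary and entity list (assumed values).
WEIGHTS = {"support": 0.8, "join": 0.6}
ENTITIES = {"groupx"}

positive = tweet_score("we support groupx", WEIGHTS, ENTITIES)
negated = tweet_score("never support groupx", WEIGHTS, ENTITIES)
print(positive, negated)
```

A score like this could then be passed to a fuzzy classifier to assign a degree of radicality, which is the role the abstract gives to fuzzy logic. Arabic text would need its own tokenization and negation list.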

In summary, to the best of the authors' knowledge, none of the works presented above provides the following contributions:

  • Enrichment of dictionary terms as well as search keywords using Word2vec.

  • Detection and classification of radical content on Twitter using the trigram method for two languages, English and Arabic.
