Fuzzy SVM With Mahalanobis Distance for Situational Awareness-Based Recognition of Public Health Emergencies

Dan Li, Zheng Qu, Chen Lyu, Luping Zhang, Wenjin Zuo
Copyright: © 2024 | Pages: 21
DOI: 10.4018/IJFSA.342117

Abstract

In public health emergencies, situational awareness is crucial for swift responses by governments and rescue organizations. This manuscript proposes a novel framework to identify and classify event-specific information, aiming to comprehend the concepts, characteristics, and classifications associated with situational awareness in social media emergencies. First, a statistical approach is employed to extract a set of standard features. Second, a category-based latent Dirichlet allocation to vector (LDA2vec) model is leveraged to extract topic-based features to enhance accuracy, particularly for unbalanced datasets. Finally, a fuzzy support vector machine (FSVM) classifier utilizing a Mahalanobis distance kernel is introduced to improve the detection accuracy of event-specific information. The framework's effectiveness is evaluated on a social media public health dataset, achieving superior filtering capabilities for non-informative data with a precision of 89% and an F1-Score of 91%, surpassing other standard methods.

Introduction

Public health emergencies are characterized by their suddenness, speed, and unpredictability, presenting significant challenges to emergency management (An et al., 2018). Governments and voluntary relief organizations should strive to collect and understand relevant disaster information to aid emergency response operations (Fu et al., 2020). Situational Awareness (SA) (Huang & Xiao, 2015), which involves gathering and comprehending relevant crisis information (i.e., what is occurring in impacted communities during an event), is critical to this process. Social media has become a primary mode of disseminating information online, owing to its speed, versatility, and interactivity. It also serves as a substantial communication channel, particularly for situational awareness during emergencies like natural calamities.

However, due to the diversity of online user communities, online news content varies widely, which makes it difficult for relevant agencies to gain situational awareness of events swiftly. The key to speeding up emergency response to unforeseen events and minimizing the associated losses is to collect information pertinent to situational awareness from vast amounts of data in the shortest possible time. Using social media for situational awareness during unforeseen events typically involves tasks such as social media text classification and semantic mining, which include parsing concise and informal messages, managing information overload, and prioritizing the different types of information identified within these messages. These tasks can be mapped to classical information processing operations, such as filtering, categorization, sorting, aggregation, extraction, and summarization (Imran et al., 2015; Liang & Li, 2020).

In recent years, an increasing number of scholars have explored techniques including natural language processing, machine learning, and deep learning for the automated processing of breaking news messages on social media (Xia et al., 2021). However, a framework to identify and classify event-specific information is still lacking, primarily due to the complexity of the task and the dynamic nature of online information (Nan et al., 2022). Such a framework is necessary for relevant agencies to efficiently process and make sense of the vast amount of data generated during unforeseen events. This gap in existing methodologies arises from two factors: the complexity of the information and unbalanced data.

On the one hand, the diverse and dynamic nature of online content, particularly during emergencies, poses challenges in developing a comprehensive framework. The sheer volume of information and the rapid evolution of events demand a sophisticated approach to extracting relevant details. On the other hand, dealing with unbalanced data sets, where informative and non-informative data may be unevenly distributed, adds another layer of complexity. A robust framework should account for these imbalances to ensure accurate and unbiased results.

In light of these challenges, developing a comprehensive and adaptable framework becomes imperative. This paper proposes an automated and comprehensive framework for identifying informative information. Grounded in situational awareness, our approach aims to comprehend the concepts, features, and categories of informative information related to public health emergencies on social media. First, we define the concepts, characteristics, and components of informative information regarding public health emergencies based on situational awareness on social media. Second, we employ statistical methods to extract traditional features, including linguistic, numeric, punctuation, and source-based features. Third, we enhance our framework by extracting topic-based features using a category-based latent Dirichlet allocation to vector (LDA2vec) model, tailored specifically to unbalanced data sets. Finally, we introduce a fuzzy support vector machine (FSVM) classifier designed to handle unbalanced and noisy data, utilizing a kernel based on the Mahalanobis distance rather than the traditional Euclidean distance. The effectiveness of our framework is assessed through comparisons with traditional machine learning models and state-of-the-art methods. To further validate our approach, we leverage a pre-trained BERT model to cluster the classification results, demonstrating that informative information has a superior situational awareness effect.
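To make the final step more concrete, the following is a minimal sketch of how a Mahalanobis-distance kernel and fuzzy membership weights might be computed. It is not the authors' implementation: the Gaussian (RBF-style) kernel form, the class-mean membership rule, the function names, and the `gamma`, `ridge`, and `floor` values are all assumptions made for illustration, and the data are synthetic.

```python
import numpy as np


def inverse_covariance(X, ridge=1e-6):
    """Inverse of the sample covariance of X (rows are samples).

    The small ridge term (an assumed value) keeps the matrix invertible.
    """
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    return np.linalg.inv(cov + ridge * np.eye(X.shape[1]))


def mahalanobis_rbf_kernel(A, B, cov_inv, gamma=0.5):
    """K[i, j] = exp(-gamma * (a_i - b_j)^T cov_inv (a_i - b_j))."""
    diff = A[:, None, :] - B[None, :, :]  # shape (n_a, n_b, d)
    sq_dist = np.einsum("ijk,kl,ijl->ij", diff, cov_inv, diff)
    return np.exp(-gamma * sq_dist)


def fuzzy_memberships(X, y, cov_inv, floor=0.1):
    """Assumed membership rule: samples far from their own class mean
    (in Mahalanobis distance) receive a weight closer to `floor`."""
    s = np.empty(len(y), dtype=float)
    for label in np.unique(y):
        mask = y == label
        diff = X[mask] - X[mask].mean(axis=0)
        dist = np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))
        s[mask] = 1.0 - (1.0 - floor) * dist / (dist.max() + 1e-12)
    return s


# Synthetic, unbalanced toy data: 40 majority samples and 8 minority samples,
# with a much larger spread along the second feature.
rng = np.random.default_rng(0)
X_major = rng.normal(loc=[0.0, 0.0], scale=[1.0, 3.0], size=(40, 2))
X_minor = rng.normal(loc=[3.0, 0.0], scale=[1.0, 3.0], size=(8, 2))
X = np.vstack([X_major, X_minor])
y = np.array([0] * 40 + [1] * 8)

cov_inv = inverse_covariance(X)
K = mahalanobis_rbf_kernel(X, X, cov_inv)  # precomputed Gram matrix
s = fuzzy_memberships(X, y, cov_inv)       # one weight per training sample
```

Unlike a Euclidean kernel, this distance rescales each direction by the data's covariance, so the high-variance second feature does not dominate the similarity. One possible way to train with these quantities, assuming scikit-learn is available, is `SVC(kernel="precomputed").fit(K, y, sample_weight=s)`, since per-sample weights there rescale the penalty `C` for each sample, which matches the usual fuzzy SVM formulation.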

The main contributions of our work could be summarized as follows:
