A Firefly Algorithm-Based Approach for Web Query Reformulation

Meriem Zeboudj, Khaled Belkadi
Copyright: © 2022 |Pages: 16
DOI: 10.4018/ijirr.299939

Abstract

A major difficulty in using a web-based information retrieval system is choosing the terms used to express and process a query. The user has to examine a large amount of data to find the necessary documents or information. A problem that often arises in this situation is that the query is poorly formulated and does not express the user's needs. Researchers have proposed various solutions to this problem, among them query reformulation. This paper presents an approach called FA-QR, which is based on this technique and uses the Firefly metaheuristic. The algorithm was applied to frequent itemsets generated by frequent-pattern growth (FP-Growth). The algorithmic solution allowed the user to select the best path among all the possible solutions for the initial query. Experimentally, the results demonstrated that the proposed approach achieved a significant improvement over other methods on the TREC and FIRE datasets.

Introduction

Searching for information on the Web engages users in a questioning process about the choice of search engines. Moreover, if the query results do not meet their needs or fall outside their objectives, this implies that some information was not well formulated. This leads to two important questions: How can more pertinent documents be found for a given query? And how can the user's query be better expressed to meet the user's needs?

Reformulation is both an iterative and an interactive process between the user and the search engine, aimed at achieving satisfactory results (Lu, Wei, Sun, Li, Wen, & Zhou, 2018). The retrieved results play a significant role in the reformulation strategy. The accuracy of the query suggestion phase depends entirely on the results (documents or URLs) and the newly extracted keywords. This concept has been studied by many researchers and has become one of the best-known concepts in the field of Information Retrieval (IR), having been the subject of many works that provide solutions to users according to their information needs. The idea of improving web search by query modification was studied in (Efthimiadis, 1996). Some improvement techniques involve either adding to the queries terms from existing linguistic resources such as WordNet (Azad & Deepak, 2019), or building resources from the collections (Aminu, Oyefolahan, Abdullahi, & Salaudeen, 2019).

Another widely used technique is Pseudo Relevance Feedback (PRF). It is based on the assumption that the top-k ranked documents are relevant to the query (Vaidyanathan, Das, & Srivastava, 2016; Xu, Lin, Lin, Yang, & Xu, 2018). Building on this assumption, several papers proposed ways to improve the ranking of documents relative to an initial search (Arampatzis, Peikos, & Symeonidis, 2021; Khennak & Drias, 2017a; Valcarce, Parapar, & Barreiro, 2019). In (Keikha, Ensan, & Bagheri, 2018), the authors chose Wikipedia as a source for extracting relevant articles and used supervised and unsupervised methods to select the candidate expansion terms. The authors of (Azad & Deepak, 2019) also used Wikipedia, combined with WordNet, as data sources for expansion terms. Their approach considers both individual terms and phrases as expansion terms. The combination of the two data sources gave good results compared with each source used individually.
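
The PRF assumption described above can be illustrated with a minimal sketch. It is not the method of any of the cited papers: the raw term-count scoring, the small stopword list, and the names `tokenize`, `score`, and `prf_expand` are all assumptions made for illustration only. The sketch ranks documents against the query, treats the top-k as relevant, and appends their most frequent new terms to the query.

```python
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [t for t in text.lower().split() if t not in STOPWORDS]


def score(query_terms, doc_terms):
    """Naive relevance score: total occurrences of query terms in the document."""
    counts = Counter(doc_terms)
    return sum(counts[t] for t in query_terms)


def prf_expand(query, docs, k=2, n_terms=2):
    """Expand a query using pseudo relevance feedback.

    1. Rank all documents against the original query.
    2. Assume the top-k documents are relevant (the PRF assumption).
    3. Add the n_terms most frequent non-query terms from those documents.
    """
    q_terms = tokenize(query)
    ranked = sorted(docs, key=lambda d: score(q_terms, tokenize(d)), reverse=True)

    pool = Counter()
    for doc in ranked[:k]:
        pool.update(t for t in tokenize(doc) if t not in q_terms)

    # Sort by descending frequency; break ties alphabetically for determinism.
    candidates = sorted(pool.items(), key=lambda kv: (-kv[1], kv[0]))
    return q_terms + [term for term, _ in candidates[:n_terms]]
```

In practice the first-pass ranking would come from a retrieval model such as BM25 and the candidate terms would be weighted rather than simply counted; the sketch only shows where the top-k assumption enters the process.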
