Introduction
Information seeking is a fundamental human endeavor, and several information search systems have been designed to help a user pose queries and retrieve informative data to accomplish search goals. Traditional systems rely heavily on the user’s ability to phrase a precise request, and they perform best when requests are short and navigational. A potential obstacle for such systems is the rapid growth of information overload, which makes it difficult for a user to identify useful information. Consequently, the focus of search is shifting from finding information to understanding it (White & Roth, 2009), especially in discovery-oriented search. When a user wants information for learning, decision making, or another cognitive activity, conventional search methodologies offer little assistance, whereas data exploration is helpful. Data exploration combines focused search with exploratory browsing to discover interesting data objects. However, exploration becomes a recall-oriented navigation over large, complex datasets using short, typed, ill-phrased data requests (Idreos, Papaemmanouil, & Chaudhuri, 2015; White, 2016; Marchionini, 2006), and thus requires strong support for adaptive relevance measures in the retrieval framework (Nandi & Jagadish, 2011).
In the data deluge, retrieving relevant data requires either formal awareness of the complex schema and content, in order to formulate a data retrieval request, or assistance from the information system (Kersten, Idreos, Manegold, & Liarou, 2011; Huston, Culpepper, & Croft, 2014). In both situations, the system employs implicit measures to outline matched objects and explicit measures to eventually steer the search towards a region of interest. Most existing retrieval models score a document predominantly on document-term statistics, i.e., document lengths, query-term frequencies, inverse document frequencies, etc. (Van, 1977; Daoud & Huang, 2013). Intuitively, the query term proximities (QTPs) within a pre-fetched result set could be exploited to re-rank the results, promoting documents in which the matched query terms are close to each other. For example, consider an information search with the query ‘exploratory search’ over two documents, each matching both query terms once:
Doc1: {…exploratory search………}.
Doc2: {….exploratory….search….}.
Intuitively, Doc1 should be ranked higher, because the occurrences of the two query terms are adjacent. In Doc2, by contrast, the two query terms are far apart, and their combination does not necessarily convey the meaning of ‘exploratory search’.
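The intuition can be illustrated with a minimal sketch. It assumes one simple proximity measure, the minimum token distance between any occurrence of the two query terms; this is only one possible choice and is not presented as the model developed in this article. The function names and the two toy token lists are hypothetical, invented purely for illustration.

```python
from itertools import product


def term_positions(tokens, term):
    """Return the token offsets at which `term` occurs."""
    return [i for i, tok in enumerate(tokens) if tok == term]


def min_pair_distance(tokens, term_a, term_b):
    """Smallest token distance between any occurrence of term_a and term_b.

    Returns None when either term is absent from the document.
    """
    pos_a = term_positions(tokens, term_a)
    pos_b = term_positions(tokens, term_b)
    if not pos_a or not pos_b:
        return None
    return min(abs(a - b) for a, b in product(pos_a, pos_b))


# Hypothetical toy documents (assumed content, not taken from the article).
doc_close = "an exploratory search system supports learning and discovery tasks".split()
doc_far = "an exploratory study of how users search large data collections".split()

d_close = min_pair_distance(doc_close, "exploratory", "search")
d_far = min_pair_distance(doc_far, "exploratory", "search")

# A smaller distance means the query terms are closer, so that document ranks first.
ranked = sorted([("doc_close", d_close), ("doc_far", d_far)], key=lambda item: item[1])
```

Both toy documents match each query term exactly once, so term frequency alone cannot separate them; only the distance between the matched terms does.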
The term-term affinity within a matched document has a role to play during retrieval and, eventually, in positioning the document at an appropriate relevance rank (Salton & Buckley, 1988; Borlund, 2003; Verma, 2016). In an information search, a user specifies a data request with more than one term and with an anticipated inherent closeness among them. This closeness among query terms characterizes the structural constraints of a user query and helps differentiate the importance of two matched documents in information seeking. Query term proximity is one such measure; however, it has been largely under-explored in traditional retrieval frameworks and models, mainly due to intrinsic design concerns (how proximity can be modeled) and questions about its overall usability (what it contributes) in a retrieval model.