Cluster-Based Cab Recommender System (CBCRS) for Solo Cab Drivers

Supreet Kaur Mann, Sonal Chawla
Copyright: © 2022 |Pages: 15
DOI: 10.4018/IJIRR.314604

Abstract

An efficient cluster-based cab recommender system (CBCRS) recommends to solo cab drivers the next pickup location that offers a high potential for finding passengers at the shortest distance. To recommend the next passenger location, it is necessary to cluster the global positioning system (GPS) coordinates of the pick-up locations in the same geographic region as the cab. Clustering is an unsupervised learning technique that groups similar objects into clusters. The objectives of this research paper are therefore fourfold. First, it identifies various clustering techniques for clustering GPS coordinates. Second, it designs and develops an efficient algorithm to cluster GPS coordinates for CBCRS. Third, it evaluates the proposed algorithm on standard datasets using the silhouette coefficient and the Calinski-Harabasz index. Finally, it analyses the results of the proposed algorithm to determine the most suitable clustering technique for GPS coordinates in support of a cab recommender system.

Introduction

Recommender systems are software tools that provide a user with a set of personalized suggestions that support the user's decision-making process (Ricci et al., 2011). Recommender systems are also important in cab services. Recommender systems for cab drivers have long been a concern because cabs are the main mode of transportation in modern cities compared with other services such as buses and trains. A recommender system can be constructed using three approaches: Content-Based Filtering (Mooney & Roy, 1999), Collaborative Filtering (Resnick & Varian, 1997) and Hybrid Filtering (Pazzani, 1999). Content-Based Filtering relies on the user's historical information and hence faces the Cold Start problem. Collaborative Filtering makes automatic recommendations to a user based on the tastes and preferences of several other users. Hybrid Filtering combines both methods. A cab recommender system uses Collaborative Filtering. Cab recommender systems are useful to both drivers and passengers (Yuan et al., 2013). They recommend to the cab driver the nearest locations where passengers can be found with minimum travelling distance, thereby increasing the driver's profit. They also help passengers find a nearby cab and so save time (Wang et al., 2017). To recommend the next passenger-finding location to cab drivers, it is essential to cluster the pickup geolocations in the same area as the cab. There are several clustering techniques for these geolocations. Clustering techniques can be broadly classified into three categories: Hierarchical Methods, Partition-Based Methods and Density-Based Methods (Wang et al., 2017).
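
The recommendation step described above can be illustrated with a minimal sketch. It assumes that the pickup geolocations have already been clustered and that each cluster is summarized by a centroid and a historical pickup count. The function names, the dictionary keys, the `min_pickups` threshold and the coordinates are all hypothetical and are not taken from the paper; the haversine formula is used here as one common way to measure distance between GPS points.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def recommend_pickup(cab, clusters, min_pickups=50):
    """Return the nearest cluster with at least `min_pickups` historical pickups.

    `cab` is a (lat, lon) tuple. `clusters` is a list of dicts with the
    hypothetical keys 'lat', 'lon' and 'pickups'. Returns None if no cluster
    meets the threshold.
    """
    candidates = [c for c in clusters if c["pickups"] >= min_pickups]
    if not candidates:
        return None
    return min(candidates, key=lambda c: haversine_km(cab[0], cab[1], c["lat"], c["lon"]))


# Made-up cluster summaries, for illustration only.
clusters = [
    {"name": "A", "lat": 40.7580, "lon": -73.9855, "pickups": 320},
    {"name": "B", "lat": 40.7128, "lon": -74.0060, "pickups": 12},
    {"name": "C", "lat": 40.6413, "lon": -73.7781, "pickups": 540},
]
cab_position = (40.7130, -74.0050)
best = recommend_pickup(cab_position, clusters)
print(best["name"] if best else "no suitable cluster")
```

In this sketch the cluster closest to the cab is skipped because its pickup count falls below the threshold, which is one simple way to balance passenger-finding potential against distance. Other trade-offs, such as a weighted score, are equally plausible.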

Clustering evaluation can be performed using either extrinsic or intrinsic measures. Evaluating a clustering with an extrinsic measure, such as the adjusted Rand index or the Fowlkes-Mallows score, requires ground truth labels. Since the cab dataset has no ground truth labels, extrinsic measures cannot be used to evaluate cluster performance for the Cab Recommender System. In contrast, intrinsic measures such as the Silhouette Coefficient (Peter, 1987), the Calinski-Harabasz Index and the Davies-Bouldin Index do not require ground truth labels, and can therefore be used to evaluate cluster performance for the Cab Recommender System. In this research paper, the Silhouette Coefficient and the Calinski-Harabasz Index are used to evaluate the geolocation clusters, as these scores work well with large datasets and are computationally faster than other intrinsic measures (Pedregosa et al., 2011).
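
To make the two intrinsic measures concrete, the following sketch computes them from their standard definitions using only the Python standard library. For a point i, the silhouette value is (b - a) / max(a, b), where a is the mean distance to the other points in its own cluster and b is the smallest mean distance to the points of any other cluster; the Silhouette Coefficient is the mean over all points. The Calinski-Harabasz Index is the ratio of between-cluster to within-cluster dispersion, each divided by its degrees of freedom. The function names are hypothetical, and plain Euclidean distance on the coordinates is an assumption made for simplicity; it is not necessarily the distance used in the paper.

```python
import math
from collections import defaultdict


def _group(points, labels):
    """Map each cluster label to the list of points that carry it."""
    groups = defaultdict(list)
    for p, label in zip(points, labels):
        groups[label].append(p)
    return groups


def silhouette_coefficient(points, labels):
    """Mean silhouette value over all points (Euclidean distance)."""
    groups = _group(points, labels)
    if len(groups) < 2:
        raise ValueError("at least two clusters are required")
    total = 0.0
    for p, label in zip(points, labels):
        own = groups[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes a silhouette of 0
        # Mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # Smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in groups.items()
            if key != label
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)


def calinski_harabasz_index(points, labels):
    """Between-cluster over within-cluster dispersion, scaled by degrees of freedom."""
    groups = _group(points, labels)
    n, k = len(points), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two clusters and more points than clusters")
    dim = len(points[0])
    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = within = 0.0
    for members in groups.values():
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        between += len(members) * math.dist(centroid, overall) ** 2
        within += sum(math.dist(p, centroid) ** 2 for p in members)
    if within == 0:
        return float("inf")  # all points coincide with their centroids
    return (between / (k - 1)) / (within / (n - k))


# Toy data: two tight, well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_labels = [0, 0, 1, 1]
bad_labels = [0, 1, 0, 1]
print(silhouette_coefficient(points, good_labels), calinski_harabasz_index(points, good_labels))
print(silhouette_coefficient(points, bad_labels), calinski_harabasz_index(points, bad_labels))
```

For both measures a higher value indicates more compact and better separated clusters, so the labelling that matches the two groups scores higher than the one that mixes them. Neither function needs ground truth labels, only the points and the labels produced by a clustering algorithm.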

Since clustering of geolocations is an unsupervised learning task, it is difficult to settle on a single standard technique for clustering passenger pickup geolocations. Hence, there is a need for a framework that clusters passenger geolocations and thereby helps the cab recommender system recommend the nearest passenger-finding locations to cab drivers.

Therefore, the research paper aims to design and develop an algorithm for CBCRS that generates clusters of passenger pickup geolocations using Hierarchical Clustering algorithms such as Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) and Clustering Using REpresentatives (CURE); Partition-Based Clustering algorithms such as K-Means, Mini Batch K-Means and Spectral Clustering (Ng et al., 2002); and Density-Based Clustering algorithms such as Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Ordering Points To Identify the Clustering Structure (OPTICS). The proposed algorithm is evaluated on three unlabelled datasets from New York, Porto and Mexico City using the Silhouette Coefficient and the Calinski-Harabasz Index.
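
A comparison of this kind might be sketched with scikit-learn as follows. This is not the paper's proposed algorithm: it is a minimal illustration that runs four of the named techniques on synthetic points and scores each result with the two indices. The hotspot coordinates, the parameter values (`n_clusters`, `threshold`, `eps`, `min_samples`) and the `evaluate` helper are assumptions chosen for the synthetic data. Coordinates in degrees are treated as Euclidean for simplicity. CURE is left out because scikit-learn does not provide it, and Spectral Clustering and OPTICS are left out only for brevity.

```python
import numpy as np
from sklearn.cluster import DBSCAN, Birch, KMeans, MiniBatchKMeans
from sklearn.metrics import calinski_harabasz_score, silhouette_score

rng = np.random.default_rng(0)

# Synthetic pickup points scattered around three hypothetical hotspots (lat, lon).
hotspots = np.array([[40.75, -73.99], [40.71, -74.01], [40.64, -73.78]])
X = np.vstack([h + rng.normal(scale=0.004, size=(200, 2)) for h in hotspots])

# Parameter values are illustrative guesses tuned to the synthetic data above.
models = {
    "KMeans": KMeans(n_clusters=3, n_init=10, random_state=0),
    "MiniBatchKMeans": MiniBatchKMeans(n_clusters=3, n_init=10, random_state=0),
    "BIRCH": Birch(n_clusters=3, threshold=0.002),
    "DBSCAN": DBSCAN(eps=0.005, min_samples=10),
}


def evaluate(models, X):
    """Fit each model and return {name: (silhouette, calinski_harabasz) or None}."""
    results = {}
    for name, model in models.items():
        labels = model.fit_predict(X)
        mask = labels != -1  # DBSCAN labels noise points as -1; exclude them
        if len(set(labels[mask])) < 2:
            results[name] = None  # both scores need at least two clusters
            continue
        results[name] = (
            silhouette_score(X[mask], labels[mask]),
            calinski_harabasz_score(X[mask], labels[mask]),
        )
    return results


results = evaluate(models, X)
for name, scores in results.items():
    if scores is None:
        print(f"{name}: fewer than two clusters found")
    else:
        print(f"{name}: silhouette={scores[0]:.3f}, calinski_harabasz={scores[1]:.1f}")
```

On real trip data, the density-based parameters in particular would need tuning per city, and a projected coordinate system or a haversine metric may be more appropriate than raw degrees.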
