1. Introduction
In recent times, social media data has been used to extract useful information through pattern detection, trend analysis, and event identification (Cui, Xiaohui, Nanhai Yang, Zhibo Wang, Cheng Hu, Weiping Zhu, Hanjie Li, Yujie Ji, and Cheng Liu, 2015). Social networking sites like Twitter (Sangeeta Grover and Gagangeet Singh Aujla, 2015), a source of epidemic-related information, enable public health officials to take early disease control measures in vulnerable locations at the right time. Mining social media data makes it possible to track the effect and spread of the chikungunya outbreak in Delhi. Data related to epidemic outbreaks are generated in large volumes on social media platforms like Twitter. Twitter acts as a platform from which meaningful and useful information can be extracted out of the unstructured data generated. Unlike other information sources, Twitter provides real-time data and reflects ongoing events and the latest updates around the world (Kumar, Shamanth, Fred Morstatter, and Huan Liu, 2014). The data generated has been analyzed not only to gain a glimpse of public opinion, but also to monitor diseases and to provide health-related services efficiently at minimal cost.
Chikungunya, an alphavirus, is transmitted by “Aedes” mosquitoes. Every year, a large number of people are infected through these mosquitoes, which puts a load on the health care system. Chikungunya is a major cause of concern to public health in India. It has never been a great concern in north India, but cases in Delhi and other parts of north India have risen suddenly (Figure 1). According to the report, North Delhi has been the worst affected area this year, i.e., 2016 (Chikungunya, dengue sting India, 2016) (Saxena, A., Goyal, L. M., & Mittal, M. 2015).
Figure 1. Delhi’s sudden rise in Chikungunya cases
The paper is divided into sections. Section 2 gives a brief overview of chikungunya and the social media platform Twitter. Section 3 presents the background study on the topic. Section 4 explores the overall process of analyzing social media data. Section 5 describes the main challenges faced while analyzing epidemic
2. Background
2.1 Chikungunya
Chikungunya virus is transmitted when an infected “Aedes aegypti” mosquito bites a human; it was first found in modern-day Tanzania in 1952–1953 (HO, World health day, 2014). Chikungunya virus (CHIKV) belongs to the genus “Alphavirus” of the “Togaviridae” family. The name 'Chikungunya' derives from a word in the “Kimakonde” language meaning “that which bends up”, i.e., that which becomes contorted. The name refers to the stooped appearance of sufferers due to joint pain (arthralgia). Symptoms of a chikungunya infection appear 4-7 days after a bite from an infected mosquito. Chikungunya is an infectious disease characterized by “fever”, “arthralgia”, and “myalgia”; it is found in all age groups, but severe and complex cases are most often seen in children and the elderly. According to the World Health Organization, the disease is commonly found in tropical and sub-tropical regions and in places where access to safe drinking water and sanitation systems is difficult for much of the population. According to the NVBDCP report, 10851 suspected chikungunya cases were found up to October 2016; a big upsurge in chikungunya is under way in the city of Delhi.
Twitter is a social media website that lets individuals post 140-character messages (and images) to the world (Ritterman J, Osborne M and Klein, 2009). Twitter uses short blog-style messages that focus on one trending topic at a particular time. Hashtags (#) are used to discuss a topic, such as #IndianPolitics or #SolarEclipse. Twitter provides an API through which a developer can access the text and metadata of tweets. Extracted tweets are structured in a particular format: each tweet carries its text content, location, author name, and a tweet ID. The text is further processed and converted into structured data for mining the information.
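The conversion from an extracted tweet to a structured record can be illustrated with a minimal Python sketch. It assumes the tweet has already been retrieved and is available as a dictionary. The field names (`id`, `author`, `location`, `text`), the `TweetRecord` type, and the `to_record` helper are hypothetical names chosen for illustration; they are not the Twitter API's actual schema and not necessarily the processing used in this paper.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Patterns for hashtags and URLs inside the tweet text.
HASHTAG_RE = re.compile(r"#(\w+)")
URL_RE = re.compile(r"https?://\S+")


@dataclass
class TweetRecord:
    """Structured view of one tweet (hypothetical layout)."""
    tweet_id: str
    author: str
    location: str
    text: str
    hashtags: List[str] = field(default_factory=list)


def to_record(raw: dict) -> TweetRecord:
    """Convert a raw tweet dictionary (assumed field names) into a record."""
    text = raw.get("text", "")
    # Collect hashtags in lower case so "#Delhi" and "#delhi" match.
    hashtags = [tag.lower() for tag in HASHTAG_RE.findall(text)]
    # Drop URLs and collapse repeated whitespace.
    cleaned = " ".join(URL_RE.sub("", text).split())
    return TweetRecord(
        tweet_id=str(raw.get("id", "")),
        author=raw.get("author", ""),
        location=raw.get("location", ""),
        text=cleaned,
        hashtags=hashtags,
    )


# Made-up sample tweet, for illustration only.
sample = {
    "id": 1,
    "author": "example_user",
    "location": "Delhi",
    "text": "Fever and joint pain all week  #Chikungunya #Delhi http://example.com/x",
}
record = to_record(sample)
```

Records in this form can then be grouped by location or hashtag, which is the kind of structured input that later mining steps need.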