1. Introduction
Outlier detection in wireless sensor networks (WSNs) has been an active research area in critical real-life application scenarios such as remote patient health monitoring, environmental monitoring, engineering structures, industrial process monitoring, fraud detection, target tracking, and military operations. Wirelessly interconnected sensor nodes are densely deployed across a geographical area, where they collect sensed data and send it to a central server or sink. The collected data are sometimes unreliable and inaccurate because of the imperfect nature of WSNs, which have low battery power, low memory, and low communication bandwidth (Wang et al., 2006). In the context of WSNs, an outlier, also known as an anomaly, is defined by Hawkins as follows: “An outlier is an observation that deviates so much from other observations as to arouse suspicion that it was generated by a different mechanism” (D.M. Hawkins, 1980). Barnett and Lewis give the definition “An outlier is an observation or subset of observations that appears to be inconsistent with the rest of the set of data” (Barnett and Lewis, 1994). Sadik et al. defined an outlier as follows: “An outlier is a data point which is significantly different from other data points, or does not conform to the expected normal behavior, or conforms well to a defined abnormal behavior” (Zhang et al., 2010).
Outliers, which are classified as local or global depending on the data, affect the quality of the information obtained from WSNs. In terms of data dimension, data streams can be univariate, with a single attribute, or multivariate, with multiple attributes. Outliers occur when there are deviations in the sensed data, which are correlated in both time and space. A temporal anomaly arises from changes in data over time at a single node location. A spatial anomaly is identified by comparison with neighboring nodes, whereas a spatiotemporal anomaly arises from changes in data values over both time and space at a larger number of node locations.
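The distinction between temporal and spatial anomalies can be illustrated with a minimal Python sketch. It is not a method from the cited literature: the function names, the median-based baseline, the window size, and the threshold are all assumptions chosen for illustration, and the readings are made-up values.

```python
from statistics import median


def temporal_anomalies(readings, window=5, threshold=3.0):
    """Return indices where one node's reading deviates from the median
    of its own preceding `window` readings by more than `threshold`.
    (Hypothetical helper; window and threshold are illustrative.)"""
    flagged = []
    for i in range(window, len(readings)):
        baseline = median(readings[i - window:i])
        if abs(readings[i] - baseline) > threshold:
            flagged.append(i)
    return flagged


def spatial_anomalies(node_values, neighbors, threshold=3.0):
    """Return nodes whose current reading deviates from the median of
    their neighbors' current readings by more than `threshold`.
    (Hypothetical helper; the neighbor map is assumed to be known.)"""
    flagged = []
    for node, value in node_values.items():
        nearby = [node_values[n] for n in neighbors.get(node, []) if n in node_values]
        if nearby and abs(value - median(nearby)) > threshold:
            flagged.append(node)
    return sorted(flagged)


# Temporal view: one node, readings over time (made-up temperatures).
series = [20.0, 20.1, 19.9, 20.2, 20.0, 35.0, 20.1]
temporal_result = temporal_anomalies(series)

# Spatial view: four neighboring nodes at the same instant (made-up values).
snapshot = {"a": 20.0, "b": 20.3, "c": 29.0, "d": 19.8}
links = {
    "a": ["b", "c", "d"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["a", "b", "c"],
}
spatial_result = spatial_anomalies(snapshot, links)
```

In this sketch the temporal check uses only a node's own history, while the spatial check uses only a single snapshot of neighboring nodes. A spatiotemporal check would combine both views. The median is used as the baseline because it is less sensitive than the mean to the outliers being searched for.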
The various sources of outliers are shown in Fig. 1. Because the nodes are deployed in harsh and hostile environments, faults in WSNs are likely to occur unexpectedly and frequently, ranging from simple permanent faults to faults in which the node behaves maliciously (Mahapatro and Khilar, 2013). Such faults arise from hardware or software failures that leave a node inactive or cause it to give erroneous outputs. Noise or errors arise from a noise-related measurement of a faulty sensor. Malicious attacks can be passive, where the data exchanged in the network are obtained without interrupting communication, or active, where the attacker aims to degrade the functionality of the network by injecting false and corrupted data (Bhushan and Sahoo, 2018). These attacks relate to network security (Puri and Bhushan, 2019). Because of the nodes' low computing power, high-end security solutions cannot be implemented, making the nodes more vulnerable to security threats. Outliers are measurements, taken by defective or compromised sensor nodes, that vary considerably from the normal pattern of sensed data. Unattended outliers can have hazardous consequences in terms of environmental damage, human life, and economic hardship. Because a substantial number of WSNs will be used in safety-critical applications, outliers may cause misunderstanding or undesired alerts, resulting in life-threatening incidents. Outlier detection therefore supports data reliability and the effective and secure functioning of the network. The major challenges for any outlier detection technique are identifying outlier sources, node resource constraints, reducing communication overhead, routing (Bhushan and Sahoo, 2019), processing large amounts of distributed data online under dynamically changing behavior, and frequent communication failure between nodes due to large-scale deployment in an unattended environment (Mahapatro and Khilar, 2011).
Figure 1. Outlier sources in WSNs (Zhang et al., 2010)