What IoT Edge analytics is all about

Internet of Things (IoT) analytics refers to the collection and analysis of data stemming from a large number of heterogeneous Internet-connected objects. IoT analytics is an integral element of the vast majority of IoT applications, which process data in order to offer data-intensive services or to drive actuation and control decisions. IoT analytics systems are typically Big Data systems, since they exhibit the Vs of Big Data, in particular:

  • Volume: They involve multiple sensors and Internet-connected objects, which continually produce very large amounts of data.
  • Variety: IoT data stem from numerous heterogeneous multi-vendor devices of many different types.
  • Veracity: Most IoT devices (such as sensors) produce measurements and observations that are associated with uncertainty.
  • Velocity: Sensors and IoT devices produce streaming data with very high ingestion rates.

The Velocity of IoT data streams is usually the attribute that differentiates IoT analytics systems from the majority of conventional Big Data systems, which handle large volumes of transactional data. Therefore, IoT analytics systems are usually supported by middleware frameworks for streaming data (such as the open source Apache Storm, Apache Spark and Apache Flink frameworks), rather than by the popular batch-oriented MapReduce Big Data processing framework.
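To make the streaming style of processing concrete, here is a minimal sketch in plain Python of a tumbling-window average over a sensor stream. It is an illustration only and does not use the API of Storm, Spark or Flink. The function name `tumbling_window_mean`, the `(timestamp, value)` reading format and the assumption that readings arrive in timestamp order are all hypothetical choices made for this example.

```python
from typing import Iterable, Iterator, Optional, Tuple


def tumbling_window_mean(
    readings: Iterable[Tuple[float, float]], window_seconds: float
) -> Iterator[Tuple[float, float]]:
    """Yield (window_start, mean_value) for each non-empty tumbling window.

    Assumption: readings are (timestamp, value) pairs in timestamp order.
    Results are emitted as soon as a window closes, so the stream never
    has to be stored in full before it is analysed.
    """
    window_start: Optional[float] = None
    total = 0.0
    count = 0
    for timestamp, value in readings:
        # Align the reading to the start of the window it belongs to.
        start = (timestamp // window_seconds) * window_seconds
        if window_start is None:
            window_start = start
        elif start != window_start:
            # The previous window is complete: emit its mean and reset.
            yield window_start, total / count
            window_start, total, count = start, 0.0, 0
        total += value
        count += 1
    if count:
        # Emit the last, possibly partial, window when the stream ends.
        yield window_start, total / count


if __name__ == "__main__":
    stream = [(0.0, 1.0), (1.0, 3.0), (5.0, 10.0), (11.0, 4.0)]
    for window in tumbling_window_mean(stream, window_seconds=5.0):
        print(window)
```

A real streaming engine would add concerns this sketch ignores, such as out-of-order events, parallelism and fault tolerance.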

Given their Big Data nature, IoT analytics systems are usually integrated with Cloud computing infrastructures, in order to take advantage of the scalability, storage capacity and processing performance of the Cloud. However, direct Cloud integration is not the most scalable option in cases where data from millions of geographically dispersed and highly distributed IoT devices needs to be stored and processed. In such cases, streaming IoT data to the Cloud leads to very high bandwidth consumption, significant network latency and the storage of a large amount of information of questionable business value. The emerging edge (or fog) computing paradigm provides a compelling alternative to pure Cloud computing solutions, by enabling the processing of data streams at the very edge of the network.

Edge computing involves the deployment of an additional storage and processing layer between the IoT devices and the Cloud. This layer consists of edge (or fog) nodes, which range from IoT-based constrained devices (such as embedded gateways) to entire (small scale) data centers. Edge nodes are deployed close to end-users and IoT infrastructures in order to facilitate processing of IoT data (including mobile users’ data) prior to the integration of these data in the back-end centralised Cloud. IoT data processing at this edge computing layer is conveniently called IoT Edge Analytics. Based on data analysis at the edge, IoT applications can:

  • Filter and reduce the amount of IoT data, which must be stored at the edge nodes or streamed to the back-end cloud.
  • React to the fulfillment of a condition or the identification of an event through triggering some actuation or control functionality.
  • Cache (selected) data streams, notably the ones that are expected to be frequently accessed and used by other systems or applications.
  • Configure privacy settings relating to the storage of and access to a user’s personal data.
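The four capabilities listed above can be sketched together in a few lines of Python. This is a hedged illustration under stated assumptions, not a reference design: the `EdgeNode` class, the dictionary-based reading format with a numeric `"value"` key, the deadband filter, the fixed alarm limit and the `user_id` field are all hypothetical names and choices introduced for this example.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict, Optional, Tuple

Reading = Dict[str, Any]


class EdgeNode:
    """Hypothetical edge node that filters, reacts, caches and strips private fields."""

    def __init__(
        self,
        deadband: float,
        alarm_limit: float,
        on_alarm: Callable[[Reading], None],
        cache_size: int = 100,
        private_fields: Tuple[str, ...] = ("user_id",),
    ) -> None:
        self.deadband = deadband
        self.alarm_limit = alarm_limit
        self.on_alarm = on_alarm
        self.private_fields = private_fields
        # Bounded cache of recent readings for local consumers.
        self.cache: Deque[Reading] = deque(maxlen=cache_size)
        self.last_forwarded: Optional[float] = None

    def process(self, reading: Reading) -> Optional[Reading]:
        """Return the reading to forward to the cloud, or None to drop it."""
        value = float(reading["value"])

        # React: trigger actuation locally, without a round trip to the cloud.
        if value > self.alarm_limit:
            self.on_alarm(reading)

        # Privacy: remove fields configured as personal before storing or forwarding.
        cleaned = {k: v for k, v in reading.items() if k not in self.private_fields}

        # Cache: keep recent readings at the edge.
        self.cache.append(cleaned)

        # Filter: forward only changes of at least the deadband.
        if self.last_forwarded is not None and abs(value - self.last_forwarded) < self.deadband:
            return None
        self.last_forwarded = value
        return cleaned


if __name__ == "__main__":
    alarms = []
    node = EdgeNode(deadband=0.5, alarm_limit=30.0, on_alarm=alarms.append)
    for v in (20.0, 20.2, 21.0, 35.0):
        print(node.process({"sensor": "t1", "user_id": "u42", "value": v}))
    print("alarms raised:", len(alarms))
```

The deadband filter is only one possible reduction strategy; sampling or windowed aggregation at the edge would serve the same purpose.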

Edge analytics functionalities deliver tangible benefits to developers, deployers and operators of IoT analytics systems, including reduced network bandwidth and associated costs, reduced storage costs, lower latency in IoT data collection and analysis, as well as enhanced privacy for end-users. Furthermore, edge analytics is a must for real-time control applications where high latency cannot be tolerated, as well as for mobile scenarios involving roaming users who provide their data to the edge node closest to them.

Edge analytics is usually only one part (subsystem) of a fully fledged IoT analytics system. The latter also involves a back-end cloud, where information from multiple edge nodes is collected and processed as part of a second, centralized cloud analytics subsystem. IoT analytics systems involve continuous interactions between their “edge analytics” and “cloud analytics” subsystems, since there are several use-cases where processing at the Cloud (e.g., enterprise data processing in manufacturing) can trigger edge analytics algorithms and drive control decisions at the IoT layer (e.g., configuration of a production line) and vice versa.
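One way to picture this two-way interaction is the following minimal Python sketch, in which edge nodes send compact summaries upstream and the cloud sends a configuration back. Everything here is assumed for illustration: the `EdgeAnalytics` and `CloudAnalytics` classes, the summary format, and the rule that the new alarm limit is the fleet-wide mean plus a fixed margin.

```python
from statistics import mean
from typing import Dict, List


class EdgeAnalytics:
    """Hypothetical edge subsystem: ingests raw values, reports only summaries."""

    def __init__(self, name: str, alarm_limit: float) -> None:
        self.name = name
        self.alarm_limit = alarm_limit
        self.values: List[float] = []

    def ingest(self, value: float) -> None:
        self.values.append(value)

    def summarize(self) -> Dict[str, object]:
        # Only a compact summary leaves the edge, not the raw stream.
        count = len(self.values)
        summary = {
            "edge": self.name,
            "count": count,
            "mean": mean(self.values) if count else 0.0,
        }
        self.values = []
        return summary

    def apply_config(self, config: Dict[str, float]) -> None:
        # Cloud-side analysis drives behaviour at the edge.
        self.alarm_limit = config["alarm_limit"]


class CloudAnalytics:
    """Hypothetical cloud subsystem: combines summaries from many edge nodes."""

    def __init__(self, margin: float) -> None:
        self.margin = margin
        self.summaries: List[Dict[str, object]] = []

    def collect(self, summary: Dict[str, object]) -> None:
        self.summaries.append(summary)

    def derive_config(self) -> Dict[str, float]:
        # Count-weighted mean across all edge nodes, plus an assumed margin.
        total = sum(int(s["count"]) for s in self.summaries)
        if total == 0:
            raise ValueError("no data collected from edge nodes")
        weighted = sum(float(s["mean"]) * int(s["count"]) for s in self.summaries)
        return {"alarm_limit": weighted / total + self.margin}


if __name__ == "__main__":
    edges = [EdgeAnalytics("line-a", 100.0), EdgeAnalytics("line-b", 100.0)]
    edges[0].ingest(10.0)
    edges[0].ingest(20.0)
    edges[1].ingest(30.0)

    cloud = CloudAnalytics(margin=5.0)
    for edge in edges:
        cloud.collect(edge.summarize())

    config = cloud.derive_config()
    for edge in edges:
        edge.apply_config(config)
    print(config)
```

In a deployed system the two subsystems would communicate over a network protocol rather than direct method calls, which this sketch leaves out.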

Despite the importance of edge analytics systems for IoT, their development is still at an early stage. While there are several enterprise products that distribute, deploy and use streaming middleware engines across multiple edge nodes, there is still a lot of room for improving the performance and the latency of distributed queries over these engines. Depending on the target application, relevant optimizations should take into account criteria such as the geo-location of IoT nodes, the criticality of specific IoT devices and their streams, as well as traffic or delay-related limitations of specific sensors or actuators. In the next couple of years we can expect the emergence of a host of novel products for efficient IoT edge analytics.

John Soldatos is an Internet of Things, Cloud Computing and JavaEE consultant, writer and published author.

All information, views and opinions expressed in this article are those of the author. This Website may or may not agree with them.
