Resolving Disassociated Processing of Real-Time and Historical Data in IoT
With the increasing pace of digital disruption, many enterprises today are focused on staying ahead with real-time data and real-time insights. Real-time analytics provides an opportunity to make proactive decisions, eliminate risks, and gain competitive advantage in the marketplace, allowing companies to react quickly to changing conditions by tapping into data that is always on. For example, a healthcare app could use real-time monitoring information to surface early warning signs and save patients' lives. But the challenge of processing real-time data from a variety of sensors and mobile and remote devices, in combination with historical datasets that enrich the actionable insights, adds complexity to already sophisticated pipelines. There are a few solutions, each of which attaches two data flows to the same pipeline and therefore requires the same data to be processed twice.
The most advanced solution is to combine the batch, stream-processing, and serving-DB components into an in-memory data fabric. Such a system works transparently with stored and streamed data at the same time, in a transactional fashion; the data fabric therefore becomes the single source of truth. Alexandre Boudnik, a computer scientist who has worked on compilers, hardware emulators, and testing tools for over 20 years, coined the term Iota architecture (after the Greek letter ι) for this pattern.
One example of this solution uses Apache Kafka in combination with Apache Ignite to serve messages and to process streams alongside the data retained in secondary storage (Apache Cassandra, HDFS, or even a traditional RDBMS server). Feature-rich in-memory data fabrics like Ignite:
• Behave as a data sink with persistence guarantees, either in secondary database storage or in a distributed file system
• Implement distributed computation models for stateful CEP and streaming
• Expose APIs for applications written in Java, Groovy, Scala and other languages
• Provide complete support for SQL querying including indexing, distributed joins, and more
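The unifying idea behind such a fabric can be illustrated with a small, library-free sketch (plain Python; the class and method names are invented for illustration and are not the Ignite API): a single transactional in-memory store both ingests streaming events and serves historical queries, so the streaming and serving paths read from the same source of truth.

```python
import threading
from collections import defaultdict

class DataFabric:
    """Toy in-memory 'data fabric': one transactional store that both
    ingests streaming events and answers historical queries.
    (Illustrative only -- a real fabric such as Apache Ignite adds
    persistence, distribution, SQL querying, and CEP on top of this idea.)"""

    def __init__(self):
        self._lock = threading.Lock()      # crude transactional boundary
        self._events = defaultdict(list)   # key -> ordered readings

    def ingest(self, sensor_id, value):
        """Streaming path: append a reading atomically."""
        with self._lock:
            self._events[sensor_id].append(value)

    def query_avg(self, sensor_id):
        """Serving path: historical query over the very same store."""
        with self._lock:
            values = self._events[sensor_id]
            return sum(values) / len(values) if values else None

fabric = DataFabric()
for reading in (36.6, 36.8, 38.9):         # simulated sensor feed
    fabric.ingest("patient-42", reading)

# One store serves both the fresh stream and the accumulated history.
print(fabric.query_avg("patient-42"))
```

Because there is only one store, there is no second pipeline to reconcile against at query time, which is precisely what makes the fabric the single source of truth.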
Another approach, adopted earlier, is known as Lambda architecture (after the Greek letter λ): two intake layers deal with the incoming data at different speeds, reconciling at the query point (commonly called the serving DB). While offering better delivery SLAs, it is not free of interoperation impedance and carries high operational, hardware, and management costs. One particular issue is correct recovery after the failure of an intake layer. The recovery logic is frequently pushed into the client software, forcing it to become stateful and more complex as a result. Changing and deploying stateful code in a distributed system can be quite an intricate undertaking, especially when data must be reprocessed once the new code is provisioned and running. One possible optimization, sometimes dubbed Kappa architecture (after the Greek letter κ), combines the batch and stream-processing components into a single subsystem, which the serving DB then uses for querying. Some telecommunications companies use this kind of processing to capture sensor feeds and telemetry through Apache Kafka and pipe them into a streaming dataflow engine, such as Apache Flink, for analytics.
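The Kappa idea can be sketched in a few lines of plain Python (no Kafka or Flink here; the log and processing functions are invented stand-ins): all data lives in one append-only log, and "batch" reprocessing is simply replaying that log through a new version of the streaming code rather than running a separate batch layer.

```python
class Log:
    """Append-only log standing in for a Kafka topic."""
    def __init__(self):
        self.entries = []

    def append(self, record):
        self.entries.append(record)

    def replay(self):
        """Replay from the beginning -- this is the 'batch' path."""
        yield from self.entries

def process_v1(stream):
    # First deployment of the streaming code: count events per device.
    counts = {}
    for device, _value in stream:
        counts[device] = counts.get(device, 0) + 1
    return counts

def process_v2(stream):
    # New code: sum the readings instead. No second pipeline is needed;
    # the same log is simply replayed through the new logic.
    totals = {}
    for device, value in stream:
        totals[device] = totals.get(device, 0) + value
    return totals

log = Log()
for record in [("sensor-a", 2), ("sensor-b", 5), ("sensor-a", 3)]:
    log.append(record)

print(process_v1(log.replay()))   # {'sensor-a': 2, 'sensor-b': 1}
print(process_v2(log.replay()))   # {'sensor-a': 5, 'sensor-b': 5}
```

The design choice this illustrates is that reprocessing after a code change becomes a replay of the single log, instead of the Lambda-style problem of keeping two intake layers consistent.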
The more advanced solution, the Iota architecture, is the recommended one. The Iota design pattern has a number of unique properties: elimination of the need for expertise spanning multiple programming models and platforms, reduced hardware needs, low data-center operational complexity, a shorter application development and deployment loop, and low-cost, long-term ownership. This combination helps to increase the data platform's ROI by bringing down capital expenditures on rapidly commoditized computer systems and by using smaller development and cluster-operation teams.