Adapting to Big Data Streaming and AI

By CIOReview | Tuesday, May 1, 2018

Big data streaming is one of the latest technology trends, and it requires rethinking the fundamentals of information architecture. In some cases, streaming is the only practical way to cope with today's rapid growth in data volume. Experts believe that big data streaming demands a fundamentally different approach, one that will sit alongside the established extract, transform, and load (ETL) methodology. The point is not to replace ETL with streaming across the organization; rather, newer workloads that deal with big data call for a streaming style of processing.
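The contrast between the two styles can be sketched in a few lines of Python. This is an illustrative toy, not any particular product's API: the event stream and the doubling "transform" are invented for the example.

```python
# Batch ETL vs. streaming, side by side.
# The events and the transform (doubling each value) are illustrative assumptions.

def batch_etl(events):
    """Batch ETL: extract everything first, transform it, then load the result."""
    transformed = [e["value"] * 2 for e in events]  # transform step over the full batch
    return sum(transformed)                         # load/aggregate once, at the end

def streaming(events):
    """Streaming: process each event as it arrives, keeping running state."""
    total = 0
    for e in events:              # 'events' could just as well be an unbounded generator
        total += e["value"] * 2   # transform and fold in one step per event
        yield total               # an up-to-date result is available after every event

events = [{"value": v} for v in (1, 2, 3)]
print(batch_etl(events))        # 12 -- one answer after the whole batch
print(list(streaming(events)))  # [2, 6, 12] -- an answer after every event
```

The final numbers agree; the difference is that the streaming version never needs the whole dataset in hand, which is exactly what makes it viable for unbounded, high-volume feeds.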

AI is not just about robots and self-driving cars; it is also about data. The data for training and testing AI can come from many sources: e-commerce transactions, customer account data, ERP and CRM systems, and call center recordings, among others. It can also come from the internet of things (IoT), streaming sensor data, and publicly available or third-party information. All of these sources can feed the development of effective AI models.
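In practice, building training data means joining several of these sources together. The following sketch combines two of the sources named above, CRM account data and IoT sensor readings, into one training table; the field names, values, and join key are invented for illustration.

```python
# Hypothetical sketch: merging CRM attributes with IoT readings into training rows.
# All identifiers and values here are assumptions made up for the example.

crm = {
    "acct-1": {"segment": "enterprise"},
    "acct-2": {"segment": "smb"},
}
iot_readings = [
    {"account": "acct-1", "temp_c": 71.5},
    {"account": "acct-2", "temp_c": 64.0},
]

# Join each sensor reading with the matching customer attributes.
training_rows = [
    {**reading, **crm[reading["account"]]}
    for reading in iot_readings
    if reading["account"] in crm    # skip readings with no known account
]
print(training_rows)
```

Each resulting row carries both operational signal (the sensor value) and business context (the customer segment), which is the kind of blended record an AI model is typically trained on.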

For instance, predictive maintenance needs good-quality structured data such as time series, event data, and graph data, along with unstructured data like text, images, and audio. This data may reside in flat files, additional databases, the Hadoop Distributed File System (HDFS), in-memory stores for high-performance machine learning, and even text-based serializations. Today, data architecture is the foundation for AI: for a well-functioning AI-enabled system, good data quality is crucial.
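To make the predictive-maintenance case concrete, here is a minimal sketch of turning raw time-series readings into model features, using only the Python standard library. The window size and the vibration values are illustrative assumptions, not real equipment data.

```python
# Rolling-window features from a time series, a common first step in
# predictive maintenance. Readings and window size are invented for the example.
from statistics import mean, stdev

# e.g. vibration amplitude sampled from a machine; the jump near the end
# is the kind of pattern a failure-prediction model would learn to flag.
readings = [0.42, 0.45, 0.44, 0.47, 0.90, 0.95]

def rolling_features(series, window=3):
    """Compute mean and standard deviation over each sliding window.

    A rising rolling mean or stdev is a simple, interpretable signal
    that equipment behavior is drifting from normal.
    """
    feats = []
    for i in range(len(series) - window + 1):
        w = series[i : i + window]
        feats.append({"mean": round(mean(w), 3), "stdev": round(stdev(w), 3)})
    return feats

for f in rolling_features(readings):
    print(f)
```

The later windows show both statistics climbing sharply, which is the kind of engineered feature that would then be fed, alongside event and unstructured data, into a maintenance model.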