Harish Butani, Co-Founder & CTO
Today, most organizations are investing heavily to put an effective Business Intelligence (BI) system in place. But poor performance when executing SQL drill-down, slice-and-dice, and OLAP/MDX-style queries on Hadoop-scale data poses a significant challenge to bringing those BI investments to fruition. Having been part of the BI and analytics industry for over two decades, experts at Sparkline Data are harnessing the trend of increased use of Hadoop/Spark in BI environments. The company leverages Apache Spark and Druid to offer a BI platform that enables businesses to improve their decision-making.
“We are changing how BI works on large datasets. Sparkline Accelerator is our first step in executing interactive BI queries in seconds on Hadoop scale data,” says Sri Desikan, Founder, Sparkline Data. The company supports live BI analysis on terabyte-scale datasets, with Tableau support for visualization. By leveraging OLAP indexing, the accelerator reduces query time on terabyte data volumes. This helps businesses avoid spending time and resources on building intermediate tables, and it improves the efficiency and speed of BI systems while keeping costs in check. “Our approach in speeding up BI revolves around leveraging OLAP indexing, simplifying the ETL and building on the Spark platform to enable users to leverage tools like Tableau for their visualization needs,” says Harish Butani, Co-founder and CTO, Sparkline Data. “Spark is the key to our long-term strategy because its architecture allows extensions to be plugged in that benefit all workloads: relational, machine learning, and graph operations.”
Interestingly, Sparkline Data’s commitment to Spark and open source is evident from the fact that both Harish Butani and Laljo John Pullokkaran, Co-founder, Sparkline Data, are Apache Hive Project Management Committee members. Driven by this expertise, Sparkline Data has been helping companies in media, ad tech, and telecom solve their analytics challenges.
In one instance, a client had a large data lake in Hadoop, along with ETL extracts of several summary tables, Tableau extracts, and more. Analytics reporting was set up in Tableau. The challenge was that creating numerous materialized aggregate tables not only limited the client's ability to analyze data over a wide time period but was also expensive and error-prone. Further, there was always a tradeoff between speed and data size, which led to even smaller extracts. With Sparkline Accelerator in place, the ETL pipeline was simplified, allowing the customer to run analytics on live, lower-grain data spanning multiple years, resulting in high ROI.
To take this commitment forward and make BI even more efficient, Sparkline Data is also creating a platform that promises a high-level abstraction with which customers can express advanced BI queries, business KPIs, and metrics centrally, and which lets IT and engineering manage a single BI platform on Hadoop data. With this arrangement, business users can still use their existing BI tools, such as Tableau or even Excel, against live Hadoop data. “This smart BI platform will facilitate interactive ad hoc analysis on large datasets while automating the management of intermediate tables,” says Butani.
“We are into the BI domain for long haul,” proclaims Butani. Driven by this zeal to help clients get complete value from their BI investments, Sparkline Data is set to move ahead by expanding its expertise and building smart BI products.