Citus Data: Building a Cost Effective Scalable Data Platform

Umur Cubukcu, President
Years later, even after holding multiple roles at different companies, Umur Cubukcu’s Stanford University friendships held firm when he decided to co-found a company with his friends to analyze data and extract valuable insights. Echoing the market’s voice, Cubukcu, Co-Founder and President of Citus Data, reflects on enterprises’ need to scale out horizontally to accommodate the ever-expanding volume and diversity of Big Data. “We started Citus Data to provide a cost effective and simple way to explore growing data volumes and new data types, without throwing away all the great database capabilities built over the last twenty years,” says Cubukcu.



PostgreSQL, a leading open source object-relational database with a reputation for reliability and enterprise functionality built over more than 15 years, plays a central role in Citus Data’s approach. PostgreSQL comes with a rich ecosystem that easily extends the database with new structured and semi-structured data types, SQL operators, and storage formats, providing the much-needed flexibility that enables enterprises to continually adapt to evolving needs. With CitusDB, Citus Data scales out PostgreSQL across multiple cores and hundreds of commodity machines, while staying true to the core PostgreSQL project and its evolving ecosystem at every release.
“At its core, CitusDB merges the reliable enterprise features of PostgreSQL with Hadoop-like scalability, offering Big Data analytics companies worldwide a simple and powerful analytics database,” says Cubukcu.

CitusDB enables real-time responses to ad hoc SQL queries and can run either as a standalone cluster for simplicity and performance, or natively on Hadoop to leverage the Hadoop stack. With its recent CitusDB version 3.0 and columnar storage extension, Citus Data supports a wide variety of analytic workloads.

Customers across multiple verticals, including retail, mobile analytics, adtech, ecommerce, and security, rely on CitusDB for flexible and fast access to large volumes of data. In particular, CitusDB powers interactive analytics and dashboards on rapidly growing time-series data such as clickstream, ad impressions, network logs, sensor data, and other machine-to-machine data. Exploring terabytes of data within seconds using industry-standard visualization tools delights end users and empowers analysts to deliver new insights.

The company’s clientele includes Migros and Agari Data, among others. Migros, an Istanbul-based supermarket chain with more than 1,000 stores, uses CitusDB to analyze all movements of its fleet of trucks. CitusDB stores the data that each truck regularly sends via satellite, and powers an interactive Tableau dashboard and dynamic maps. These capabilities help Migros drive down supply chain costs by identifying the most congested routes and assessing the efficiency and costs of its distribution centers. Agari Data, which has analyzed over 1 trillion e-mail messages and secures more than 85 percent of U.S. consumer e-mails, relies on CitusDB at the core of its SaaS security solution. “CitusDB works extremely well for us,” remarks Vidur Apparao, CTO of Agari.

Citus Data will continue to focus on expanding in existing and new verticals and plans to grow in new geographies as well. Citus Data’s vision, Cubukcu says, lies in “leveraging the extensibility and reliability of PostgreSQL with the performance and economics of horizontal scalability. Our combined offering is a simple, powerful and versatile data platform for users worldwide.”

For more information, please visit www.citusdata.com

Company
Citus Data

Headquarters
San Francisco, CA

Management
Umur Cubukcu, President

Description
Provides a horizontally scalable PostgreSQL database that runs SQL queries over very large data sets in real time.