Leveraging Big Data to Facilitate a Better Tomorrow

Kamelia Aryafar, Senior Vice President, Overstock.com

1. In light of your experience, what trends and challenges have you witnessed in the Big Data landscape?

In the machine learning world, we need large amounts of training data. This can be a challenge when the data sits in silos across multiple platforms, because bringing it together requires resources from ETL and data warehouse teams. Data quality and integrity are also a challenge here because it takes resources to clean and tag the data so that the models can perform with a reliable degree of accuracy.

2. Could you talk about your approach to identifying the right partnership providers from the lot?

When evaluating partners in Big Data, three primary considerations are 1.) scalability and performance, 2.) security, and 3.) enterprise support. First, as an eCommerce retailer dealing with events like Black Friday, we must be able to respond to rapid changes in traffic. We must also preserve the user experience, which means maintaining speed and avoiding outages.

Second, I would recommend involving your security team early in the process. They will be able to evaluate and advise on a partner’s policies and procedures around their information systems, incident management, and compliance.

Third, before partnering with a third party, consider the amount of support they can provide, especially if you do not have an internal team that is highly proficient in troubleshooting with that specific tool or platform.

Finally, consider whether you even need a partnership in the first place. Is this solving a user problem, or is this just a shiny new tool that only one or two people are championing? Have you done a cost-benefit analysis on buy vs. build? Are there open-source options? What features might it offer that you can’t get from an open-source solution?

3. Could you elaborate on some exciting and impactful projects or initiatives that you’re currently overseeing?

I think most forward-thinking companies are moving toward real-time event streaming and setting up platforms that support machine learning in production. Having real-time data to better engage with your customer is a game-changer. An effective platform should offer sustainable support and be able to handle a large number of decision requests in real time. This improves the quality of the output of your machine learning models, and in turn, enhances the customer experience.

4. What are some of the points of discussion that go on in your leadership panel? What are the strategic points that you go by to steer the company forward?

One point I like to stress is that the source of truth for data is an agreed-upon source of truth, depending upon the functions of the data. On our team, data governance does not extend past our platform. This is not to say that we aren’t supportive of the organization’s initiatives, as we are all passionate about data integrity. However, a big data team can often only ensure data quality on its own pipelines and platforms for the primary consumers of the data. This is because we want to empower the business, so we take the stance of educating consumers rather than mandating how the data is consumed. Consequently, we can’t always ensure the quality of what is produced. This is why it is so important not to conflate data governance with a specific project or initiative. Effective data management is not a one-off project but an overarching organizational effort. I believe that everyone should be a good data steward and contribute to building a data-driven culture.

5. Can you draw an analogy between your personality traits, hobbies, and how they reflect on your leadership strategy?

I don’t know that many people would consider what I do in my free time as a hobby, but it’s definitely what I enjoy doing, and I believe it’s the foundation of my leadership strategy. You can often find me providing mentorship to some of our up-and-coming data scientists, conducting research, and helping them get published. In effect, my leadership strategy is very research-based with a focus on coaching others. With the big data landscape accelerating so quickly, we need to keep up with new technology and ensure that we are growing others to do so as well.

6. How do you see the Big Data industry evolving a few years from now, with regard to its potential disruptions and transformations?

One transformation I see is a shift in which real-time eventing systems come first and batch data systems come second. Traditionally, companies have valued systems that support business intelligence and reporting first, but I see more and more that companies are prioritizing systems that support machine learning in production. This is because they recognize the quality of insights that come from that strategy as well as the power of automation. I can also see this contributing to an evolution of the role of the data analyst, requiring them to learn more development and data engineering skills.

One disruption I see is in data protection regulations. Data privacy legislation is changing how companies manage data, from rights to consent to the deletion of customer data in massive data sets that have persisted for a long time. It will be interesting to see how companies navigate this and what solutions come out of it.

7. What would be the single piece of advice that you could impart to a fellow or aspiring professional in your field, looking to embark on a similar venture or professional journey along the lines of your service and area of expertise?

For an analyst looking to break into big data, expand on your statistics skills, and become proficient in a scripting language such as Python as well as a programming model like MapReduce. For those on the program management or leadership track, go work on the big data side for a while. Shadow a data engineer and build a solid foundation of technical abilities as well as a secure network. Pay attention to industry trends and learn how to navigate the business side by honing your ability to explain technical concepts to nontechnical folks. For an effective data management program, collaboration is critical.
