Big Data Collection Opportunities and Problems in Higher Education
The term “Big Data” has become ubiquitous in higher education, especially in discussions of using data to help with student success. But what exactly is big data? After all, we have had large amounts of data for a very long time. An online search turns up many different definitions, including:
According to Wikipedia, “big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them.”
Microsoft states, “Big data is the term increasingly used to describe the process of applying serious computing power— the latest in machine learning and artificial intelligence—to seriously massive and often highly complex sets of information.”
The National Institutes of Health suggests, “Big Data is more than just very large data or a large number of data sources. Big Data refers to the complexity, challenges, and new opportunities presented by the combined analysis of data.”
One thing these definitions have in common is that the data sets they describe are large, complex, growing exponentially, and unwieldy. The collection of unstructured data has tremendously increased the amount of data collected. The world creates 100 terabytes of data every day, and it is estimated that 35 zettabytes of data will be created by 2020. A zettabyte is equal to 1 trillion gigabytes, or 10^21 bytes.
In addition to traditional databases, data is collected from many other sources, such as digital pictures, videos, social media posts, cell phones, web pages, emails, and sensors.
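The unit conversion above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes decimal (SI) units, where a gigabyte is 10^9 bytes and a zettabyte is 10^21 bytes; the constant names are illustrative only.

```python
# Assumption: decimal (SI) units, not binary units such as gibibytes.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_ZETTABYTE = 10**21

# How many gigabytes fit in one zettabyte.
gigabytes_per_zettabyte = BYTES_PER_ZETTABYTE // BYTES_PER_GIGABYTE

# The 35-zettabyte estimate quoted in the text, expressed in gigabytes.
projected_gigabytes = 35 * gigabytes_per_zettabyte

print(f"{gigabytes_per_zettabyte:,} GB per ZB")       # 1,000,000,000,000 (one trillion)
print(f"{projected_gigabytes:,} GB in 35 ZB")         # 35,000,000,000,000
```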
Amazon, Google, and Starbucks are just a few examples of companies that collect large amounts of data on our everyday activity. They use it to increase sales while making it easy for us to spend our money with them. The potential of what higher education could do with large amounts of student activity data offers a compelling reason to start collecting more, even without knowing how it could be used. Using data to upsell to students and to determine ways to enhance their success is at our fingertips. One example is mapping a student’s pattern of going to study hall, tutoring, classes, or even the cafeteria. If the student’s pattern changes, it could be a sign that something is wrong. A big question to ask is whether collecting this data crosses a line of privacy. Will institutions waver on the edge of paternalism?
According to Scientific American, people have what is called patternicity: we see patterns where they do not really exist. We have heard of people seeing images of Jesus in their toast or in a cloud, and some may see a pattern in stock market numbers. This is because of the priming effect, which helps our brain and senses interpret stimuli based on expected models. Seeing patterns can be very helpful in solving problems; unfortunately, we do not have a detector in our brain that notifies us when a pattern does not really exist. In higher education, we must be careful that we are not trying to find patterns that do not exist.
Education by its nature is all about ethics. We expect students to be honest, do their own homework, and above all not plagiarize. For those in academia who do research, there are tenets that pertain to ethics, including informed consent, respecting confidentiality, and protecting individuals from harm. With this in mind, we must make sure that institutions are not collecting data just because they can and because it shows a pattern. We must analyze carefully whether the interventions we create based on patterns found in our data sets are helping students and not just conforming to the expectations of society and the institution.
Higher education institutions are no different from any other business that needs to survive. Behind student success goals, institutions conform to a system that values students getting good grades and making continued progress toward finishing a degree so that the institutions can build revenue and stay in business. Without continued growth in enrollment, and without students persisting to graduation, institutions of higher learning will struggle with funding. At the end of the day, higher education institutions need to get their product to market, which is graduating students. Understanding why data needs to be collected, what can be determined from it, and how to protect it must be considered before we begin the process of mass collection and analysis.
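The risk of finding patterns that do not exist can be illustrated with purely random numbers. The sketch below is an illustration under stated assumptions, not an analysis of real student data: it invents 200 random "activity measures" for 50 hypothetical students, compares each against an equally random "outcome", and reports the strongest correlation it happens to find. The function names, the sample sizes, and the seed are all made up for the example.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_chance_correlation(n_students=50, n_measures=200, seed=0):
    """Return the strongest correlation between a random outcome and
    many unrelated random measures (all values are pure noise)."""
    rng = random.Random(seed)
    # Hypothetical outcome, e.g. a grade; here it is just noise.
    outcome = [rng.gauss(0, 1) for _ in range(n_students)]
    best = 0.0
    for _ in range(n_measures):
        # Hypothetical activity measure, also just noise.
        measure = [rng.gauss(0, 1) for _ in range(n_students)]
        r = pearson(measure, outcome)
        if abs(r) > abs(best):
            best = r
    return best


if __name__ == "__main__":
    r = strongest_chance_correlation()
    print(f"Strongest correlation among random measures: {r:+.2f}")
```

Because every value is noise, any correlation reported here is a coincidence. The more measures an institution collects and compares, the more likely it is that some of them will appear to predict student outcomes purely by chance.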