
Leveraging Biomedical Big Data: A Hybrid Solution


Bryon Campbell, Ph.D., CIO, Van Andel Institute
Big data can be a lifesaver…literally. In fact, the efficient handling and analysis of large, complex datasets in biomedical research plays an integral role in developing new ways to prevent, diagnose and treat diseases.
Scientists engage in a wide range of data-intensive research projects using high-resolution imaging, genomic sequencing instruments and molecular modeling simulations to detail processes such as gene expression and protein behavior.
Because the research conducted in just one laboratory can produce billions of data points, and research techniques evolve at a rapid pace, it is increasingly important for research facilities to architect solutions that can scale as requirements change.
At Van Andel Institute (VAI)—a nonprofit biomedical research and science education organization in Grand Rapids, Michigan—we have managed these big data challenges by embracing cloud computing and implementing a hybrid OpenStack high-performance computing (HPC) system. This new infrastructure significantly improves our IT flexibility while providing users with cutting-edge computational resources. The solution saved us roughly two years of development time.
Anticipating Technological Change
The Institute is home to 28 principal investigators and their laboratories, which study epigenetics, cancer and neurodegenerative diseases such as Parkinson’s, and which are dedicated to translating their findings into effective therapies.
As VAI has increased its collaboration with other research institutions and joined large-scale bioinformatics projects around the world, scientific investigations have become much more complex. In recent years, this has led to the formation of research groups that must work on terabyte- and petabyte-scale data projects. In addition to having the storage and CPU capacity to process and analyze scientific big data, we also wanted a creative approach to future-proofing our infrastructure against the inevitable need for more computational resources.
In 2014, we realized that the science at the Institute was driving the need for exponentially higher-volume, higher-speed computational resources. We knew that cloud-based, high-performance computing would soon be the new standard. And although the big players in cloud computing had lowered their prices in recent years, we needed an in-house computing solution that gave our scientists direct access to higher speeds.
The continual onboarding of big data-dependent scientists with very diverse system requirements and aggressive timelines meant that we had to explore alternative ways to deliver computing resources to our users.
VAI’s relatively small size, and the fact that there was no legacy equipment to work around, made us agile enough to consider a hybrid system with the flexibility to work locally and virtually. In early 2015, our team began implementing an HPC hybrid system built on three key components: Bright Computing Cluster Manager with OpenStack software; 43 compute nodes, representing 1,100 CPU cores, provided by Silicon Mechanics; and parallel (GPFS) storage supplied by Data Direct Networks.
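As a rough sketch of how a private OpenStack cloud like this is consumed in practice (the article does not describe VAI's actual configuration), a researcher could self-provision an analysis instance with OpenStack's Python SDK. The cloud, image, flavor and network names below are hypothetical:

```python
# Minimal sketch: booting an analysis VM on a private OpenStack
# cloud with the openstacksdk library. All resource names are
# illustrative, not VAI's actual configuration.
import openstack

# Credentials and endpoint come from a clouds.yaml entry
# (here assumed to be named "vai-private").
conn = openstack.connect(cloud="vai-private")

# Look up a pre-built bioinformatics image, an instance size,
# and the research network (all hypothetical names).
image = conn.compute.find_image("centos7-bioinformatics")
flavor = conn.compute.find_flavor("bioinfo.xlarge")  # e.g., 16 vCPUs, 64 GB RAM
network = conn.network.find_network("research-net")

# Boot the instance and wait until it reaches ACTIVE state.
server = conn.compute.create_server(
    name="rnaseq-worker-01",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)
print(f"{server.name} is {server.status}")
```

The point of the hybrid design is that this kind of on-demand, cloud-style provisioning coexists with traditional batch cluster access on the same local hardware.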
The new system needed to be implemented within a few months—a timeline that would be unreasonable for most large universities or big businesses looking to accomplish the same type of transition.
We are talking about the difference between turning a cruise ship and turning a speedboat. Although larger organizations would benefit from new approaches, they are often slowed down by established processes and existing equipment. The Institute is very nimble—our structure allows us to transition quickly without major engineering changes.
A Smooth Implementation
The hybrid HPC cluster and private cloud went live in September 2015, with very few changes from the initial plan to the final implementation. Near-flawless execution was important because even a small issue could delay important research.
Because VAI scientists expect future research to be even more data intensive, the system was designed and built with the flexibility to easily bolt on additional resources.
Cloud-based users and cluster-based users at the Institute are now operating simultaneously in a hardware environment that allows for fast access to very large data sets. Administrators also have clear visibility into the Institute’s local cloud and can easily fine-tune the user mix as needed.
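That visibility can come from OpenStack's own APIs. As a hedged illustration (the article does not describe VAI's tooling), an administrator might survey per-hypervisor load with a few lines of openstacksdk before deciding how to rebalance capacity between cloud tenants and the batch cluster:

```python
# Minimal sketch: reporting per-hypervisor CPU and memory usage on
# the private cloud. Assumes admin credentials in a clouds.yaml
# entry named "vai-private" (hypothetical).
import openstack

conn = openstack.connect(cloud="vai-private")

# Walk the hypervisors and print utilization, the kind of data
# that informs fine-tuning the cloud/cluster user mix.
for hv in conn.compute.hypervisors(details=True):
    print(
        f"{hv.name}: {hv.vcpus_used}/{hv.vcpus} vCPUs, "
        f"{hv.memory_used}/{hv.memory_size} MB RAM"
    )
```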
We made strategic decisions when executing this HPC hybrid system in order to keep computational accessibility at the forefront. Although others use public cloud providers, a local solution was the best choice for us because of our high-volume instruments and exabyte-scale inter-node traffic. With our current hybrid approach, we enjoy the benefits of local infrastructure while still having the flexibility and ease of use that cloud computing provides.
Big Data Making a Big Impact
This HPC solution is accelerating research at the Institute. VAI scientists are able to analyze data in new ways and expedite the process of transforming hypotheses into advances in medicine that can ultimately save lives.
The system also allows research teams to work more deliberately, giving them the time and ability to cross-validate data. Thoroughness and precision in data analysis, in turn, facilitate more accurate laboratory testing.
Because we are not paying for computing power by the hour or by frequency of access, our scientists have the freedom to explore biological systems more thoroughly and to investigate hypotheses more efficiently. It is thrilling to watch highly efficient computing accelerate scientists’ ability to pinpoint the errors in cellular processes that lead to disease.
This hybrid HPC solution significantly shortens data processing time and enables the development of new ways to manipulate and visualize big data. As data analysis techniques evolve, VAI aims to channel diverse, multimodal biological information into promising research directions and to give our scientists extraordinary opportunities to have a lasting impact on the future of human health.
