Workshop Introduction

A recent McKinsey & Co. study claims that the healthcare industry in the U.S. alone could save $450 billion a year with the help of advanced analytics, yet healthcare organizations continue to struggle with managing and leveraging the vast stores of data they are building up. By 2011, U.S. healthcare organizations had generated 150 exabytes (150 billion gigabytes) of data. A leading provider such as Kaiser Permanente alone might have as much as 44 petabytes of patient data just from its electronic health record (EHR) system, or 4,400 times the amount of information held at the Library of Congress. Add to this the insurance sector, independent laboratories, and individual health records, and the total becomes ever more astounding in both the volume and the variety of data sources.


A common theme is emerging in the healthcare industry: big data offers an unprecedented opportunity for aggregation and integration, leading to more cost-effective and improved patient care. This workshop aims to bring together big data practitioners, researchers, students, clinicians, health IT experts, and data scientists to share ideas on how to improve the state of our healthcare systems by delivering on the promise of big data infrastructure investments.


We observe the same theme of data-driven science in biological and biomedical research. For example, a single next-generation sequencing experiment may easily generate terabytes of raw data, and biological and biomedical imaging likewise generate large volumes of data. How to store, archive, index, manage, learn from, mine, and visualize those data is clearly a challenge to the research community.