Genomics research is one of the largest generators of Big Data in science, with the potential to equal, if not surpass, the data output of the particle physics community. Like physicists, university-based life-science researchers must collaborate with counterparts and access data repositories across the nation and around the globe.
The National Center for Biotechnology Information (NCBI), part of the National Library of Medicine (NLM) in Maryland, hosts almost 25 petabytes of valuable genomics data and makes it available to a global community of scientists. Among them is Dr. Alex Feltus, a researcher at Clemson University, who is collaborating on diverse projects ranging from crop development to mapping molecular pathways in pathogens to understanding cancer. Dr. Feltus and NCBI are leveraging Internet2 infrastructure, including new Advanced Layer 2 Service (AL2S) high-speed connections and perfSONAR, to substantially accelerate genomic big data transfers and transform researcher collaboration. Deploying this new technology in a research context requires specific expertise, and Dr. Feltus’ collaboration with facilitators and network engineers has been essential to producing meaningful and significant outcomes. Leveraging the Internet2 community network and human connections, along with key NSF funding, the collaborative efforts of Clemson and NLM have fundamentally changed Feltus’ workflow, allowing him to achieve scientific results at a larger scale and with massive gains in data mobility across networks.
"The significance of the speed up (which is looking more like 75-100X by the way) is that I can: A) SCALE UP EXPERIMENTS by using more input data since I can get the data quickly; and B) MINIMIZE LOCAL STORAGE of huge files because they enter workflows and then get deleted. I can just download them again if I screwed up my experiment." -Alex Feltus, Clemson University Faculty