IU partnership results in faster Trinity RNA sequencing software
Key software used to study gene expression now runs four times faster, thanks to performance improvements put in place by a team from the Indiana University Pervasive Technology Institute (PTI), the Broad Institute of MIT and Harvard and Technische Universität Dresden.
The timesaving breakthroughs will allow bioinformaticians and biologists who study RNA sequences to analyze more data in a shorter amount of time. This will speed the understanding of biological processes in fields as diverse as ecology, evolution, biofuels and medicine.
Robert Henschel and Richard D. LeDuc, of PTI and IU's National Center for Genome Analysis Support (NCGAS), announced the findings July 17 at the XSEDE12 conference in Chicago. They presented the advance, which comes in a fast-growing area of computational biology, together with partners from the Broad Institute and the Center for Information Services and High Performance Computing (ZIH) at Technische Universität Dresden.
The software, known as Trinity, was developed by researchers at the Broad Institute and Hebrew University. It produces high-quality RNA sequence assemblies used by scientists studying gene expression. These RNA sequence assemblies allow scientists to know which genes are active within a living creature. Trinity is especially useful for studying organisms without a complete genome sequence, such as agricultural pests, ecological indicator species and human parasites.
The software has long been considered a leader in the field, but it needed some fine-tuning.
"IU research technologists strive to deliver tools and services that accelerate discoveries for scientists all over the world. By collaborating with our counterparts at Broad and ZIH, we were able to do just that with Trinity. This is just one example of how the various centers affiliated with PTI -- such as NCGAS -- improve the capabilities of scientists at home and abroad," said Craig Stewart, executive director of IU's Pervasive Technology Institute and principal investigator of the National Science Foundation grant that funds NCGAS.
"In the past, Trinity was a high-quality tool but the run time was too long," said Henschel. "Now with our performance improvements, it runs as fast as the competition -- if not faster -- and still produces superior quality sequence assemblies."
The partners first used standard high-performance computing techniques to improve the software's speed. Specifically, this involved building Trinity with an optimizing compiler for the Intel® Xeon® architecture and using optimizing compiler flags. In addition, the team properly configured the application to take full advantage of multicore, multisocket compute nodes in today's clusters.
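As a rough illustration of what configuring an application for a multicore node can involve, the Python sketch below sizes a thread count to the cores a process is actually allowed to use. It is not taken from Trinity; the function names `usable_cpu_count` and `pick_thread_count` are hypothetical and exist only for this example.

```python
import os
from typing import Optional


def usable_cpu_count() -> int:
    """Return the number of CPU cores this process may run on."""
    try:
        # Linux: honors an affinity mask set by a batch scheduler or taskset.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity: use the logical CPU count.
        return os.cpu_count() or 1


def pick_thread_count(requested: Optional[int] = None) -> int:
    """Clamp a requested thread count to the range [1, usable cores]."""
    available = usable_cpu_count()
    if requested is None:
        return available
    return max(1, min(requested, available))


if __name__ == "__main__":
    print("usable cores:", usable_cpu_count())
    print("threads to start:", pick_thread_count())
```

Asking for more threads than there are usable cores tends to add scheduling overhead rather than speed, which is why the sketch clamps the request instead of trusting it.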
Next, the team fine-tuned each part of the Trinity package to improve the overall scalability of the application. They used Vampir performance analysis tools, developed at ZIH, to gain insights into the software's performance. The optimizations included improving and parallelizing input/output, simplifying data structures for better performance and optimizing parallel regions in the application.
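The general idea behind a parallel region can be sketched as follows: split the input into independent chunks, let each worker build its own partial result, and merge the partial results at the end. The Python example below is a minimal sketch of that pattern, not the Trinity code. Counting short subsequences (k-mers) in sequence reads is an assumed workload chosen for illustration, and all function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def count_kmers(reads: Iterable[str], k: int) -> Counter:
    """Count every length-k substring in a batch of reads."""
    counts: Counter = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def chunked(items: List[str], n_chunks: int) -> List[List[str]]:
    """Split items into at most n_chunks contiguous slices of similar size."""
    n_chunks = max(1, min(n_chunks, len(items) or 1))
    size = max(1, -(-len(items) // n_chunks))  # ceiling division, at least 1
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_kmer_counts(reads: List[str], k: int, workers: int) -> Counter:
    """Count k-mers per chunk in worker threads, then merge the partial counts."""
    total: Counter = Counter()
    with ThreadPoolExecutor(max_workers=max(1, workers)) as pool:
        partials = pool.map(lambda chunk: count_kmers(chunk, k), chunked(reads, workers))
        for partial in partials:
            total.update(partial)  # Counter.update adds counts together
    return total


if __name__ == "__main__":
    example_reads = ["ACGTAC", "GTAC", "AC"]
    print(parallel_kmer_counts(example_reads, k=2, workers=2))
```

Because each worker writes only to its own counter, no locking is needed until the final merge. Threads keep the sketch short, but in CPython a CPU-bound workload like this one would need separate processes or compiled code to see a real speedup.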
Henschel is hopeful that IU's work with Trinity will continue. "We are working on establishing a continued collaboration between IU, Broad and ZIH to further optimize Trinity," said Henschel. "We hope these performance improvements are just the beginning of a longer term relationship that will continue to benefit biological research."
About XSEDE12 and XSEDE
XSEDE12 is the first conference of the Extreme Science and Engineering Discovery Environment (XSEDE), a national collaboration that provides cyberinfrastructure services and resources to support scientific discovery in fields such as medicine, engineering, earthquake science, epidemiology, genomics, astronomy and biology.
XSEDE is funded through a five-year, $121 million National Science Foundation (NSF) grant. For more, see https://www.xsede.org.
About Indiana University Pervasive Technology Institute
The Pervasive Technology Institute is IU's flagship initiative for advanced information technology research, development and delivery in support of research, scholarship and artistic performances. The National Center for Genome Analysis Support (which includes LeDuc) and the Research Technologies division (which includes Henschel) are both Service and CyberInfrastructure Centers affiliated with PTI. For more, see https://pti.iu.edu.
About the Broad Institute of MIT and Harvard
The Eli and Edythe L. Broad Institute of MIT and Harvard was launched to empower creative scientists to transform medicine. The Broad Institute seeks to describe all the molecular components of life and their connections; discover the molecular basis of major human diseases; develop effective new approaches to diagnostics and therapeutics; and disseminate discoveries, tools, methods and data openly to the entire scientific community. For more, see https://www.broadinstitute.org.
About the Center for Information Services and High Performance Computing at Technische Universität Dresden
The Center for Information Services and High Performance Computing (ZIH) at Technische Universität Dresden in Germany supports other departments and institutions in their research and education for all matters related to information technology and computer science. For more, see https://www.tu-dresden.de/zih.
Originally published July 17, 2012.