Indiana University
Office of Communications and Marketing

Background information about IU's new supercomputer

Oct. 17, 2001

NOTE: For more technical information about the new IU supercomputer, see http://www.indiana.edu/~uits/cpo/ibmsp/about.html.

About the IU Teraflop Supercomputer

The Indiana University Teraflop SP supercomputer can perform a trillion mathematical operations per second. It will be tightly connected to IU's massive data storage system, which is capable of holding hundreds of trillions of bytes of data. But this processing capability is meaningful only when harnessed to do important work. The Teraflop SP will enhance existing research programs and make possible new research in many disciplines, including the medical sciences.

As the human genome becomes increasingly well understood, one of the critical issues will be identifying single nucleotide polymorphisms (SNPs) -- places in the genome where different people display different genetic information. Here scientists expect to find some of the keys to understanding human genetic diseases. The IU Teraflop SP will be used to search for SNPs and to analyze clinical data en route to unraveling the relationships between genes and cancer.

The Teraflop SP will enable types of research that are currently impractical or impossible. It will be the first system at IU capable of holding in memory all of the information required to simulate clusters of hundreds of thousands of stars -- a capability that will enable new understanding of the formation and evolution of our own Milky Way galaxy. Geologists will use the SP in their effort to understand the finely detailed structure of the earth's crust. Analyzing the evolutionary relationships of hundreds of organisms -- work that might take five years on a personal computer -- can be done in a matter of weeks on the Teraflop SP.

In all instances, the SP does more than speed the pace of research: it will open new doors to scientific discovery in dozens of fields. IU researchers and UITS staff will demonstrate a few of the research applications of IU's Teraflop SP, described below.

N-body gravitational simulation: Star cluster modeling

Astronomers would like to understand the way star clusters form and develop. The equations of motion have been known since Newton's time, but they cannot be solved analytically, so we turn to simulations. A globular cluster with 100,000 stars has 10 billion gravitational interactions to compute at each time step. The Teraflop SP will make calculations like this possible, allowing researchers to model the complex dynamics within evolving star clusters. Such simulations provide a basis for understanding the formation and evolution of the extraordinary X-ray emitting binary systems that are now being found in large numbers in the cores of globular star clusters by the Earth-orbiting Chandra X-ray telescope. IU Astronomy Department researchers are working with colleagues at Harvard University to study the properties of these X-ray binaries, which contain highly collapsed white dwarfs and neutron stars.
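The interaction count quoted above can be checked with a short calculation. The Python sketch below is illustrative only and is not the code run on the Teraflop SP: the function names, the gravitational constant of 1.0, and the softening term are assumptions made for this example. It counts the star-on-star interactions for a cluster of a given size and shows a plain direct-summation step, in which every star feels the pull of every other star.

```python
import math


def pair_counts(n_stars):
    """Return (ordered, unordered) counts of star-star interactions."""
    ordered = n_stars * (n_stars - 1)   # force of j on i, for every i != j
    unordered = ordered // 2            # each pair counted once
    return ordered, unordered


def accelerations(positions, masses, g=1.0, softening=1e-3):
    """Direct-summation gravitational accelerations (illustrative units).

    positions: list of [x, y, z]; masses: list of floats.
    The softening length (an assumption of this sketch) avoids division
    by zero when two stars come very close together.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(d * d for d in dx) + softening ** 2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                # Newton's third law: equal and opposite pulls
                acc[i][k] += g * masses[j] * dx[k] * inv_r3
                acc[j][k] -= g * masses[i] * dx[k] * inv_r3
    return acc


ordered, unordered = pair_counts(100_000)
print(ordered, unordered)
```

For 100,000 stars the ordered count is just under 10 billion, which matches the figure above; a program that evaluates each pair once and applies the result to both stars does about half that many force evaluations per time step. Either way the work grows with the square of the number of stars, which is why clusters of this size call for a supercomputer.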

fastDNAml: Inferring evolutionary relationships

Many important questions in science and medicine come down to one: What are the evolutionary relationships among a group of organisms? It is now possible to infer these relationships from DNA sequences. However, this process takes tremendous amounts of computation. For example, analyzing the evolutionary relationships of 100 animals on one microcomputer might take as long as five years. Indiana University is collaborating with researchers at other institutions to create a parallel (supercomputer) version of fastDNAml, a popular package for inferring evolutionary relationships. fastDNAml has been used at Indiana University to better understand the evolutionary origins of the Microsporidia, an economically and medically important group of parasites. A clearer picture of the evolutionary origins of these disease-causing organisms will point the way to better methods for treating the diseases that Microsporidia cause.
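One way to see why this computation is so demanding is to count the candidate family trees. The Python sketch below is an illustration and is not part of fastDNAml; the function name is made up for this example. It uses the standard result that n organisms can be joined in (2n-5)!! distinct unrooted, fully branching trees, where "!!" means the product of every odd number up to 2n-5.

```python
def unrooted_tree_count(n_taxa):
    """Number of distinct unrooted bifurcating trees for n_taxa organisms.

    Uses the standard formula (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5).
    """
    if n_taxa < 3:
        raise ValueError("need at least three organisms")
    count = 1
    for odd in range(3, 2 * n_taxa - 4, 2):   # odd numbers 3 .. 2n-5
        count *= odd
    return count


for n in (4, 5, 10, 100):
    trees = unrooted_tree_count(n)
    print(n, "organisms:", len(str(trees)), "digits in the tree count")
```

Ten organisms already allow more than two million trees, and 100 organisms allow a number of trees more than 180 digits long. No computer can examine them all, so programs of this kind search a small, promising portion of the possibilities, and even that search is a heavy workload.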

The IU massive data storage system: Making terabytes accessible from desktops

Delivered using the High Performance Storage System (HPSS) software, the massive data storage system (MDSS) is intended for users with projects that need large-scale, near-line storage. Since HPSS works best with large files, optimal applications will store data in files that are typically larger than 50MB, including large collections of high-resolution aerial maps, digitized artwork, sound and animation files, astronomical images and the like. IU is the only large academic site in the world that provides a diverse user base with ubiquitous (and easy) access to nearly 200 terabytes (1 terabyte = 1,000 gigabytes = 1,000,000 megabytes) of data storage capacity in the central, tape-based, massive data storage system. We present two demonstrations of this system: users accessing the data stored in their MDSS area via the Web, and a video clip, stored in the MDSS as a large file, displayed in real time on a Windows desktop machine using the convenient Distributed File System (DFS) front-end.

XMView/CMView: Scalable molecular visualization

The Indiana University Molecular Structure Center's (IUMSC) online database of molecular structures is a valuable resource for chemistry researchers and students from all over the world. Thanks to a collaboration with the UITS Advanced Visualization Laboratory, chemists now have a tool for studying these molecular structures -- XMView/CMView, a scalable visualization system for molecular chemistry. This application offers a pair of programs, each with similar functionality and a common file format, providing researchers the convenience and flexibility of working at their desktops, as well as the power, visual complexity, and ease of interaction offered by working in the CAVE. Chemists can download data from the IUMSC's Web-accessible database, grow crystal structures interactively, perform precise measurements on those structures, and then move data files to an immersive, virtual environment for more detailed examination.

IBM digital displays and advanced imaging

Indiana University has acquired one of IBM's newest advanced digital displays, the IBM T220, a 22.2-inch-diagonal display with 9.2 million pixels, yielding a resolution of 204 DPI (dots per inch). This resolution is so fine that some detail is visible only with a magnifying glass. Two types of images are demonstrated: satellite images and medical images. Some satellite image data are collected in such a fashion that they require considerable computing power to convert raw data to an actual image. The display shows several images of the Bloomington area, collected via satellite and converted into images on the IBM SP supercomputer. Also on display are biomedical images that demonstrate the utility these advanced displays offer to biomedical researchers and clinicians.
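The pixel count and resolution quoted above follow from the size of the screen and its pixel grid. The short Python check below assumes a grid of 3,840 by 2,400 pixels for the T220; that figure does not appear in this document and is used here as an assumption, as is the function name.

```python
import math


def pixels_per_inch(width_px, height_px, diagonal_in):
    """Pixel density along the diagonal, in pixels (dots) per inch."""
    return math.hypot(width_px, height_px) / diagonal_in


# Assumed pixel grid for the IBM T220 (not stated in this document)
width_px, height_px = 3840, 2400
diagonal_in = 22.2

total_pixels = width_px * height_px
density = pixels_per_inch(width_px, height_px, diagonal_in)
print(total_pixels, round(density))
```

Under that assumption the screen holds 9,216,000 pixels, or about 9.2 million, and the density rounds to 204 dots per inch, in agreement with the figures given above.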

BioSifter: An intelligent biological information management system

Advances in biomedical research have led to tremendous growth in the amounts of data, in a variety of formats, that biomedical researchers must store and manipulate. This has created an even more critical need for innovative information management and knowledge discovery tools that can sift through these volumes of data. Intelligent software systems that can seamlessly integrate information resources and data analysis tools will enable biomedical researchers to integrate existing information in the various subtasks of their research activities. In this research we present a general model for an information management system that is adaptable and scalable, followed by a detailed design and implementation of one component of the model. The prototype, called BioSifter, was applied to problems in the area of bioinformatics. The results show BioSifter as a powerful tool with which biological researchers can automatically retrieve relevant text documents from biological literature based on their interest profile.

(Craig Stewart, IU, 812-855-4240, stewart@indiana.edu or Theo Chisholm, IBM, 914-766-1180, theoc@us.ibm.com)

