Indiana University

Last modified: Monday, April 18, 2011

IU, University of Illinois launch HathiTrust Research Center for computational access to archives

April 18, 2011

BLOOMINGTON, Ind. and URBANA, Ill. -- The world's great libraries and archives use specially designed rooms, cases and vaults to protect and organize books and records so they may continue to be studied and understood for years to come. As an ever-increasing amount of our cultural record is created and stored digitally, we face the new challenge of ensuring that our digital cultural archives are easily accessible -- both to contemporary researchers and to those working far in the future.


A new collaborative research center launched jointly by Indiana University and the University of Illinois, along with the HathiTrust Digital Repository, will help to meet this challenge by developing cutting-edge software tools and cyberinfrastructure to enable advanced computational access to the growing digital record of human knowledge.

The HathiTrust Research Center (HTRC) will enable open access for nonprofit and educational users to published works in the public domain (as well as limited access to works under copyright) stored within HathiTrust, an extensive collaborative digital library of more than 8 million volumes and 2 billion pages of archived material maintained by major research institutions and libraries worldwide.

Leveraging data storage infrastructure at Indiana University and computational resources at the University of Illinois at Urbana-Champaign, the HTRC will provision a secure computational and data environment for scholars to perform research using the HathiTrust Digital Repository. The center will break new ground in the areas of text mining and non-consumptive research, allowing scholars to fully utilize content of the HathiTrust Library while preventing intellectual property misuse within the confines of current U.S. copyright law.

"The HTRC partnership combines expertise and resources of two of the nation's foremost research universities to build a first-of-its-kind center for advanced analysis of the HathiTrust corpus," says John Wilkin, executive director of HathiTrust. "Prior to this collaboration, computational analysis over the vast HathiTrust collection has been difficult. HTRC promises to ease computational analysis of the texts and promote new algorithmic development and discovery."

Contributing partners in HTRC at Indiana University include the Pervasive Technology Institute -- Data to Insight Center (D2I); the Office of the Vice-President for Information Technology; the Office of the Vice Provost for Research; and IU Libraries. At the University of Illinois, contributing partners include the Institute for Computing in Humanities, Arts, and Social Science (I-CHASS); the Illinois Informatics Institute; and the National Center for Supercomputing Applications (NCSA). Each of the founding partners in the HTRC team brings extensive and highly regarded expertise in applied cyberinfrastructure, digital humanities, computer science, informatics, library science and virtual organizations.

"In sponsoring this important research utility, Indiana University and the University of Illinois are furthering the important collaborative research activities that have been enabled through the creation of the HathiTrust, an organization now over 50 member libraries strong," said Brenda Johnson, Ruth Lilly Dean of Libraries at Indiana University. "Providing outreach and engagement for the digitally engaged scholar will be of prime importance to the HTRC and will further enrich collection building and enhance access methodologies for the HathiTrust Digital Library."

"NCSA is excited to play a role in this project. We believe that bringing information technology and cyberinfrastructure, as well as the expertise of information technology specialists, to bear on challenges in the humanities will yield important advances," said NCSA Director Thom Dunning.

The HTRC project will be led by an executive committee. Its principal investigators are Beth Plale, D2I director and professor in the School of Informatics and Computing at Indiana University, and Scott Poole, I-CHASS director and professor in the Department of Communication at the University of Illinois. The committee also includes Robert McDonald, Indiana University associate dean of libraries, and John Unsworth, interim director of the Illinois Informatics Institute and dean of the Graduate School of Library and Information Science at the University of Illinois.

About HathiTrust

The HathiTrust was created in 2008 through a partnership among the 12-university consortium known as the Committee on Institutional Cooperation (CIC), the 11 university libraries of the University of California system and the University of Virginia. Since that time HathiTrust has grown to encompass the research libraries of more than 50 institutions. HathiTrust was built to give libraries a means to archive and provide access to their digital content, whether scanned volumes, special collections or born-digital materials. Preserving materials for the long term has long been a mission and driving force of leading research libraries. Their collections, accumulated over centuries, represent a treasury of cultural heritage and an investment in the broad public good of promoting scholarship and advancing knowledge. Representing these resources in digital form expands opportunities for innovative use in research, teaching and learning, but it must be done with careful attention to effective solutions for the curation and long-term preservation of digital assets. For more information about the HathiTrust visit

About the Data to Insight Center at Indiana University

The Data to Insight Center (D2I) undertakes research to harness the vast stores of digital data being produced by modern computational resources, allowing scientists and companies to make better use of these data and find the important meaning that lies within them. D2I creates tools and visualizations for working with very large data sets, develops methods to ensure data provenance (quality and authenticity), and builds methods for listing and discovering data sets. D2I is part of the Indiana University Pervasive Technology Institute (PTI). Funded by a $15 million grant from the Lilly Endowment Inc., PTI is dedicated to the development and delivery of innovative information technology and policy to advance research, education, industry, and society. For more information visit

About the Institute for Computing in Humanities, Arts and Social Science (I-CHASS) at the University of Illinois

The Institute for Computing in Humanities, Arts, and Social Science (I-CHASS) at the University of Illinois at Urbana-Champaign charts new ground in high-performance computing and the humanities, arts, and social sciences by creating both learning environments and spaces for digital discovery. With an emphasis on identifying, creating, and adapting computational tools that accelerate research and education, I-CHASS engages visionary scholars from across the globe to demonstrate approaches that connect next-generation interdisciplinary research with high-performance computing. For more information, see