High-Performance Data Management, Access, and Storage Techniques for Tera-scale Scientific Applications



[Figure: Astro3d visualization output]


Objectives:

To develop a scalable, high-performance data management system that supports data management, query capabilities, and high-performance access to large datasets stored in hierarchical storage systems (HSS). This system will provide the flexibility of databases for indexing, searching, managing objects, and creating and keeping histories and trails of data accesses, while providing high-performance access methods and optimizations (pooled striping, pre-fetching, caching, collective I/O) for accessing the large-scale data objects found in scientific computations.

Problem Description:

Managing, storing, efficiently accessing, and analyzing the hundreds of gigabytes to hundreds of terabytes of data likely to be generated and/or used in the various phases of large-scale scientific experiments and simulations, such as those in the ASCI application domains, are extremely challenging tasks. Current data management and analysis techniques do not measure up to the challenges posed by such large-scale requirements in terms of performance, scalability, ease of use, and interfaces. Tera-scale computing requires new models and approaches for storing, retrieving, managing, sharing, visualizing, organizing, and analyzing data at such a massive scale. Moreover, since problem solving in scientific applications requires collaboration and the use of distributed resources, the data management and storage problem is further exacerbated.

In order to understand multi-TeraFLOPS-class simulations, users need to be able to plan complex, interrelated sets of supercomputing runs; track and catalog the data and meta data associated with each run; perform statistical and comparative analyses covering multiple runs; and explore multivariate time-series data. All of this work has to be done before users implement their applications. In particular, to perform I/O operations efficiently, users must understand the underlying storage architectures and the file layouts in the storage systems. Such requirements present challenging I/O-intensive problems, and an application programmer may be overwhelmed if required to solve them.

Methodology:

Our approach combines the advantages of file systems and databases while avoiding their respective disadvantages. It provides a user-friendly programming environment that allows easy application development, code reuse, and portability; at the same time, it extracts high I/O performance from the underlying parallel I/O architecture by employing advanced I/O optimization techniques such as data sieving and collective I/O (a sketch of the data-sieving idea follows Figure 1). It achieves these goals by using an active meta-data management system (MDMS) that interacts with the parallel application in question as well as with the underlying hierarchical storage environment. The proposed programming environment has three key components:
  1. user applications;
  2. meta-data management system; and
  3. hierarchical storage system.
These three components can reside at the same site or be fully distributed across distant sites. For example, as part of our experiments we run a parallel volume rendering application on the SP-2 at Argonne National Laboratory that interacts with the MDMS located at Northwestern University and accesses its data files, stored on the HPSS (High Performance Storage System) installed at the San Diego Supercomputer Center (SDSC), currently over TCP/IP. The experimental configuration is depicted in Figure 1.

Figure 1. Three-tiered architecture. The three sites can be geographically distributed and connected to one another remotely through TCP/IP.
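To make the data-sieving optimization mentioned above concrete, the following is a minimal, single-process sketch in C. It is illustrative only: the function name and interface are ours, not the project's API. Instead of issuing one seek-and-read per small non-contiguous request, it performs one large contiguous read spanning all of the requests and copies out the wanted pieces; this is the strategy MPI-IO implementations such as ROMIO apply internally.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Read 'count' small, non-contiguous extents from 'fp' with one large
     * contiguous read spanning them (data sieving), instead of one
     * seek+read per extent.  offsets[] are byte offsets and lens[] byte
     * lengths, assumed sorted by offset; out[i] must have room for
     * lens[i] bytes. */
    int sieve_read(FILE *fp, const long *offsets, const size_t *lens,
                   int count, char **out)
    {
        long   first = offsets[0];
        long   last  = offsets[count - 1] + (long)lens[count - 1];
        size_t span  = (size_t)(last - first);

        char *big = malloc(span);
        if (big == NULL) return -1;

        /* One large read covering every requested extent plus the holes
         * between them. */
        if (fseek(fp, first, SEEK_SET) != 0 ||
            fread(big, 1, span, fp) != span) {
            free(big);
            return -1;
        }

        /* Copy just the wanted pieces out of the sieving buffer. */
        for (int i = 0; i < count; i++)
            memcpy(out[i], big + (offsets[i] - first), lens[i]);

        free(big);
        return 0;
    }

The trade-off is extra data movement (the holes are read too) in exchange for far fewer I/O calls, which pays off whenever the requests are small and closely spaced.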

Normally, application programmers must deal with the storage systems directly, shown as the two ovals, User Applications and Storage System, in Figure 1. Programmers are required to understand the interfaces of the storage systems in which the files reside and the associated meta data that describes the internal structure of those files: file locations, names of the datasets in each file, data structures, types, storage patterns, and so on. This information is most often recorded in separate plain-text files, e.g. a README, written by the authors who created the data files; sometimes programmers even have to request additional information from those authors. Traditionally, this work can only be done by users before they program their applications.
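As an illustration of the kind of per-dataset meta data involved, the following C structure sketches one plausible record. The field names and sizes are hypothetical; the project's actual MDMS schema is not described on this page.

    /* Illustrative only: a hypothetical per-dataset meta-data record,
     * not the project's actual MDMS schema. */
    typedef struct {
        char file_location[256];   /* e.g. an HPSS path or URL            */
        char dataset_name[64];     /* name of the dataset inside the file */
        char elem_type[16];        /* "float32", "float64", "int32", ...  */
        int  ndims;                /* number of array dimensions          */
        int  dims[8];              /* global array sizes per dimension    */
        char storage_pattern[32];  /* e.g. "row-major", "block-cyclic"    */
        char created_by[64];       /* provenance: who/what produced it    */
        long run_id;               /* links the dataset to a specific run */
    } DatasetMetadata;

Keeping such records in an active MDMS, rather than in ad hoc README files, is what lets both the programmer and the I/O library query them programmatically.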

In this project, we add the third oval in the figure, the meta-data management system, which acts as an active middleware between users' applications and the storage systems. The MDMS provides users with a uniform interface for accessing resources and information in a heterogeneous computing environment. Through this interface, users can query the meta data associated with the desired datasets or files. The retrieved meta data can be used not only to make programming decisions at development time but also to decide conditional branches at run time. In addition, the interface provides high-performance I/O subroutines that internally perform aggressive I/O operations in the form of parallel collective and non-collective I/O and data pre-fetching across the hierarchical storage devices. The design and implementation of this high-performance meta-data management system focus on three aspects.
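The project's own I/O subroutines are not listed on this page, but the collective I/O they build on can be sketched with standard MPI-IO. In the example below, each process reads its block of rows of a shared 2-D array of doubles with a single collective call, which lets the MPI-IO layer merge the processes' requests into a few large accesses. The file name and array sizes are illustrative, and nprocs is assumed to divide the row count evenly.

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Global 1024x1024 array of doubles, block-partitioned by rows;
         * assumes nprocs divides 1024 evenly. */
        int gsizes[2] = {1024, 1024};
        int lsizes[2] = {1024 / nprocs, 1024};
        int starts[2] = {rank * lsizes[0], 0};

        /* Describe this process's sub-block of the file as a datatype. */
        MPI_Datatype filetype;
        MPI_Type_create_subarray(2, gsizes, lsizes, starts,
                                 MPI_ORDER_C, MPI_DOUBLE, &filetype);
        MPI_Type_commit(&filetype);

        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "dataset.bin",
                      MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
        MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype,
                          "native", MPI_INFO_NULL);

        double *buf = malloc(lsizes[0] * lsizes[1] * sizeof(double));
        /* Collective read: all processes cooperate, letting the MPI-IO
         * layer combine their requests into large contiguous accesses. */
        MPI_File_read_all(fh, buf, lsizes[0] * lsizes[1],
                          MPI_DOUBLE, MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        MPI_Type_free(&filetype);
        free(buf);
        MPI_Finalize();
        return 0;
    }

In the proposed environment, the MDMS would supply the file location, array sizes, and storage pattern that this sketch hard-codes, so the application never consults a README to construct the file view.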


Publications:

  1. X. Shen and A. Choudhary. "A Distributed Multi-Storage I/O for Data Intensive Scientific Computing", in Journal of Parallel Computing, Volume 29, Issues 11-12, pp. 1623-1643, November-December 2003.
  2. X. Shen, A. Choudhary, C. Matarazzo and P. Sinha. "A Multi-Storage Resource Architecture and I/O Performance Prediction for Scientific Computing", in Journal of Cluster Computing, Volume 6, Issue 3, pp. 189-200, July 2003.
  3. X. Shen, W. Liao, A. Choudhary, G. Memik, and M. Kandemir. "A High Performance Application Data Environment for Large-Scale Scientific Computations", in IEEE Transactions on Parallel and Distributed Systems, Volume 14, Number 12, pp. 1262-1274, December 2003.
  4. Y. Liu, W. Liao, and A. Choudhary. "Design and Evaluation of a Parallel HOP Clustering Algorithm for Cosmological Simulation", in the Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), Nice, France, April 2003.
  5. X. Shen and A. Choudhary. "MS-I/O: A Distributed Multi-Storage I/O System", in the IEEE International Symposium on Cluster Computing and the Grid (CCGrid), Berlin, Germany, May 2002.
  6. G. Memik, M. Kandemir, and A. Choudhary. "Design and Evaluation of Smart-Disk Cluster for DSS Commercial Workloads", in Journal of Parallel and Distributed Computing, Special Issue on Cluster and Network-Based Computing, Volume 61, Issue 11, pp. 1633-1664, 2001.
  7. X. Shen and A. Choudhary. "DPFS: A Distributed Parallel File System", in the IEEE 30th International Conference on Parallel Processing (ICPP), Valencia, Spain, September 2001.
  8. X. Shen, W. Liao, and A. Choudhary. "An Integrated Graphical User Interface For High Performance Distributed Computing", in Proc. of the International Database Engineering and Applications Symposium (IDEAS), Grenoble, France, July 2001.
  9. J. No, R. Thakur, D. Kaushik, L. Freitag, and A. Choudhary. "A Scientific Data Management System for Irregular Applications", in Proc. of the Eighth International Workshop on Solving Irregular Problems in Parallel (Irregular 2001), April 2001.
  10. X. Shen, W. Liao, and A. Choudhary. "Remote I/O Optimization and Evaluation for Tertiary Storage Systems through Storage Resource Broker", in IASTED Applied Informatics, Innsbruck, Austria, February 2001.
  11. W. Liao, X. Shen, and A. Choudhary. "Meta-Data Management System for High-Performance Large-Scale Scientific Data Access", in the 7th International Conference on High Performance Computing, Bangalore, India, December 17-20, 2000.
  12. J. No, R. Thakur, and A. Choudhary. "Integrating Parallel File I/O and Database Support for High-Performance Scientific Data Management", in Proc. of SC2000: High Performance Networking and Computing, November 2000.
  13. X. Shen and A. Choudhary. "A Distributed Multi-Storage Resource Architecture and I/O Performance Prediction for Scientific Computing", in the International Symposium on High Performance Distributed Computing, Pittsburgh, Pennsylvania, August 1-4, 2000.
  14. X. Shen, W. Liao, A. Choudhary, G. Memik, M. Kandemir, S. More, G. Thiruvathukal, and A. Singh. "A Novel Application Development Environment for Large-Scale Scientific Computations", in the International Conference on Supercomputing, Santa Fe, New Mexico, May 2000.
  15. X. Shen, G. Thiruvathukal, W. Liao, and A. Choudhary. "A Java Graphical User Interface for Large-Scale Scientific Computations in Heterogeneous Systems", in HPC-ASIA, May 2000.
  16. A. Choudhary, M. Kandemir, J. No, G. Memik, X. Shen, W. Liao, H. Nagesh, S. More, V. Taylor, R. Thakur, and R. Stevens. "Data Management for Large-Scale Scientific Computations in High Performance Distributed Systems", in Cluster Computing: the Journal of Networks, Software Tools and Applications, Volume 3, Issue 1, pp. 45-60, 2000.
  17. G. Memik, M. Kandemir, and A. Choudhary. "APRIL: A Run-Time Library for Tape Resident Data", in the 8th NASA Goddard Space Flight Center Conference on Mass Storage Systems and Technologies and 17th IEEE Symposium on Mass Storage Systems, March 2000.
  18. A. Choudhary, M. Kandemir, H. Nagesh, J. No, X. Shen, V. Taylor, S. More, and R. Thakur. "Data Management for Large-Scale Scientific Computations in High Performance Distributed Systems", in the High-Performance Distributed Computing Conference '99, San Diego, CA, August 1999.
  19. A. Choudhary and M. Kandemir. "System-Level Metadata for High-Performance Data Management", in the IEEE Metadata Conference, April 1999.

Project Team Members:

Prof. Alok Choudhary (P.I.)
Prof. Valerie Taylor (Co-P.I.)
Prof. George Thiruvathukal
Prof. Wei-keng Liao
Graduate Student Xiaohui Shen
Graduate Student Gokhan Memik
Undergraduate Student Arti Singh