Sponsor:
Scientific Data Management Center (SDM) under the DOE program of Scientific Discovery through Advanced Computing (SciDAC)
Project Team Members:
Northwestern University
Argonne National Laboratory
- William Gropp (now UIUC)
- Rob Ross
- Rajeev Thakur
- Rob Latham
Parallel NetCDF
About netCDF:
NetCDF (Network Common Data Form) defines two sets of standards to support the creation, access, and sharing of scientific data.
- Application Programming Interface (API) -- Fortran, C, and C++ functions are defined for array-oriented data access, and a library provides an implementation of the interface.
- File format -- A self-describing, machine-independent format is defined for representing scientific data.
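
As a small illustration of the file format side, a netCDF classic file starts with the three bytes "CDF" followed by a version byte: 1 for the classic format and 2 for the 64-bit offset variant. The sketch below reads those four bytes to tell the two apart. The helper name cdf_format_version is hypothetical and is not part of any netCDF library; it is only meant to show the idea.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a netCDF library function).
 * Returns 1 for the classic format (CDF-1), 2 for the 64-bit offset
 * format (CDF-2), 0 if the header is not recognized, and -1 if the
 * file cannot be opened. */
int cdf_format_version(const char *path)
{
    unsigned char magic[4];
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* The first four bytes are 'C', 'D', 'F' and a version byte. */
    size_t n = fread(magic, 1, sizeof magic, fp);
    fclose(fp);

    if (n != sizeof magic || memcmp(magic, "CDF", 3) != 0)
        return 0;

    return (magic[3] == 1 || magic[3] == 2) ? (int)magic[3] : 0;
}
```

A caller could use the return value to decide whether a dataset may exceed 2GB before trying to extend it.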

Parallel netCDF:
We have designed a set of alternative APIs for accessing netCDF files in parallel. Figure 1 compares data access from multiple processes using sequential and parallel netCDF. The new APIs incorporate the parallel semantics defined in the Message Passing Interface (MPI) and provide backward compatibility with the original netCDF file format. The goals of this work are:
- Provide high performance -- We build the parallel netCDF library directly on top of MPI-IO, which guarantees portability across various platforms and provides performance optimizations such as collective I/O.
- Maintain the same/compatible file format -- The files created by our parallel netCDF conform to the format defined by netCDF-3.
- Support large files -- Parallel netCDF supports "CDF-2" formatted data. With this format, even 32-bit platforms can create netCDF datasets greater than 2GB in size. Array sizes and file offsets are also extended to 64-bit integers through the MPI_Offset type.
- Support subfiling scheme -- We propose a subfiling scheme that divides a large multi-dimensional global array into smaller subarrays, each saved in a smaller file called a subfile. Since the subfiling scheme decreases the number of processes sharing a file, it can reduce the overhead of the file system's data consistency control.
- Minimize the changes of API syntax --
To ease code migration from programs using sequential netCDF
to our parallel implementation, we mimic the syntax defined in the
original netCDF with only a few changes for MPI adaptation.
These changes are highlighted as follows.
  - All parallel functions are named after the originals with a prefix
  of "ncmpi" in C/C++ and "nfmpi" in Fortran. An example is
  int ncmpi_put_vars_uchar(int ncid, int varid, const MPI_Offset start[], const MPI_Offset count[], const MPI_Offset stride[], const unsigned char *up)
  - An MPI communicator and an MPI_Info object are added to
  the argument list of the dataset open/creation function.
  They define the collection of processes operating on
  the netCDF file and pass I/O hints to the implementation
  (e.g., expected access pattern, aggregation information).
  An example is
  int ncmpi_open(MPI_Comm comm, const char *path, int omode, MPI_Info info, int *ncidp)
  - Collective and independent I/O modes are provided for data mode functions, corresponding to MPI collective and independent I/O operations. The interfaces differ in two ways: all collective data access functions carry the extra suffix "_all", and independent I/O mode is enabled only between calls to ncmpi_begin_indep_data() and ncmpi_end_indep_data().
- Flexible Functionality --
In addition to supporting existing netCDF functionality, a new
set of APIs (called the flexible API) accepts MPI derived
datatypes to describe complex memory layouts of I/O buffers, instead
of requiring contiguous data buffers. An example is
int ncmpi_put_vars_all(int ncid, int varid, const MPI_Offset start[], const MPI_Offset count[], const MPI_Offset stride[], const void *buf, int bufcount, MPI_Datatype datatype)
Related Links
- Unidata's netCDF
- Message Passing Interface Standard
- Parallel netCDF project web page maintained at Argonne National Laboratory, including software downloads, user documentation, etc.
- Mailing list: parallel-netcdf@mcs.anl.gov
- Extreme Linux: Parallel netCDF - an article from Linux Magazine
Publications:
- Jianwei Li, Wei-keng Liao, Alok Choudhary, Robert Ross, Rajeev Thakur, William Gropp, Rob Latham, Andrew Siegel, Brad Gallagher, and Michael Zingale. Parallel netCDF: A Scientific High-Performance I/O Interface. In Proceedings of the Supercomputing Conference, November 2003.
Users:
- NCAR Community Atmosphere Model (CAM)
using parallel netCDF, ZioLib
Platforms: IBM SP3, SP4, SP5, BlueGene/L, Cray X1E
File systems: GPFS, PVFS2, NFS
Organization: Department of Atmospheric Sciences, National Taiwan University
People: Yu-heng Tseng (yhtseng at as.ntu.edu.tw)
- Astrophysical Thermonuclear Flashes (FLASH)
using parallel netCDF, HDF5, C
Platforms: IBM SP, Linux Clusters
Organization: ASCI Flash Center, University of Chicago
People: Brad Gallagher, Katie Antypas
- ASPECT, parallel VTK
using parallel netCDF, C
Platforms: Linux Clusters, Cray X
Organization: ORNL
People: Nagiza Samatova
- Atmospheric Chemical Transport Model (ACTM)
using parallel netCDF, FORTRAN
Organization: Center for Applied Scientific Computing, LLNL
People: John R. Tannahill
- PRogram for Integrated Earth System Modeling (PRISM) Support Initiative
supports netCDF and parallel netCDF within the IO library as part of the OASIS4 coupler
using netCDF, parallel netCDF (FORTRAN APIs)
Platforms: NEC SX, Linux Cluster, SGI and others
Organization: C&C Research Laboratories, NEC Europe Ltd.
Contacts for pnetcdf in PRISM: Reiner Vogelsang and Rene Redler
Contacts for pnetcdf users on NEC SX: Joachim Worringen and Rene Redler
- Weather Research and Forecast (WRF) modeling system software
using parallel netCDF, FORTRAN
Organization: National Center for Atmospheric Research (NCAR)
People: John Michalakes
- WRF-ROMS (Regional Ocean Model System) I/O Module
using parallel netCDF, FORTRAN
Organization: Scientific Data Technologies Group, NCSA
People: MuQun Yang, John Blondin
- Portable, Extensible Toolkit for Scientific Computation (PETSc)
Organization: ANL
- The Earth System Modeling Framework (ESMF)
Platform: IBM Blue Gene/L
Organization: Scientific Computing Division, National Center for Atmospheric Research, Boulder, CO 80305
People: Nancy Collins, James P. Edwards
Acknowledgements:
We are grateful to the following people, who provided valuable comments and discussions that improved our implementation.
Yu-Heng Tseng (LBNL), Reiner Vogelsang (Silicon Graphics, Germany), Jon Rhoades (Information Systems & Technology ENSCO, Inc.), Kilburn Building (University of Manchester), Foucar, James G (Sandia National Lab.), Drake, Richard R (Sandia National Lab.), Eileen Corelli (Senior Scientist, ENSCO Inc.), Roger Ting, Hao Yu, Raimondo Giammanco, John R. Tannahill (Lawrence Livermore National Lab.), Tyce Mclarty (Lawrence Livermore National Lab.), Peter Schmitt, Mike Dvorak (LCRC team, MCS ANL)