Sponsor:
Advanced Scientific Computing Research (ASCR) under the U.S. Department of Energy Office of Science
Project Team Members:
Northwestern University
The HDF Group
- Quincey Koziol
- Gerd Heber
Argonne National Laboratory
North Carolina State University
- Nagiza Samatova
- Sriram Lakshminarasimhan
Damsel - A Data Model Storage Library for Exascale Science
Overview
The goal of the Damsel project is to enable exascale computational science applications to interact conveniently and efficiently with storage through abstractions that match their data models. We are pursuing four major activities:
- constructing a unified, high-level data model that maps naturally onto a set of data model motifs used in a representative set of high-performing computational science applications;
- developing a data model storage library, called Damsel, that supports the unified data model, provides efficient storage data layouts, incorporates optimizations to enable exascale operation, and is tolerant to failures;
- assessing the performance of this approach through the construction of new I/O benchmarks or the use of existing I/O benchmarks for each of the data model motifs; and
- productizing Damsel and working with computational scientists to encourage adoption of this library by the scientific community.
Data Model
- A set of new data models, inspired by MOAB/ITAPS
- Support for unstructured and structured mesh
- Entity, blocks
- Grouping relationships
- Entity sets, with parent/child relations
- Variables and attributes
- Sparse and dense tags
- Describing physical domain
- Units, dimensions
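The concepts listed above can be illustrated with a minimal Python sketch. It is not the Damsel API: the class names `EntitySet`, `DenseTag`, and `SparseTag`, their methods, and the sample values are all hypothetical and chosen only to show how entity sets with parent/child relations and dense versus sparse tags might fit together.

```python
class EntitySet:
    """Hypothetical grouping of entity ids with parent/child relations."""

    def __init__(self, name, entities=()):
        self.name = name
        self.entities = list(entities)
        self.parents = []
        self.children = []

    def add_child(self, child):
        # Record the relation in both directions.
        self.children.append(child)
        child.parents.append(self)


class DenseTag:
    """One value per entity in a contiguous id range, stored as a flat list."""

    def __init__(self, name, first_id, values, unit=None):
        self.name = name
        self.first_id = first_id
        self.values = list(values)
        self.unit = unit  # assumed: units are carried as plain strings

    def get(self, entity_id):
        return self.values[entity_id - self.first_id]


class SparseTag:
    """Values for only a few entities, stored in a dict keyed by entity id."""

    def __init__(self, name, default=None, unit=None):
        self.name = name
        self.default = default
        self.unit = unit
        self.values = {}

    def set(self, entity_id, value):
        self.values[entity_id] = value

    def get(self, entity_id):
        return self.values.get(entity_id, self.default)


# Four entities with ids 100..103; two of them also belong to a boundary set.
mesh = EntitySet("mesh", range(100, 104))
boundary = EntitySet("boundary", [101, 103])
mesh.add_child(boundary)

density = DenseTag("density", first_id=100, values=[1.0, 1.2, 0.9, 1.1], unit="g/cm^3")
fixed = SparseTag("fixed", default=False)
fixed.set(101, True)
```

A dense tag suits data defined on every entity in a range, such as a solution variable, while a sparse tag avoids storing defaults for the many entities that carry no value.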
Use case: FLASH
FLASH is a modular, parallel multi-physics application developed at the University of Chicago. FLASH uses a structured adaptive mesh refinement (AMR) grid, i.e. the problem domain is hierarchically partitioned into blocks of equal size (in array elements). Each block in the AMR tree is a 2D or 3D mesh attached to a tree node or leaf. These blocks are ordered along the Morton space-filling curve, as shown in Figure 1. A block's info includes its tree level, parent/children, neighbors, coordinates, and bounding box. Block cells store the solution data.
Mapping to the Damsel data model
- Blocks map to entity sets.
- Parent/children and neighbor links map to each block, since they are of fixed size.
- Level, node type, coordinates, bounding box also map to each block.
- Cells map to quads/hexes or to vertices.
- Solution data (physical variables) maps to tags on quad/hex or vertex.
- Contiguity is maintained in blocks and cells through their numbering.
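The mapping above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not FLASH or Damsel code: it covers a single refinement level in 2D, the helper names `morton_key_2d` and `build_blocks` are hypothetical, and the bounding box is given in block units. Blocks are sorted by Morton key, and each block receives a contiguous range of cell ids, which is how the numbering preserves contiguity.

```python
def morton_key_2d(ix, iy, bits=16):
    """Interleave the bits of (ix, iy) to form a Morton (Z-order) key."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)       # x bits go to even positions
        key |= ((iy >> b) & 1) << (2 * b + 1)   # y bits go to odd positions
    return key


def build_blocks(nbx, nby, nxb, nyb, level=1):
    """Create nbx*nby equal-size blocks of nxb*nyb cells, in Morton order."""
    coords = sorted(
        ((ix, iy) for iy in range(nby) for ix in range(nbx)),
        key=lambda c: morton_key_2d(*c),
    )
    cells_per_block = nxb * nyb
    blocks = []
    for block_id, (ix, iy) in enumerate(coords):
        first_cell = block_id * cells_per_block
        blocks.append({
            "id": block_id,
            # The block acts as an entity set over a contiguous cell id range.
            "cells": range(first_cell, first_cell + cells_per_block),
            # Fixed-size per-block info is kept with the block itself.
            "tags": {
                "level": level,
                "coords": (ix, iy),
                "bounding_box": (ix, iy, ix + 1, iy + 1),  # in block units
            },
        })
    return blocks


# A 2x2 arrangement of blocks, each holding 4x4 cells.
blocks = build_blocks(nbx=2, nby=2, nxb=4, nyb=4)

# One solution variable stored as a dense per-cell array, indexed by cell id.
dens = [0.0] * (len(blocks) * 4 * 4)
```

Because block ids follow the curve and cell ids follow block ids, the solution data of any run of consecutive blocks occupies one contiguous slice of `dens`.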
