Air Force Office of Scientific Research (AFOSR), under the Department of Defense (DOD), under award number FA9550-12-1-0458; and National Institute of Standards and Technology (NIST), under Award No. 70NANB14H012.

Project Team Members:

Northwestern University

University of Michigan-Ann Arbor

Georgia Institute of Technology

Northwestern University - EECS Dept.

Data Centered Materials Knowledge Discovery

This page gives a general overview of our projects in materials informatics, the application of data mining technologies to accelerate and enhance materials knowledge discovery. Individual projects, each developed to solve a specific materials modeling problem, are described on their own pages, listed under Project Outline below.


Data mining for materials discovery is concerned with casting materials science problems in a statistical framework and learning models that describe observations about the processing, structure, and properties of materials. The extraction of microstructure-property relationships lies at the basis of nearly all cutting-edge applications of Materials Science and Engineering, whose goal is to develop advanced materials for industrial and military purposes using experimental and computational methodologies. The massive amount of experimental and simulation data produced by modern characterization instruments and computational platforms introduces many challenges in terms of scalability, data storage, complexity, high dimensionality, interpretation, and retrieval. Advanced methods for efficient data storage, retrieval, and analysis are therefore imperative, which creates opportunities for high-performance data mining in materials informatics.

Large-scale materials databases provide unprecedented opportunities for both supervised learning (e.g. classification, regression) and unsupervised learning (e.g. clustering, feature learning). Advanced modeling techniques, combined with data mining optimization and validation methodologies, allow us to identify strong predictor variables for the outcome of interest (here, a microstructure or a property of a material) and to construct a model for predicting that outcome. Such knowledge discovery techniques combine multiple predictor variables into a predictive model trained on supervised data (with known labels/outcomes), which can then be used to predict the labels of future test instances.
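As an illustration only, the supervised workflow described above (rank candidate predictors, fit a model on labeled data, validate on held-out instances) can be sketched in a few lines of plain Python. Everything here is an assumption made for the sketch: the feature names, the synthetic "strength" relationship, and the choice of Pearson correlation plus a one-variable least-squares fit are hypothetical stand-ins, not the methods or data used in the projects.

```python
import random


def make_synthetic_samples(n, seed=0):
    """Generate hypothetical (features, strength) pairs; not real materials data."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        features = {
            "volume_fraction": rng.uniform(0.0, 0.5),
            "grain_size": rng.uniform(1.0, 10.0),
            "irrelevant": rng.uniform(0.0, 1.0),
        }
        # Assumed toy relationship: strength is driven mainly by volume fraction.
        strength = (100.0 + 400.0 * features["volume_fraction"]
                    - 2.0 * features["grain_size"] + rng.gauss(0.0, 5.0))
        samples.append((features, strength))
    return samples


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def r_squared(ys, predictions):
    """Coefficient of determination of predictions against observed values."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def run_demo(n=200, seed=0):
    samples = make_synthetic_samples(n, seed)
    split = int(0.75 * n)  # first 75% for training, the rest held out
    train, test = samples[:split], samples[split:]
    y_train = [y for _, y in train]

    # Rank predictors by absolute correlation with the outcome (training data only).
    names = list(train[0][0])
    ranking = sorted(
        ((name, abs(pearson([f[name] for f, _ in train], y_train))) for name in names),
        key=lambda item: item[1],
        reverse=True,
    )

    # Fit a one-variable model on the strongest predictor, then validate on held-out data.
    best = ranking[0][0]
    slope, intercept = fit_line([f[best] for f, _ in train], y_train)
    y_test = [y for _, y in test]
    predictions = [slope * f[best] + intercept for f, _ in test]
    return ranking, slope, intercept, r_squared(y_test, predictions)
```

Calling `run_demo()` returns the predictor ranking, the fitted slope and intercept, and the held-out R². A real study would typically use many more descriptors, a multivariate model, and cross-validation rather than a single split.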


The project aims to create breakthrough concepts and methodologies for elucidating the microstructure-property link, enabling materials design by bringing together cutting-edge theories and techniques from materials science, mathematics, and information science. Three grand challenges have been identified, around which our research efforts are built:

  1. We aim to establish a standardized methodology, grounded in sound mathematics, for acquiring, storing, analyzing, modeling, and querying "beyond 3-D" materials data, taking full account of the potential sparsity of such data as well as the associated uncertainties and variabilities;
  2. We aim to employ advanced stochastic/probabilistic models that allow not only for the description of the "average" microstructure, but also for the inclusion of rare events (large deviations), and to set up the proper data structures to enable such a stochastic description; and
  3. We aim to employ advanced data mining approaches, constrained by accurate mathematical models and accounting for variability, to instantiate large numbers of digital microstructures to search for an optimal microstructure and its process path, to achieve a desired property combination.
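To make the third challenge concrete, the idea of instantiating many digital microstructures and searching them for a desired property can be sketched as a naive random search. This is a hedged toy, not the project's approach: the two descriptors, their sampling distributions, and `toy_property` (a made-up model with a loosely Hall-Petch-like inverse-square-root grain-size term) are all hypothetical. The top 1% of candidates is also kept, as a crude stand-in for looking beyond the "average" microstructure.

```python
import random


def sample_microstructure(rng):
    """Draw one hypothetical microstructure, reduced to two scalar descriptors."""
    return {
        "volume_fraction": rng.uniform(0.05, 0.5),
        "grain_size_um": rng.lognormvariate(1.0, 0.5),  # always positive
    }


def toy_property(m):
    """Assumed property model for illustration; not a validated physical law."""
    return 100.0 + 300.0 * m["volume_fraction"] + 150.0 / m["grain_size_um"] ** 0.5


def random_search(n, seed=0):
    """Instantiate n candidates and rank them by the toy property, best first."""
    rng = random.Random(seed)
    candidates = [sample_microstructure(rng) for _ in range(n)]
    scored = sorted(
        ((toy_property(m), m) for m in candidates),
        key=lambda pair: pair[0],
        reverse=True,
    )
    best = scored[0]
    tail = scored[: max(1, n // 100)]  # top 1% of candidates: the rare, extreme cases
    return best, tail, scored
```

For example, `best, tail, scored = random_search(1000, seed=1)` returns the highest-scoring candidate, the ten most extreme candidates, and the full ranking. In practice the property evaluation would be a physics-based or learned model, and the search would be guided rather than purely random.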

Project Outline

The problems we address in materials knowledge discovery center on the following issues:


Software Download

Software developed for materials discovery is mostly problem-specific and requires domain data that are not publicly available. That said, we strive to make the components of the pipeline modular wherever we can, and meanwhile to extend them for general use. A series of general-purpose software packages derived from the aforementioned projects can be accessed below.


This work is supported by AFOSR (Air Force Office of Scientific Research), Department of Defense (DOD) under Award No. FA9550-12-1-0458; and by National Institute of Standards and Technology (NIST), under Award No. 70NANB14H012.

© 2011 Robert R. McCormick School of Engineering and Applied Science, Northwestern University

Last Updated: 2016-11-29 23:59:16 -0600 (Tue, 29 Nov 2016)