Machine learning: accelerating the design and development of MOFs

MS杨站长 2024-01-23 10:29:27

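With the development of machine learning (ML), the design and development of materials systems has accelerated. One of the main challenges in applying ML to materials design, however, is finding a suitable design representation. Most materials design applications represent a material system with quantitative (numerical) design variables. In many cases, choosing the most suitable quantitative descriptors (features) requires expert knowledge or data analysis.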

Fig. 1 The qualitative representation and construction of metal-organic framework materials.

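On the other hand, although most qualitative (categorical) variables, such as chemical elements and chemical compositions, are easier to obtain than quantitative ones, using them directly as design variables in automated materials design remains a challenge.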

Fig. 2 The design space of fof topology used in the study.

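Metal-organic frameworks (MOFs) are one example of such a material system. MOFs are a class of porous crystalline materials widely used for gas storage, gas separation, and catalysis. Because they are highly tunable, MOFs are regarded as potential solutions for a range of applications, such as carbon dioxide (CO2) capture and separation. However, the variety of MOF building blocks and the many ways of combining them lead to millions of candidate materials.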
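To make the combinatorial growth concrete, here is a minimal Python sketch. The category names, level labels, and counts are hypothetical placeholders chosen for illustration and are not taken from the study; the only point is that the number of candidates is the product of the number of options per building-block category.

```python
from itertools import product
from math import prod

# Hypothetical building-block categories (illustrative names, not from the study).
building_blocks = {
    "node_a": ["N1", "N2", "N3"],
    "node_b": ["M1", "M2"],
    "linker": ["L1", "L2", "L3", "L4"],
}

def design_space_size(blocks):
    """Number of candidates = product of the number of levels per category."""
    return prod(len(levels) for levels in blocks.values())

def enumerate_designs(blocks):
    """Yield every candidate as a dict mapping category -> chosen level."""
    names = list(blocks)
    for combo in product(*(blocks[name] for name in names)):
        yield dict(zip(names, combo))

size = design_space_size(building_blocks)
designs = list(enumerate_designs(building_blocks))
```

With only a few more categories or levels per category, the same product quickly reaches a size at which exhaustive evaluation is impractical.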

Fig. 3 The Latent Variable Gaussian Process-Multi Objective Batch Bayesian Optimization (LVGP-MOBBO) framework.

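Because experiments cost too much time and too many resources, researchers have turned to machine learning to accelerate the design and development of material systems. Existing approaches, however, usually rely on large data sets and high-dimensional physical descriptors to represent the materials design space. Building such models is time-consuming, and the models and descriptors are property-specific, so they often do not transfer to different design objectives.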

Fig. 4 The LVGP-BO results for the Reduced Design Space (RDS) exploration.

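Yigitcan Comlek and colleagues from the Department of Mechanical Engineering at Northwestern University (USA) proposed a Latent Variable Gaussian Process Multi-Objective Batch Bayesian Optimization (LVGP-MOBBO) framework for rapidly designing superior MOFs directly from the building blocks that make up the material.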

Fig. 5 The latent variables obtained from the Reduced Design Space (RDS) study.

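They used the readily available qualitative information on MOF building blocks to build an interpretable LVGP surrogate model which, working together with MOBBO, adaptively steers the search toward MOFs with good CO2 capture and separation performance.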
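The core idea of a latent-variable representation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: each level of a qualitative variable is assigned a low-dimensional latent vector, and a Gaussian correlation is computed from distances in that latent space. In an LVGP the latent positions are estimated from data; here they are fixed by hand, and all names and values (`latent`, `lv_correlation`, the level labels) are hypothetical.

```python
import math

# Hypothetical 2D latent positions for the levels of two qualitative variables.
# In an LVGP these would be estimated from data; here they are set by hand.
latent = {
    "linker": {"L1": (0.0, 0.0), "L2": (0.1, 0.0), "L3": (1.5, -0.8)},
    "node": {"N1": (0.0, 0.0), "N2": (0.9, 0.4)},
}

def lv_correlation(design_a, design_b, latent_map):
    """Gaussian correlation between two purely categorical designs.

    Each level is replaced by its latent vector, and the squared Euclidean
    distances are summed over all qualitative variables.
    """
    sq_dist = 0.0
    for var, levels in latent_map.items():
        za = levels[design_a[var]]
        zb = levels[design_b[var]]
        sq_dist += sum((p - q) ** 2 for p, q in zip(za, zb))
    return math.exp(-sq_dist)

a = {"linker": "L1", "node": "N1"}
b = {"linker": "L2", "node": "N1"}  # differs from a by a nearby linker level
c = {"linker": "L3", "node": "N2"}  # differs from a in both variables

r_ab = lv_correlation(a, b, latent)
r_ac = lv_correlation(a, c, latent)
```

Levels that sit close together in the latent space yield highly correlated predictions, which is also what makes the fitted latent positions readable: building blocks with similar effects on the property end up near each other.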

Fig. 6 Structure – property relationship of the Entire Design Space (EDS) and Reduced Design Space (RDS) datasets.

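By integrating batch Bayesian optimization, the descriptor-free LVGP can also be extended effectively to applications with a large number of levels. Using LVGP to predict the properties of MOFs with unseen building blocks is a promising area of research.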

Fig. 7 The distribution of the largest cavity diameters of 1001 MOFs in the Reduced Design Space (RDS) for different building blocks.

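An interesting application of the framework would be materials design and development through autonomous experimentation. Because LVGP-MOBBO requires no human intervention and the experimental inputs can be both qualitative and quantitative, the method presented here can help researchers guide their experiments efficiently.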

Fig. 8 Performance of the LVGP-MOBBO on the Entire Design Space (EDS).

Editorial Summary

Machine learning accelerates the design and development of MOFs

With recent advances in machine learning (ML), material system design and development has undergone rapid acceleration. However, one of the major challenges in applying ML to material system design lies in finding the appropriate design representations. Most material design applications take advantage of quantitative (or numerical) design variables to represent material systems. In most cases, these quantitative descriptors (features) require either expert knowledge or data analysis to find the most appropriate ones. On the other hand, although most qualitative (or categorical) variables (e.g., chemical elements, chemical compositions) are more accessible than quantitative variables, it is challenging to directly include qualitative variables as a part of the design variables in automated materials design. Metal-organic frameworks (MOFs) are an example of such materials systems.

Fig. 9 Latent variable plots after the LVGP-MOBBO campaign on the Entire Design Space (EDS).

MOFs are a class of porous crystalline materials that have been used extensively for gas storage, gas separation, and catalysis. Because of their highly tunable nature, MOFs have been looked at as a potential solution for different applications such as CO2 capture and separation. However, the versatility and different possible combinations of the MOF building blocks lead to millions of candidates. Due to the high experimental cost, both in time and resources, machine learning has been used to accelerate material system design and development. However, the existing approaches usually rely on large data sets and high-dimensional physical descriptors to represent the material design space. These processes can be both time-consuming and property-specific, meaning that the ML models and descriptors are often not transferable to different design objectives.

Fig. 10 Comparative study with Random Forest and LVGP-MOBBO.

Yigitcan Comlek et al. from the Department of Mechanical Engineering, Northwestern University, presented a Latent Variable Gaussian Process Multi-Objective Batch Bayesian Optimization (LVGP-MOBBO) framework to perform rapid design of superior MOFs directly from the building blocks that construct the material. They took advantage of the readily available qualitative building block information that is used to construct the MOFs and built an interpretable LVGP surrogate model that cooperates with MOBBO to adaptively lead towards promising MOF candidates for CO2 capture and separation. With the integration of batch BO, descriptor-free LVGP can be effectively extended to applications with a substantial number of levels. Predicting the properties of MOFs with unseen building blocks through LVGP is a promising area of research. An interesting application of this framework would involve performing materials design and development through autonomous experimentation studies. As there is no human intervention in LVGP-MOBBO, and the experimental inputs can be both qualitative and quantitative, the method presented in this work can help researchers guide their experiments efficiently.

Original abstract and its translation

Rapid design of top-performing metal-organic frameworks with qualitative representations of building blocks

Yigitcan Comlek, Thang Duc Pham, Randall Q. Snurr & Wei Chen

Abstract

Data-driven materials design often encounters challenges where systems possess qualitative (categorical) information. Specifically, representing Metal-organic frameworks (MOFs) through different building blocks poses a challenge for designers to incorporate qualitative information into design optimization, and leads to a combinatorial challenge, with large number of MOFs that could be explored. In this work, we integrated Latent Variable Gaussian Process (LVGP) and Multi-Objective Batch-Bayesian Optimization (MOBBO) to identify top-performing MOFs adaptively, autonomously, and efficiently. We showcased that our method (i) requires no specific physical descriptors and only uses building blocks that construct the MOFs for global optimization through qualitative representations, (ii) is application and property independent, and (iii) provides an interpretable model of building blocks with physical justification. By searching only ~1% of the design space, LVGP-MOBBO identified all MOFs on the Pareto front and 97% of the 50 top-performing designs for the CO2 working capacity and CO2/N2 selectivity properties.

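Abstract (translation)

Systems that carry qualitative (categorical) information often pose challenges for data-driven materials design. In particular, representing metal-organic frameworks (MOFs) by their building blocks makes it hard for designers to incorporate qualitative information into design optimization, and it also creates a combinatorial challenge: the number of MOFs that could be explored is very large. In this work, we integrated a Latent Variable Gaussian Process (LVGP) with Multi-Objective Batch Bayesian Optimization (MOBBO) to identify top-performing MOFs adaptively, autonomously, and efficiently.

We showed that our method (i) requires no specific physical descriptors and uses only the building blocks that construct the MOFs, performing global optimization through qualitative representations; (ii) is independent of application and property; and (iii) provides an interpretable model of the building blocks with physical justification. By searching only about 1% of the design space, LVGP-MOBBO identified all MOFs on the Pareto front and 97% of the 50 top-performing designs with respect to CO2 working capacity and CO2/N2 selectivity.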
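For readers unfamiliar with the term, the Pareto front for two objectives that are both maximized (here, CO2 working capacity and CO2/N2 selectivity) is the set of designs that no other design matches or beats on both objectives at once. The Python sketch below shows a simple non-dominated filter; the function name `pareto_front` and the numeric values are made up for illustration and do not come from the paper.

```python
def pareto_front(points):
    """Return indices of non-dominated points when every objective is maximized."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            # q dominates p if it is at least as good everywhere and better somewhere
            all(qk >= pk for qk, pk in zip(q, p))
            and any(qk > pk for qk, pk in zip(q, p))
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front

# Made-up (working capacity, selectivity) pairs for five hypothetical MOFs.
candidates = [(1.0, 50.0), (2.0, 40.0), (1.5, 45.0), (1.2, 30.0), (2.0, 35.0)]
front = pareto_front(candidates)
```

In this toy data set the first three candidates trade capacity against selectivity and are all non-dominated, while the last two are each beaten by the second candidate. The pairwise comparison costs quadratic time, which is fine for a sketch but would need a faster method for very large candidate sets.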


MS杨站长

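About: researcher at a Max Planck Institute in Germany, with 13 years of experience in theoretical and computational materials simulation.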