An Analysis of Machine Learning Models

ExMh_zhishexues 2019-07-07 3274

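A primary role of scientists is to extract fundamental knowledge from data. Machine learning in materials science aims to deepen scientific understanding by automatically identifying key relationships in data, thereby accelerating basic research. How to identify those relationships automatically, however, still requires further study.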

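A team led by John M. Gregoire at the California Institute of Technology has shown that analyzing the trained neural network model itself can accelerate this knowledge-extraction process. Taking BiVO4-based photoanodes as an example, they used a trained convolutional neural network to predict the photoelectrochemical performance of the system. The model was trained on the compositions and Raman spectra of 1379 photoanode samples obtained through high-throughput experiments. Gradients of the model effectively visualize data trends both in specific regions of the materials parameter space and across the entire dataset. Automated gradient analysis offers guidance for materials research, including how to move beyond the limits of the existing dataset to further improve materials performance. This approach to interpreting machine learning models accelerates understanding in materials science and points toward a path for automating scientific discovery.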
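To make the idea of "analyzing the gradients of a trained model" concrete, here is a minimal sketch and not the authors' implementation. It assumes a tiny fully connected network as a stand-in for the convolutional network described above, uses random weights in place of trained ones, and invents the feature layout (the names `N_FEATURES`, `N_HIDDEN`, `predict`, and `input_gradient` are all hypothetical). It shows how the derivative of a predicted performance with respect to each input feature can be computed per sample and then averaged over a dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature layout: e.g. a few composition fractions plus a few
# Raman intensities per sample. These sizes are assumptions for illustration.
N_FEATURES, N_HIDDEN = 8, 16

# Stand-in weights (random, NOT trained). In practice these would come from
# a model fitted to measured performance data.
W1 = rng.normal(scale=0.5, size=(N_FEATURES, N_HIDDEN))
b1 = np.zeros(N_HIDDEN)
w2 = rng.normal(scale=0.5, size=N_HIDDEN)
b2 = 0.0


def predict(x):
    """Predicted performance for a batch x of shape (n, N_FEATURES)."""
    h = np.tanh(x @ W1 + b1)
    return h @ w2 + b2


def input_gradient(x):
    """Gradient of the prediction w.r.t. each input feature, shape (n, N_FEATURES)."""
    h = np.tanh(x @ W1 + b1)
    # Chain rule: d(pred)/dx_i = sum_j w2_j * (1 - h_j^2) * W1[i, j]
    return ((1.0 - h**2) * w2) @ W1.T


# Synthetic samples standing in for measured materials data.
samples = rng.uniform(size=(100, N_FEATURES))

local_grads = input_gradient(samples)    # sensitivity at each individual sample
global_trend = local_grads.mean(axis=0)  # average sensitivity over the dataset
```

In this sketch, a row of `local_grads` indicates which features would raise or lower the predicted performance near one sample, while `global_trend` summarizes the same sensitivities across the whole dataset. With a real model, an automatic differentiation library would normally replace the hand-written chain rule.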

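The paper was recently published in npj Computational Materials 5: 34 (2019). Its English title and abstract are reproduced below, and the paper PDF is freely accessible via the original article link.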

Analyzing machine learning models to accelerate generation of fundamental materials insights

Mitsutaro Umehara, Helge S. Stein, Dan Guevarra, Paul F. Newhouse, David A. Boyd & John M. Gregoire

Machine learning for materials science envisions the acceleration of basic science research through automated identification of key data relationships to augment human interpretation and gain scientific understanding. A primary role of scientists is extraction of fundamental knowledge from data, and we demonstrate that this extraction can be accelerated using neural networks via analysis of the trained data model itself rather than its application as a prediction tool. Convolutional neural networks excel at modeling complex data relationships in multi-dimensional parameter spaces, such as that mapped by a combinatorial materials science experiment. Measuring a performance metric in a given materials space provides direct information about (locally) optimal materials but not the underlying materials science that gives rise to the variation in performance. By building a model that predicts performance (in this case photoelectrochemical power generation of a solar fuels photoanode) from materials parameters (in this case composition and Raman signal), subsequent analysis of gradients in the trained model reveals key data relationships that are not readily identified by human inspection or traditional statistical analyses. Human interpretation of these key relationships produces the desired fundamental understanding, demonstrating a framework in which machine learning accelerates data interpretation by leveraging the expertise of the human scientist. We also demonstrate the use of neural network gradient analysis to automate prediction of the directions in parameter space, such as the addition of specific alloying elements, that may increase performance by moving beyond the confines of existing data.
