Posts by Collection

publications

Identifying domains of applicability of machine learning models for materials science

Published in Nature Communications, 2020

Demonstrates how statistical rule learning enables the discovery of trustworthy input ranges of machine learning models for materials properties; a toy code sketch follows this entry.

Recommended citation: C Sutton, M Boley, LM Ghiringhelli, M Rupp, J Vreeken, M Scheffler. (2020). "Identifying domains of applicability of machine learning models for materials science." Nature Communications. 11(1), 4428.
Download Paper
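
As a toy illustration of the idea above, here is a minimal sketch assuming numpy and scikit-learn: train any regressor, then scan single-feature interval conditions for a region of input space where the test error stays low. The brute-force interval scan is a deliberately naive stand-in for the statistical rule learning used in the paper, and all names below are made up for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic data: the target is noisy only where x1 > 0, so the model
# should only be trusted on the x1 <= 0 half of the input space.
X = rng.uniform(-2, 2, size=(2000, 2))
y = np.sin(3 * X[:, 0]) + (0.05 + 0.5 * (X[:, 1] > 0)) * rng.normal(size=2000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
err = np.abs(model.predict(X_te) - y_te)

# Naive domain-of-applicability search: scan interval conditions on each
# feature and keep the one with the lowest mean test error at >= 20% coverage.
best_err, best_desc = err.mean(), "all inputs"
for j in range(X_te.shape[1]):
    cuts = np.quantile(X_te[:, j], np.linspace(0, 1, 11))
    for lo in cuts:
        for hi in cuts:
            sel = (X_te[:, j] >= lo) & (X_te[:, j] <= hi)
            if sel.mean() >= 0.2 and err[sel].mean() < best_err:
                best_err = err[sel].mean()
                best_desc = f"{lo:.2f} <= x{j} <= {hi:.2f}"

print(f"global mean error: {err.mean():.3f}")
print(f"best domain: {best_desc} (mean error {best_err:.3f})")
```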

Better Short than Greedy: Interpretable Models through Optimal Rule Boosting

Published in Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), 2021

Improving the accuracy/comprehensibility trade-off of additive rule ensembles by exactly optimising the gradient boosting objective over conjunctive rules; a minimal single-literal sketch follows this entry.

Recommended citation: M Boley, S Teshuva, P Le Bodic, G Webb. (2021). "Better Short than Greedy: Interpretable Models through Optimal Rule Boosting." SDM.
Download Paper
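
A minimal sketch of one exactly optimised boosting step under squared loss, assuming numpy: enumerate every threshold condition on every feature and pick the rule and weight that maximally reduce the residual sum of squares. The paper optimises full conjunctive rules (with branch-and-bound search); this brute-force, single-literal version only illustrates the objective, and the function names are hypothetical.

```python
import numpy as np

def best_single_literal_rule(X, residual):
    """Exactly optimise one threshold rule for the squared-loss boosting
    objective: gain = (sum of covered residuals)^2 / coverage."""
    best_gain, best = 0.0, None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for op, cover in (("<=", X[:, j] <= t), (">", X[:, j] > t)):
                m = cover.sum()
                if m == 0:
                    continue
                gain = residual[cover].sum() ** 2 / m
                if gain > best_gain:
                    best_gain, best = gain, (j, op, t, cover.copy())
    return best

def rule_boost(X, y, n_rules=5):
    """Additive rule ensemble: a constant model plus n_rules boosted rules."""
    pred = np.full(len(y), y.mean())
    rules = []
    for _ in range(n_rules):
        rule = best_single_literal_rule(X, y - pred)
        if rule is None:  # perfect fit, nothing left to explain
            break
        j, op, t, cover = rule
        w = (y - pred)[cover].mean()  # optimal weight is the mean residual
        pred[cover] += w
        rules.append((f"x{j} {op} {t:.3g}", w))
    return rules, pred
```

The gain formula comes from the fact that, for a fixed 0/1 rule with coverage m, the squared-loss-optimal weight is the mean covered residual, which reduces the residual sum of squares by exactly (sum of covered residuals)²/m.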

Bayes beats cross validation: fast and accurate ridge regression via expectation maximization

Published in Advances in Neural Information Processing Systems 36 (NeurIPS), 2023

Presents a novel method for tuning the regularization hyper-parameter λ of ridge regression that is faster to compute than leave-one-out cross-validation (LOOCV) while yielding estimates of the regression parameters of equal quality or, particularly in the setting of sparse covariates, superior quality to those obtained by minimising the LOOCV risk. A hedged sketch of the general approach follows this entry.

Recommended citation: SY Tew, M Boley, DF Schmidt. (2023). "Bayes beats cross validation: fast and accurate ridge regression via expectation maximization." NeurIPS. 36.
Download Paper
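
To give a flavour of the approach, here is a sketch using the textbook EM updates for the standard Bayesian ridge model (Gaussian prior w ~ N(0, I/α), Gaussian noise with precision β), with the implied ridge penalty read off as λ = α/β. This is the generic evidence-maximisation recipe, written without the linear-algebra speedups one would want in practice, and not necessarily the exact scheme derived in the paper; the function name is made up.

```python
import numpy as np

def ridge_lambda_em(X, y, n_iter=200, tol=1e-8):
    """Evidence maximisation for Bayesian ridge regression via EM.
    Prior: w ~ N(0, I/alpha); likelihood: y | w ~ N(Xw, I/beta).
    Returns the implied ridge penalty lambda = alpha/beta and the
    posterior mean coefficients."""
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    alpha = beta = 1.0
    for _ in range(n_iter):
        S = np.linalg.inv(alpha * np.eye(p) + beta * XtX)  # posterior covariance
        m = beta * S @ Xty                                 # posterior mean
        # M-step: plug in the posterior expectations E[w'w] and E[||y - Xw||^2].
        alpha_new = p / (m @ m + np.trace(S))
        beta_new = n / (np.sum((y - X @ m) ** 2) + np.trace(XtX @ S))
        if abs(alpha_new - alpha) + abs(beta_new - beta) < tol:
            alpha, beta = alpha_new, beta_new
            break
        alpha, beta = alpha_new, beta_new
    return alpha / beta, m
```

The contrast with LOOCV is that this iteration converges to a single λ directly, whereas cross-validation evaluates the LOOCV risk over a grid of candidate λ values.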

Orthogonal Gradient Boosting for Simpler Additive Rule Ensembles

Published in International Conference on Artificial Intelligence and Statistics, 2024

Improving the accuracy/comprehensibility trade-off of rule ensembles and other additive models via a proper adaptation of gradient boosting with weight correction; a compact sketch follows this entry.

Recommended citation: F Yang, P Le Bodic, M Kamp, M Boley. (2024). "Orthogonal Gradient Boosting for Simpler Additive Rule Ensembles." AISTATS.
Download Paper
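
A compact sketch of the weight-correction idea under squared loss, assuming numpy: after each new basis column (e.g. the 0/1 coverage vector of a new rule) is added, all ensemble weights are refit jointly by least squares instead of keeping earlier weights fixed, which leaves the residual orthogonal to everything selected so far. The helper name is hypothetical, and the paper's actual selection criterion goes further by anticipating this correction when choosing the next rule.

```python
import numpy as np

def boost_with_weight_correction(columns, y, n_terms=5):
    """Greedy additive fitting with full weight correction under squared
    loss. `columns` is an (n, k) matrix whose columns are candidate basis
    functions evaluated on the data (e.g. 0/1 rule coverage vectors)."""
    selected, residual = [], y.astype(float).copy()
    norms = np.linalg.norm(columns, axis=0) + 1e-12  # guard all-zero columns
    for _ in range(n_terms):
        # Pick the candidate most correlated with the current residual.
        j = int(np.argmax(np.abs(columns.T @ residual) / norms))
        if j in selected:  # no remaining column reduces the residual
            break
        selected.append(j)
        # Weight correction: refit ALL weights jointly by least squares,
        # leaving the residual orthogonal to every selected column.
        A = columns[:, selected]
        weights, *_ = np.linalg.lstsq(A, y, rcond=None)
        residual = y - A @ weights
    return selected, weights
```

In this squared-loss form the procedure coincides with orthogonal matching pursuit over the rule dictionary, whereas plain gradient boosting would keep the earlier weights frozen.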

talks

An Introduction to Subgroup Discovery

Subgroup discovery (SGD) is a form of local pattern discovery for labeled data that can help find interpretable descriptors from materials-science data obtained by first-principles calculations. In contrast to global modeling algorithms like kernel ridge regression or artificial neural networks, SGD finds local regions in the input space in which a target property takes on an interesting distribution. These local distributions can potentially reflect interesting scientific phenomena that are not represented in standard machine learning models. In this talk, we cover the conceptual basics of SGD, sketch corresponding search algorithms, and show some exemplary applications to such first-principles data. A toy implementation sketch follows below.

Download Slides
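
As a toy version of the SGD setting described above, here is a minimal sketch assuming numpy: exhaustively score single-feature interval selectors with a standard impact-style quality function, coverage times the lift of the subgroup's mean target over the global mean. Real subgroup discovery systems search conjunctions of such conditions with pruning; the function names here are made up for the example.

```python
import numpy as np

def impact(selector, y):
    """Impact quality: coverage times lift of the subgroup's mean target
    over the global mean (wrap the lift in abs(...) to also find
    low-target subgroups)."""
    return selector.mean() * (y[selector].mean() - y.mean()) if selector.any() else 0.0

def best_interval_subgroup(X, y, n_cuts=10):
    """Exhaustively scan selectors of the form lo <= x_j <= hi over
    quantile cut points and return the highest-impact one."""
    best_q, best_desc = 0.0, None
    for j in range(X.shape[1]):
        cuts = np.quantile(X[:, j], np.linspace(0, 1, n_cuts + 1))
        for a in range(len(cuts)):
            for b in range(a + 1, len(cuts)):
                sel = (X[:, j] >= cuts[a]) & (X[:, j] <= cuts[b])
                q = impact(sel, y)
                if q > best_q:
                    best_q = q
                    best_desc = f"{cuts[a]:.3g} <= x{j} <= {cuts[b]:.3g}"
    return best_desc, best_q
```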
