Insolubility Classification with Accurate Prediction Probabilities Using a MetaClassifier

Kramer C, Beck B, Clark T (2010)


Publication Status: Published

Publication Type: Journal article

Publication year: 2010

Journal: Journal of Chemical Information and Modeling

Publisher: American Chemical Society

Book Volume: 50

Page Range: 404-414

Journal Issue: 3

DOI: 10.1021/ci900377e

Abstract

Insolubility is a crucial issue in drug design because insoluble compounds are often measured to be inactive although they might be active if they were soluble. We provide and analyze various insolubility classification models based on a recently published data set and compounds measured in-house at Boehringer-Ingelheim. The 2D descriptor sets from pharmacophore fingerprints and MOE and the 3D descriptor sets from ParaSurf and VolSurf were examined in conjunction with support vector machines, Bayesian regularized neural networks, and random forests. We introduce a classifier-fusion strategy, called metaclassifier, which improves upon the best single prediction and at the same time avoids descriptor selection, a potential source of overfitting. The metaclassifier strategy is compared to the simpler fusion strategies of maximum vote and highest-probability picking. A prediction accuracy of 72.6% on a three-class model is achieved with the metaclassifier, with nearly perfect separation of soluble and insoluble compounds and prediction as good as our calculated maximum possible agreement with experiment.
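
The two simpler fusion strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the class indices (0 = insoluble, 1 = intermediate, 2 = soluble), the tie-breaking rule, and the example probabilities are all assumptions made for illustration. The metaclassifier itself is not reproduced here, since the abstract does not describe its construction.

```python
from collections import Counter

def maximum_vote(probabilities):
    """Each base classifier votes for its most probable class; the majority wins.

    `probabilities` is a list with one class-probability vector per base
    classifier, all for the same compound (hypothetical input layout).
    """
    votes = [max(range(len(p)), key=p.__getitem__) for p in probabilities]
    counts = Counter(votes)
    top = max(counts.values())
    # Assumption: ties are broken in favour of the lowest class index.
    return min(cls for cls, n in counts.items() if n == top)

def highest_probability_pick(probabilities):
    """Return the class predicted by the single most confident base classifier."""
    best_class, best_p = 0, -1.0
    for p in probabilities:
        for cls, value in enumerate(p):
            if value > best_p:
                best_class, best_p = cls, value
    return best_class

# Made-up outputs of three base classifiers for one compound.
example = [
    [0.40, 0.35, 0.25],
    [0.45, 0.30, 0.25],
    [0.02, 0.03, 0.95],
]

vote_result = maximum_vote(example)
pick_result = highest_probability_pick(example)
```

In this made-up case the two strategies disagree: two weakly confident classifiers outvote one highly confident classifier under maximum vote, whereas highest-probability picking follows the confident one.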

How to cite

APA:

Kramer, C., Beck, B., & Clark, T. (2010). Insolubility Classification with Accurate Prediction Probabilities Using a MetaClassifier. Journal of Chemical Information and Modeling, 50(3), 404-414. https://dx.doi.org/10.1021/ci900377e

MLA:

Kramer, Christian, Bernd Beck, and Timothy Clark. "Insolubility Classification with Accurate Prediction Probabilities Using a MetaClassifier." Journal of Chemical Information and Modeling 50.3 (2010): 404-414.
