Large-Scale Modeling of Multispecies Acute Toxicity Endpoints Using Consensus of Multitask Deep Learning Methods

Scientists from NCATS, NIEHS, NCI, and collaborating institutions developed computational models to predict chemical activity for 59 acute systemic toxicity endpoints across multiple species, including 36 endpoints for which no computational models had previously been developed. The data used to develop the models were collected and curated from the ChemIDplus database and represent the largest publicly available data set for acute systemic toxicity, covering over 80,000 compounds. These data were used to develop multiple single-task and multitask models using random forest, deep neural network, convolutional neural network, and graph convolutional neural network approaches. The paper describing the project (Jain et al. 2021) also reports consensus models based on the different multitask approaches. The curated data set and the models are publicly available to support regulatory and research applications.
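
As a rough illustration of what a consensus over multitask models can look like, the sketch below averages per-endpoint predictions from several models and skips endpoints a model did not predict. This summary does not describe the consensus scheme actually used in Jain et al. 2021, so the unweighted mean is an assumption, and the function name, model names, endpoint names, and values are all hypothetical.

```python
from statistics import mean


def consensus(predictions):
    """Average per-endpoint predictions across models (assumed unweighted mean).

    predictions: dict mapping model name -> dict of endpoint -> predicted value.
    A value of None means the model made no prediction for that endpoint.
    Returns a dict mapping endpoint -> mean over the models that predicted it.
    """
    by_endpoint = {}
    for model_preds in predictions.values():
        for endpoint, value in model_preds.items():
            if value is not None:
                by_endpoint.setdefault(endpoint, []).append(value)
    return {endpoint: mean(values) for endpoint, values in by_endpoint.items()}


# Hypothetical predictions for one compound from two multitask models.
example = {
    "multitask_dnn": {"rat_oral_LD50": 2.0, "mouse_oral_LD50": 3.0},
    "multitask_gcn": {"rat_oral_LD50": 3.0, "mouse_oral_LD50": None},
}
result = consensus(example)
print(result)
```

An endpoint predicted by only one model simply keeps that model's value; a weighted scheme based on each model's validation performance would be another reasonable choice.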