https://ntp.niehs.nih.gov/go/niceatm-cheminfo

Computational Models of Chemical Activity

Using structural data to generate activity predictions for new or poorly characterized chemicals can help researchers and regulators make decisions about further testing needs.

NICEATM provides support for continued development of the Open (Quantitative) Structure-activity/property Relationship App (OPERA). OPERA is a free and open-source/open-data suite of QSAR models providing predictions on physicochemical properties, environmental fate, ADME, and toxicity endpoints.

Open-source quantitative structure-property relationship tools

NICEATM and collaborators at EPA developed OPERA, which uses molecular structures to predict the physicochemical features for a wide range of substances (Mansouri et al. 2018).

QSAR models to predict physicochemical properties

OPERA was updated to include QSAR models to predict physicochemical properties such as lipid-aqueous distribution coefficient (logD) and acid dissociation constant (pKa) (Mansouri et al. 2019).
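The two properties named above are linked by a standard textbook relation, which the following minimal sketch illustrates. It assumes a compound with a single ionizable group and assumes that only the neutral form partitions into the lipid phase. It is not a description of how OPERA computes logD, and the function names and the default pH of 7.4 are illustrative choices.

```python
import math


def log_d_monoprotic_acid(log_p: float, pka: float, ph: float = 7.4) -> float:
    """Estimate logD for a monoprotic acid.

    Assumes only the neutral species partitions, so
    logD = logP - log10(1 + 10**(pH - pKa)).
    """
    return log_p - math.log10(1.0 + 10.0 ** (ph - pka))


def log_d_monoprotic_base(log_p: float, pka: float, ph: float = 7.4) -> float:
    """Estimate logD for a monoprotic base (pka is that of the conjugate acid).

    Assumes only the neutral species partitions, so
    logD = logP - log10(1 + 10**(pKa - pH)).
    """
    return log_p - math.log10(1.0 + 10.0 ** (pka - ph))
```

Under these assumptions, logD equals logP minus log10(2) when the pH equals the pKa, and it approaches logP as the compound becomes fully neutral.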

QSAR models to predict IVIVE parameters

To support an open-source workflow for in vitro to in vivo extrapolation (IVIVE), NICEATM developed QSAR models, implemented in OPERA, that predict properties affecting how substances behave in biological systems, such as human plasma fraction unbound and hepatic intrinsic clearance. The property predictions are used in the IVIVE tool in the NICEATM Integrated Chemical Environment.
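To illustrate why fraction unbound and intrinsic clearance matter for IVIVE, the sketch below uses a commonly cited steady-state approximation that combines renal filtration with well-stirred hepatic clearance. It is an assumption made for illustration, not a description of the model in the NICEATM IVIVE tool. All function and parameter names are hypothetical, and the caller must supply physiological values in consistent units.

```python
def steady_state_css(dose_rate_mg_per_h: float, fub: float, clint_l_per_h: float,
                     gfr_l_per_h: float, q_liver_l_per_h: float) -> float:
    """Steady-state plasma concentration (mg/L) for a constant dose rate.

    Assumed model: total clearance = renal filtration of the unbound fraction
    plus well-stirred hepatic clearance.
    """
    if not 0.0 < fub <= 1.0:
        raise ValueError("fub must be in (0, 1]")
    renal = gfr_l_per_h * fub
    hepatic = (q_liver_l_per_h * fub * clint_l_per_h) / (
        q_liver_l_per_h + fub * clint_l_per_h
    )
    total_clearance = renal + hepatic
    if total_clearance <= 0.0:
        raise ValueError("total clearance must be positive")
    return dose_rate_mg_per_h / total_clearance


def equivalent_administered_dose(activity_conc_mg_per_l: float,
                                 css_at_unit_dose_mg_per_l: float,
                                 unit_dose: float = 1.0) -> float:
    """Scale an in vitro activity concentration to an external dose.

    Assumes linear kinetics: the dose producing a plasma concentration equal
    to the in vitro activity concentration is proportional to unit_dose / Css.
    """
    return activity_conc_mg_per_l * unit_dose / css_at_unit_dose_mg_per_l
```

In this sketch, a lower fraction unbound or a lower intrinsic clearance reduces total clearance, which raises the steady-state concentration and lowers the equivalent administered dose for a given in vitro activity concentration.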

QSAR models to screen for skin sensitizers

NICEATM and collaborators at the University of North Carolina-Chapel Hill developed QSAR models of human data that can either be combined with or used instead of animal data to screen for potential skin sensitizers (Alves et al. 2016; Borba et al. 2021).

Tools to predict estrogen and androgen receptor pathway activity

NICEATM and EPA collaborators created open-source versions of published computational models to predict activity relevant to endocrine disruption (Judson et al. 2015; Kleinstreuer et al. 2017). Predictions from these models are available through the NICEATM Integrated Chemical Environment.

NICEATM and EPA worked with the worldwide modeling community to predict estrogenic activity (CERAPP: Collaborative Estrogen Receptor Activity Prediction Project; Mansouri et al. 2016) and anti-androgenic activity (CoMPARA: Collaborative Modeling Project for Androgen Receptor Activity; Mansouri et al. 2020). Predictions from these models are available through the NICEATM Integrated Chemical Environment and will be available through the EPA Chemistry Dashboard. The consensus models from CERAPP and CoMPARA were also added to the standalone OPERA application.
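As a rough illustration of what a consensus model does, the sketch below combines binary active/inactive calls from several models into one weighted vote. The weighting scheme, the 0.5 threshold, and all names are assumptions made for the example. The actual CERAPP and CoMPARA consensus procedures are described in the cited publications and may differ.

```python
from typing import Mapping, Tuple


def weighted_consensus(calls: Mapping[str, int],
                       weights: Mapping[str, float],
                       threshold: float = 0.5) -> Tuple[float, int]:
    """Combine per-model binary calls (1 = active, 0 = inactive).

    Returns (score, call), where score is the weighted fraction of models
    predicting "active" and call is 1 when score >= threshold.
    """
    total_weight = sum(weights[model] for model in calls)
    if total_weight <= 0.0:
        raise ValueError("weights of the contributing models must sum to > 0")
    score = sum(weights[model] * calls[model] for model in calls) / total_weight
    return score, int(score >= threshold)
```

A hypothetical call such as `weighted_consensus({"model_a": 1, "model_b": 0}, {"model_a": 2.0, "model_b": 1.0})` gives more influence to the model with the larger weight, which in practice might reflect its performance on an evaluation set.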

International collaborations to predict acute toxicity

NICEATM and EPA ran a global collaboration that drew on the expertise of the worldwide modeling community to predict acute oral systemic toxicity (CATMoS: Collaborative Acute Toxicity Modeling Suite; Kleinstreuer et al. 2018; Mansouri et al. 2021). Predictions from CATMoS are available through the NICEATM Integrated Chemical Environment and will be available through the EPA Chemistry Dashboard. The consensus models from CATMoS were also added to the standalone OPERA application. NICEATM is currently organizing a similar project to develop models to predict acute inhalation toxicity.

Statistical models for classification of eye irritants

NICEATM developed statistical models that could be used to classify chemicals as eye corrosives, irritants, or non-irritants according to EPA and GHS hazard classification endpoints (Sedykh et al. 2022). Results suggest that these models are useful for screening substances for eye irritation potential.

QSAR prediction of assay interference for specific technology platforms

NICEATM and NIEHS scientists developed InterPred, a web tool to predict chemical autofluorescence and luminescence interference (Borrel et al. 2021).

QSAR prediction of acute systemic and topical toxicity

NICEATM and collaborators at the University of North Carolina-Chapel Hill, Duke University, and the Federal University of Goias in Brazil developed STopTox, a comprehensive collection of computational models that can predict the toxicity hazard of small organic molecules (Borba et al. 2022).

Structure-based models to predict cardiotoxicity

NICEATM and collaborators at NCATS developed QSAR prediction models for effects on the hERG potassium channel (Krishna et al. 2022). The hERG channel plays an important role in cardiac rhythm regulation.

Web tools for modeling, visualization, and data extraction

NICEATM and collaborators are developing the Modeling and Visualization Pipeline (MoVIZ) using the free and open-source KNIME analytics platform. MoVIZ aims to democratize computational methods through intuitive, well-documented, and user-friendly graphical interfaces. When completed, MoVIZ will include components for data access, curation, modeling, and visualization, supported by step-by-step guides. The pipeline will support both local execution and web-based access via the NIEHS KNIME server web portal. MoVIZ will include KNIME workflows for chemical grouping and data curation.

  • The MoVIZ Chemical Grouping Workflow (Moreira-Filho et al. 2024) enables chemical grouping through supervised and unsupervised machine learning. It includes modules for descriptor calculation, feature selection, dimensionality reduction, and interpretation of groupings. Designed to work on desktops and via the NIEHS KNIME server web portal, it allows both beginner and expert users to perform chemical grouping analyses.
  • The MoVIZ LLM-Based Data Extraction Workflow leverages large language models (LLMs) and document parsers to extract structured data from scientific publications and general PDF files. It offers two execution modes, text mode and image mode, to handle diverse file formats with up to 98.5% accuracy. It is available on GitHub, the KNIME Community Hub, and the NIEHS KNIME server web portal. The workflow will be described in detail in a paper to be published in 2025.

Development of additional QSAR models

NICEATM and collaborators are developing or applying QSAR models to predict:

  • Acute inhalation toxicity.
  • Substrate selectivity for glucuronidation.
  • Dermal irritation.
  • Fish acute toxicity (LC50).
  • Cancer hallmarks and key characteristics of carcinogens.
  • Mechanisms of developmental toxicity.
  • Mechanisms of carcinogenicity and drug-induced liver injury.