Computational Tools Development

Computationally generated predictions of toxicity endpoints can inform decisions about testing priorities and sometimes eliminate the need for laboratory testing. ICCVAM agencies are developing tools to predict toxicity endpoints such as cardiotoxicity, carcinogenicity, skin sensitization, and genotoxicity, as well as tools for cross-species extrapolation and application of defined approaches (DAs).

QSAR machine-learning models for NRF2 activation and PPARg inhibition to support prediction of lung toxicity

Identifying and applying mitigation strategies for oxidative stress can reduce adverse health effects from operational stressors and improve USAF aircrew and guardian readiness. NRF2 is a key element of the cellular antioxidant defense system, because it regulates transcription of antioxidant proteins and detoxifying enzymes. Scientists in the USAF Force Health Protection program’s Predictive Risk Team (PRT) developed a computational approach to predict potential activators of NRF2 using structural alerts and machine-learning QSAR modeling (Chushak and Clewell 2024). Once developed, the approach was used to screen a list of approved drugs collected in DrugBank to identify potential novel NRF2 activators. Currently, PRT scientists are working on development of QSAR models to identify inhibitors of PPARg. Chemicals that inhibit PPARg increase the risk of long-term lung damage from inhaled toxins. The developed models will be used together with in vitro data and other mechanistic models to predict whether exposure to chemicals in the operational environment may cause acute or chronic lung injury.

Development of a multitask machine learning model for diverse toxicity data

Machine-learning methods enable data-driven development of predictive models for health-effects screening of novel chemicals. Single-task machine-learning models train on one endpoint and lack transferability to similar endpoints. These models also require large, homogeneous data sets. Health-effects screening needs machine-learning tools that can handle multiple small, noisy data sets. The USAF PRT is collaborating with a team from Johns Hopkins University Applied Physics Laboratory to investigate a novel machine-learning pipeline for molecular representation learning based on a multitask machine-learning paradigm. ToxCast assays with the same target were consolidated to minimize missing entries, and a machine-learning model was trained simultaneously on multiple tasks from moderate-sized data sets. To predict novel tasks from small data sets, the pretrained multitask model was combined with a dimension-reducing map and a task-specific predictor. The multitask model performed better than single-task models on six of 11 training tasks and six of seven distinct non-training tasks. This novel machine-learning pipeline generates molecular representations, leverages reduced dimensionality for greater efficiency, and combines information on multiple effects, including highly specific ligand binding and nonspecific systemic responses, to provide a more generalizable chemical risk model. The pipeline is being completed and prepared for publication in 2024.

Citizen science project for biosurveillance of blotchy bass syndrome

The public is often aware of and interested in fish and wildlife diseases, particularly those that are highly visible, change the appearance of animals, and affect species of high recreational or commercial value like black bass (Micropterus spp.). “Blotchy bass syndrome” is a term used to describe external hyperpigmentation (melanosis) in black basses. This condition has received increased attention from anglers and resource managers in recent years and is a popular topic of discussion and reporting on angling websites and blogging platforms. Crowdsourced data collection can be used to increase community engagement and buy-in, as well as expand geographical and temporal sampling beyond those provided by state agencies. Recognizing a need to understand the geographical extent, seasonality, and biological threat of blotchy bass syndrome, USGS established crowdsourcing efforts to monitor this condition. Approaches used included traditional solicitation, smartphone applications, and virtual fishing events. Multiple discrete but overlapping efforts were undertaken starting in 2021. In March 2022, state agency partners solicited reports of blotchy bass syndrome via social media, requesting that anglers submit photos and information on catch location. Between June and November 2022, a virtual BioBlitz was conducted using the Angler’s Atlas MyCatch smartphone application, incentivized with prizes. A “Blotchy Bass Bonanza” participatory science effort was launched in July 2022 and encompassed all freshwater waterbodies within the United States and Canada. The Bonanza was reinitiated in March 2023 and will continue through February 2024. Overall, efforts yielded data from 31 states, six Canadian provinces, and some submissions from Mexico, Spain, and South Africa. Anglers submitted a total of 1,077 digital photographs of individual fish with presumptive blotchy bass syndrome for scientific review. 

USGS valued the 52,399 personnel hours donated by participating anglers at $1,153,000.

ECOSAR: computational tool to predict aquatic toxicity

Ecological Structure Activity Relationships (ECOSAR) is a computerized predictive system that estimates aquatic toxicity. The program estimates a chemical's acute and chronic toxicity to aquatic organisms, such as fish, aquatic invertebrates, and aquatic plants, by using computerized structure–activity relationships (SARs). ECOSAR software is available for free without licensing requirements. Key characteristics of the program include:

  • Grouping of structurally similar organic chemicals with available experimental effect levels that are correlated with physicochemical properties to predict toxicity of new or untested industrial chemicals.
  • Programming of a classification scheme to identify the most representative class for new or untested chemicals.
  • Continuous update of aquatic QSARs based on collected or submitted experimental studies from both public and confidential sources.

ECOSAR version 2.2 was released in March 2022. Updates included two new chemical classes, a module to predict toxicity of cationic polymers, user interface updates, and user input of melting point, octanol–water partition coefficient, and water solubility.

Updates to CompTox Chemicals Dashboard tools

The CompTox Chemicals Dashboard is the primary web-based application that provides access to data and algorithms from the EPA Center for Computational Toxicology and Exposure (CCTE). It is a widely used resource for chemistry, toxicity, and exposure information for over a million chemicals. There were seven updates to the Dashboard released during 2022 and 2023. The next Dashboard update is planned for April 2024. In 2023, the Dashboard had over 21,000 average monthly users and over 11,000 new users. An October 2022 virtual training on the Dashboard gave an overview of Dashboard functions and highlighted new features from the 2022 update. Additional training resources for the Dashboard are available on the EPA NAMs Training webpage.

Continued development of httk

To fully characterize the potential human health risk of a substance, data are often needed on that substance’s toxicokinetics. Traditional approaches for obtaining toxicokinetics data use animals, but alternative approaches are being developed to computationally estimate relevant parameters. EPA researchers have developed toxicokinetic models within an R software package called httk to estimate chemical concentrations in humans and support IVIVE. The package currently uses human in vitro data to make predictions about the fate of chemicals in humans, rats, mice, dogs, and rabbits. The latest version of httk, v2.3.0, is now available. This version incorporates new in vitro measures of gut absorption for over 400 chemicals. There is also new in vitro clearance and binding data, new quantitative structure–property relationship (QSPR) predictions including gut absorption, and many other new features. New models are being developed to describe absorption through the skin, exposure to aerosols (clouds of droplets), partial oral absorption, and human pregnancy. More information is available on the EPA httk website.

Generalized Read-Across (GenRA) application

Read-across is a computational technique that uses toxicity data from one or more known (source) chemicals to predict toxicity for another (target) chemical, usually but not always on the basis of structural similarity. EPA’s Generalized Read-Across (GenRA) application is a standalone tool linked to the CompTox Chemicals Dashboard that performs read-across algorithmically, helping researchers to make informed decisions about chemicals with little toxicity data available.
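
The idea of algorithmic read-across can be illustrated with a minimal sketch. It assumes that each chemical is represented by a set of structural features and that the prediction is a similarity-weighted average over the most similar source chemicals. The function names, the feature sets, and the choice of Tanimoto similarity are illustrative assumptions, not a description of how the GenRA application is implemented.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def read_across(target, sources, k=2):
    """Similarity-weighted average activity of the k most similar sources.

    sources: list of (feature_set, activity) pairs, with activity in [0, 1].
    Returns None when no source chemical shares any feature with the target.
    """
    scored = sorted(
        ((tanimoto(target, feats), activity) for feats, activity in sources),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in scored)
    if total == 0:
        return None
    return sum(sim * act for sim, act in scored) / total


# Hypothetical data: integers stand in for fingerprint bits.
sources = [
    (frozenset({1, 2, 3, 4}), 1.0),  # active source chemical
    (frozenset({1, 2, 5}), 0.0),     # inactive source chemical
    (frozenset({7, 8, 9}), 1.0),     # structurally unrelated chemical
]
target = frozenset({1, 2, 3})
prediction = read_across(target, sources, k=2)
print(prediction)
```

In this toy case the two nearest source chemicals have similarities of 0.75 and 0.5, so the active neighbor dominates the weighted prediction.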

GenRA had two major updates during 2022-2023.

  • In September 2022, a new update allowed users to evaluate the relevance of similar substances using predicted physical property information. It provided predictions of toxicity and bioactivity that users could download for further analysis.
  • Version 3.2 was released in March 2023. This update included data updates and over 30 minor improvements and bug fixes. Improved indexing in this release enabled increased processing speed.

In 2023, EPA presented virtual training on GenRA. Approximately 725 people attended the main session and 215 attended the breakouts, representing a record for attendance at an EPA NAMs training session. Additional training resources for GenRA are available on the EPA NAMs Training webpage.

High-throughput transcriptomics and high-throughput phenotypic profiling

Researchers at EPA are using two high-throughput profiling assays, high-throughput transcriptomics (HTTr) with the human whole transcriptome TempO-Seq assay and high-throughput phenotypic profiling (HTPP) with the Cell Painting assay, to characterize the biological activity of chemicals across a variety of human-derived cell types. Computational workflows have been established for determining biological pathway-altering concentrations and phenotype-altering concentrations from these two assay types (Harrill et al. 2021; Nyffeler et al. 2023). The high-dimensionality data have been organized for display and distribution to the public through the CompTox Chemicals Dashboard, under the “Bioactivity, HTTr: Summary” and “Bioactivity, HTPP: Summary” views. HTTr and HTPP results for hundreds of chemicals in a variety of cell types (MCF-7 breast adenocarcinoma, U-2 OS osteosarcoma, and the 2-D differentiated HepaRG liver cell model) became available on the Dashboard in 2022 and will be updated in future releases.

Continued development of SeqAPASS

EPA’s Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) is a fast online screening tool that allows researchers to extrapolate toxicity information across species. SeqAPASS version 7.0, released in September 2023, allows users to incorporate protein structural evaluations of conservation in the SeqAPASS analysis. Using the integrated Iterative Threading ASSEmbly Refinement tool, users can generate protein structures. They can then use those structures or incorporate additional structures from tools like the Research Collaboratory for Structural Bioinformatics Protein Data Bank and AlphaFold to align them to their chosen species, typically a known sensitive species. From there, users can add evidence based on structural similarity to their sequence similarity data in their chemical susceptibility predictions. This work has been leveraged to support agency decision-making relative to pollinators, the Endangered Species Act, and Endocrine Disruptor Screening Program (EDSP). SeqAPASS version 7.1, which will incorporate data and tool updates, is planned for release in early 2024.

Geospatial modeling approaches to link in vitro data with geographic exposure

Traditional risk assessments based on in vivo animal studies typically use a chemical-by-chemical approach and apical disease endpoints. However, in the real world, individuals are exposed to chemicals from sources that vary over space and time. EPA and NIEHS scientists collaborated to develop a workflow (Eccles et al. 2022) to integrate human exposure data for 41 chemicals in the EPA National Air Toxics Assessment with curated high-throughput screening (cHTS) assays to identify counties where exposure to the local chemical mixture may perturb a common biological target. The workflow used the estimated blood plasma concentration and the concentration–response curve from the cHTS data to determine the chemical-specific effects of the mixture components. Three mixture modeling methods were used to estimate the joint effect from exposure to the chemical mixture on the activity levels, which were geospatially mapped. This workflow demonstrates how new approach methodologies (NAMs) can be used to predict early-stage biological perturbations that can lead to adverse health outcomes that result from exposure to chemical mixtures. As a result, this work will advance mixture risk assessment and other early events in the effects of chemicals.
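
A minimal sketch of the two computational steps described above follows. It assumes a Hill-type concentration–response curve for each chemical and uses independent action as one familiar example of a mixture model. The source does not name the three mixture modeling methods used, so the model choice, function names, and parameter values here are illustrative assumptions only.

```python
def hill_response(conc, top, ac50, slope=1.0):
    """Fractional response at a given concentration under a Hill model."""
    if conc <= 0:
        return 0.0
    return top * conc**slope / (ac50**slope + conc**slope)


def independent_action(fractions):
    """Joint effect assuming each chemical acts independently:
    1 - product of (1 - individual fractional effect)."""
    remaining = 1.0
    for f in fractions:
        remaining *= 1.0 - f
    return 1.0 - remaining


# Hypothetical county-level mixture: (plasma concentration, AC50) per chemical,
# in the same concentration units, with a maximal response (top) of 1.0.
county_mixture = [(2.0, 2.0), (5.0, 5.0)]
effects = [hill_response(c, top=1.0, ac50=ac50) for c, ac50 in county_mixture]
joint_effect = independent_action(effects)
print(effects, joint_effect)
```

Repeating this calculation for each county with its own estimated concentrations would give the kind of activity-level values that can then be mapped geospatially.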

Computational models for hazard identification of flavor compounds in tobacco products

Flavor chemicals contribute to the appeal and toxicity of tobacco products, and the assortment of flavor chemicals available for use in tobacco products is extensive. However, potential harms from inhaling these substances and their byproducts have not been extensively studied. To help address this data gap, FDA scientists (Goel et al. 2022) used a chemistry-driven computational approach to evaluate flavor chemicals based on intrinsic hazardous structures and reactivity of chemicals. A library of 3,012 unique flavor chemicals was compiled from publicly available information, and a structure-based analysis was done to characterize their (1) physicochemical properties, (2) GHS health hazard classification, (3) structural alerts linked to the chemical’s reactivity, instability, or toxicity, and (4) substructures shared with chemicals characterized by FDA as respiratory toxicants. Computational analysis of the constructed flavor library flagged 638 chemicals with GHS-classified respiratory health hazards, 1,079 chemicals with at least one structural alert, and 2,297 chemicals with substructural similarity to chemicals on the respiratory toxicant list. A subsequent analysis was performed on a subset of 173 chemicals in the flavor library, from which four general structures with an increased potential for respiratory toxicity were identified. This study indicated that computational methods are efficient tools for hazard identification and for understanding structure–toxicity relationships. With appropriate context of use and interpretation, in silico methods may provide scientific evidence to support toxicological evaluations of chemicals in or emitted from tobacco products.

Integrated Chemical Environment tools updates

The Integrated Chemical Environment (ICE), developed by NICEATM, provides data and tools to help develop, assess, and interpret chemical safety tests. Improvements made in the March 2023 version 4.0 update allowed users to query ICE using chemical names and their synonyms, as well as with existing chemical identifier options. Other updates to ICE during 2022 and 2023 added help videos, new Chemical Quick Lists, and application programming interfaces supporting the REST APIs. The updates also increased the utility and versatility of the resource’s tools.

  • An updated Results view for the Search tool improved data navigation and provided query summary visualizations.
  • The March 2023 update of the In Vitro to In Vivo Extrapolation (IVIVE) tool incorporated models from EPA’s httk version 2.2.2, including a new gestational model allowing modeling of maternal and fetal chemical distribution. Users can now upload their own in vitro data for modeling, as well as their own in vivo data for comparison to predictions based on in vitro data. Users can compare predictions to population level exposure predictions from EPA’s SEEM3. The tool’s inhalation model has been updated to allow the input of chemical concentration in parts per million per unit volume.
  • A rebranded "Curated Product Use Explorer" in the Chemical Characterization tool expands and improves upon the former "Consumer Use Explorer." A new Functional Use Explorer allows the graphical distribution of chemical lists across functional use categories.
  • The Curve Surfer tool, which allows users to view and interact with concentration–response curves from curated high-throughput screening (cHTS) data, has new options to view and filter results.
  • Updates to the Physiologically Based Pharmacokinetics (PBPK) tool incorporated models from the EPA’s httk version 2.2.2, including the new gestational model. The inhalation model has been updated to allow the input of chemical concentration in parts per million per unit volume. The tool output download files were also updated to include predicted half life and area-under-curve values.
  • The Chemical Quest tool has been fully implemented and updated to allow users to identify chemicals in the ICE database having similar structures to a query chemical, as well as new options to filter results. Users can now perform similarity searches based on user-defined chemical lists.

Open (Quantitative) Structure–activity/property Relationship App (OPERA) updates

The Open (Quantitative) Structure–activity/property Relationship App (OPERA) is a free and open-source/open-data suite of QSAR models developed to support a range of research and regulatory purposes. In addition to physicochemical and environmental fate properties, OPERA offers a number of models predicting ADME endpoints that are important to PBPK modeling and in vitro to in vivo extrapolation (IVIVE) studies. All OPERA models were built using curated data sets split into training and test sets and molecular descriptors calculated based on standardized QSAR-ready chemical structures. Modeling adhered to the five principles for QSAR model development adopted by OECD. These principles support development of scientifically valid, high-accuracy models with minimal complexity that support mechanistic interpretation, when possible. For consistency and transparency, OPERA provides a tool for standardizing chemical structures, an estimate of prediction accuracy, an assessment of applicability domain, and incorporation of experimental values when available. Technical and performance details are described in OECD-compliant QSAR Model Reporting Format reports.

Existing OPERA models are updated regularly when new experimental data are available. Version 2.9, released in September 2022, updated a number of physicochemical properties and ADME parameters covering different classes of chemicals including PFAS. OPERA predictions are available through the EPA CompTox Chemicals Dashboard and NICEATM’s Integrated Chemical Environment. The OPERA application can also be downloaded from the NIEHS GitHub repository as a command-line or graphical user interface for Windows and Linux operating systems. To enable broader access, in September 2022 OPERA became available as an extension to OECD’s QSAR Toolbox, a resource provided by OECD and the European Chemicals Agency to support animal-free chemical hazard assessment.

QSAR models of ocular toxicity

NIEHS scientists developed a set of computational models to predict eye irritation and corrosion (Sedykh et al. 2022). The models were developed using a curated database of in vivo eye irritation studies from the scientific literature and stakeholder-provided data. The database contained over 500 unique substances, including many mixtures, tested at different concentrations. Substances were categorized according to GHS and EPA hazard classifications. Two modeling approaches were used to predict classification of mixtures. A conventional approach generated predictions based on the chemical structure of the most prominent component of the mixture. A mixture-based approach generated predictions by using weighted feature averaging to consider all known components in the mixture. Results suggest that these models are useful for screening compounds for eye irritation potential. Future efforts to increase the models’ utility will focus on expanding their applicability domains and using them in conjunction with other input variables (e.g., in vitro data) to establish defined approaches (DAs) for eye irritation testing.
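
The difference between the two mixture representations can be sketched as follows. The sketch assumes each component is described by a numeric descriptor vector and a weight fraction. The function names and values are hypothetical, and the published models may compute and weight features differently.

```python
def dominant_features(components):
    """Conventional approach: use only the most prominent component's features."""
    _, vector = max(components, key=lambda item: item[0])
    return list(vector)


def weighted_average_features(components):
    """Mixture-based approach: average each feature over all known components,
    weighted by each component's fraction in the mixture."""
    total = sum(weight for weight, _ in components)
    if total <= 0:
        raise ValueError("component weights must sum to a positive value")
    n_features = len(components[0][1])
    return [
        sum(weight * vector[i] for weight, vector in components) / total
        for i in range(n_features)
    ]


# Hypothetical two-component mixture: (weight fraction, descriptor vector).
mixture = [
    (0.75, [1.0, 0.0, 2.0]),
    (0.25, [3.0, 4.0, 2.0]),
]
print(dominant_features(mixture))          # features of the 75% component only
print(weighted_average_features(mixture))  # features blended across components
```

Either feature vector could then be passed to a classifier. The weighted version retains information from minor components that the conventional representation discards.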

STopTox: computational tool to predict acute toxicity

The “six-pack” battery of tests uses animals for acute toxicity assessment of chemicals used as pesticides, pharmaceuticals, or in cosmetic products. Endpoints include skin sensitization, skin irritation and corrosion, eye irritation and corrosion, acute oral toxicity, acute inhalation toxicity, and acute dermal toxicity. To provide an option for replacing or reducing animal use for these endpoints, NICEATM scientists and collaborators created a publicly accessible Systemic and Topical chemical Toxicity (STopTox) web portal, a comprehensive collection of computational models that can predict the toxicity hazard of small organic molecules (Borba et al. 2022). Publicly available data were compiled, curated, and integrated, then used to develop an ensemble of QSAR models for all six endpoints. In addition to high internal accuracy assessed by cross-validation, all models demonstrated an external correct classification rate ranging from 70% to 77%. Scientists and regulators can use the STopTox portal to identify putative toxicants or nontoxicants in chemical libraries of interest.

DASS App for skin sensitization prediction using defined approaches

In June 2021, OECD issued Guideline 497, Defined Approaches on Skin Sensitisation, the first internationally harmonized guideline to describe a non-animal approach to predict skin sensitization potential. In March 2023, NICEATM launched the DASS App, which computationally applies the defined approaches outlined in Guideline 497 through a user-friendly interface (To et al. 2024). The user uploads data from the test methods used in the defined approach to the web application, which then generates skin sensitization predictions for chemicals of interest. The user selects the analysis variables, and the application dynamically provides feedback about the user’s data set to identify problematic data values. This open-access web-based implementation of internationally harmonized regulatory guidelines for an important public health endpoint is designed to support broad user uptake and consistent, reproducible application.
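
As an illustration of what applying a defined approach computationally involves, the sketch below implements a simplified version of the "2 out of 3" defined approach from Guideline 497, in which two concordant results from assays covering different key events determine the hazard call. It is not the DASS App's code. The function name and data layout are hypothetical, and the borderline-range evaluation and applicability checks specified in the guideline are deliberately omitted.

```python
def two_out_of_three(results):
    """Simplified 2-out-of-3 hazard call.

    results: dict mapping assay name to True (positive), False (negative),
    or None (no usable result). Returns a hazard call as a string.
    """
    calls = [outcome for outcome in results.values() if outcome is not None]
    positives = sum(1 for outcome in calls if outcome)
    negatives = len(calls) - positives
    if positives >= 2:
        return "sensitizer"
    if negatives >= 2:
        return "non-sensitizer"
    return "inconclusive"


# Hypothetical assay outcomes for three chemicals.
print(two_out_of_three({"DPRA": True, "KeratinoSens": True, "h-CLAT": False}))
print(two_out_of_three({"DPRA": False, "KeratinoSens": None, "h-CLAT": False}))
print(two_out_of_three({"DPRA": True, "KeratinoSens": False, "h-CLAT": None}))
```

The third case shows why data checks matter: with only two discordant results available, no call can be made until the missing assay is run.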

SARA-ICE model for identification of skin sensitizers

NICEATM is collaborating with consumer products company Unilever to test and further develop their Skin Allergy Risk Assessment (SARA) predictive model (Reynolds et al. 2019) using data from NICEATM’s Integrated Chemical Environment (ICE) resource (SARA-ICE). SARA-ICE is a computational model that uses a variety of input data to estimate a probability that a chemical will cause an allergic skin reaction in humans. It improves upon other similar models by providing a point-of-departure for quantitative risk assessment applications. The model uses publicly available data on 443 chemicals from the ICE database and Unilever SARA and Cosmetics Europe databases and has been applied in several case studies focused on different chemical classes. The SARA-ICE model is under evaluation for inclusion in OECD Test Guideline 497 as a defined approach (DA) for derivation of points-of-departure. One project undertaken during 2022 and 2023 explored the application of SARA-ICE to a diverse set of chemicals nominated by multiple U.S. federal agencies for testing in in vitro skin sensitization assays. The study showed that for this challenging chemical set the SARA-ICE model performs as well as or better than other skin sensitization DAs that are already accepted for regulatory use, and has the advantage of providing a point-of-departure for quantitative risk assessment applications. In a second project, the SARA-ICE model was then applied to provide point-of-departure estimates and hazard classification for six isothiazolinones, a group of broad-spectrum preservatives. This case study demonstrated that the SARA-ICE model can accurately categorize skin sensitization hazard and potency using in vitro and in vivo data inputs and provide quantitative estimates of human potency that include uncertainty. This work is described in a publication being prepared for submission in 2024. 

The SARA-ICE model is an important tool to assess the probability that exposure to a chemical of interest is “low risk” and to support diverse regulatory decision frameworks.

Toxicokinetics tools to connect metabolism and variability

Chemicals that enter the body are metabolized via several pathways. Rates of metabolism can vary across human populations due to genetic variability of metabolic enzymes, such that some populations are more sensitive to effects of parent chemicals or metabolites. Risk assessors apply PBK models to predict the dynamics of tissue concentrations for parent chemicals and their metabolites, but it is difficult to use these models to characterize the effects of enzymatic pathway-related variability. NIEHS scientists developed a generalized workflow to incorporate pathway-related variability for specific metabolic enzymes across human populations into PBK models. The workflow includes metabolite structures generated using SimulationsPlus ADMET Predictor®, PBK models from EPA’s httk package, estimates of interindividual enzyme variability from European Food Safety Authority reports, and parameter predictions from OPERA (v2.8). Parent chemical dynamics were simulated following initial exposure, and the amount of parent metabolized was scaled by percent yield to provide an intravenous time series for metabolite models. Ranges of parent and metabolite concentrations were estimated by Monte Carlo sampling of enzymatic variability in intrinsic clearance. A case study to demonstrate the utility of the workflow used 10 parent chemicals and their metabolites, and efforts are ongoing to incorporate additional chemicals. In quantifying the range of tissue concentrations resulting from metabolic pathway variability, this work facilitates a more health-protective risk assessment for susceptible population groups. The case study was described in an oral presentation (Hull et al.) at the 12th World Congress on Alternatives and Animal Use in the Life Sciences.
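
The Monte Carlo step can be sketched in a few lines. For brevity, the sketch replaces the httk PBK models with a one-compartment intravenous bolus model and assumes that interindividual variability in clearance is lognormally distributed. The dose, volume, median clearance, and geometric standard deviation are hypothetical values, not parameters from the case study.

```python
import math
import random
import statistics


def parent_concentration(dose_mg, volume_l, clearance_l_per_h, time_h):
    """One-compartment IV bolus: C(t) = (dose / V) * exp(-(CL / V) * t)."""
    return dose_mg / volume_l * math.exp(-clearance_l_per_h / volume_l * time_h)


def monte_carlo_range(n_samples, median_clearance, gsd, seed=0):
    """Sample clearance from a lognormal distribution and return the
    5th percentile, median, and 95th percentile concentration at 6 hours."""
    rng = random.Random(seed)
    mu, sigma = math.log(median_clearance), math.log(gsd)
    concentrations = sorted(
        parent_concentration(100.0, 40.0, rng.lognormvariate(mu, sigma), 6.0)
        for _ in range(n_samples)
    )
    low = concentrations[int(0.05 * n_samples)]
    high = concentrations[int(0.95 * n_samples)]
    return low, statistics.median(concentrations), high


# Hypothetical enzyme variability: median clearance 10 L/h, geometric SD 1.5.
low, median, high = monte_carlo_range(1000, median_clearance=10.0, gsd=1.5)
print(f"5th pct: {low:.3f}  median: {median:.3f}  95th pct: {high:.3f} mg/L")
```

Individuals with low sampled clearance fall at the upper end of the concentration range, which is the portion of the distribution most relevant to protecting susceptible groups.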

Updates to ChemMaps

Access to visualization tools to navigate chemical space has become more important due to the increasing size and diversity of publicly accessible compendiums of high-throughput screening (HTS) and other descriptor and effects data. Construction of such tools relies on complex projection techniques using molecular descriptors. However, application of these techniques requires advanced programming skills that are beyond the capabilities of many stakeholders. Inspired by the Google Maps application, NICEATM developed the ChemMaps.com webserver to easily navigate chemical space. The first version of ChemMaps.com was limited to exploration of drugs and drug candidates. ChemMaps.com v2.0, released in 2022, added data on approximately one million environmental chemicals from the EPA DSSTox inventory (Borrel et al. 2023). ChemMaps.com v2.0 incorporates mapping to HTS assay data from the U.S. federal Tox21 research collaboration, which includes results from approximately 2,000 assays tested on up to 10,000 chemicals. Users can visualize chemical activity both by assay and target directly on the map and compare chemical spaces occupied by active and inactive chemicals. ChemMaps.com v2.0 also has new navigation options, including an on-the-fly distance measurement between two chemicals selected on the 3D map and a map screenshot button.

Computational models for cardiotoxicity via hERG inhibition

Cardiovascular disease is the leading cause of death for people of most ethnicities in the United States. The hERG potassium channel plays a pivotal role in cardiac rhythm regulation, and drug molecules and environmental chemicals can potentially induce cardiotoxicity via hERG inhibition. An evaluation of the effect of environmental chemicals on hERG channel function can help inform the potential public health risks of these compounds. NICEATM and NCATS scientists employed several machine-learning approaches to develop QSAR prediction models for the assessment of hERG inhibition for drug-like and environmental chemicals screened in the Tox21 federal research program (Krishna et al. 2022). The data and scripts used to generate the hERG prediction models are provided in an open-access format as key in vitro and in silico tools that can be applied in a translational toxicology pipeline for drug development and environmental chemical screening.

Novel artificial intelligence models to predict carcinogenicity

Carcinogenesis is a multistep process in which healthy cells acquire properties that allow them to form tumors or malignant cancers. The concept of key characteristics of carcinogens has been developed to describe 10 properties that are shared by viruses and chemicals that induce human cancers. QSAR models that rely on structural or physicochemical properties to predict carcinogenesis potential endpoints usually perform poorly, likely because they lack sufficient information on the complex mechanisms involved in carcinogenicity. NICEATM scientists and collaborators combined a novel imputation profile QSAR modeling approach with modern machine learning to analyze data on 10,000 Tox21/ToxCast chemicals and 2,000 in vitro assay endpoints associated with key characteristics of carcinogens. Because limited experimental data were available, data gaps were filled by imputing assay results for the Tox21/ToxCast inventory using structural and physicochemical properties and novel artificial intelligence modeling. Various machine-learning approaches including a multitask deep learning model were applied to predict each chemical’s likelihood of inducing cancer based on the imputed in vitro data. Results included output metrics on the quality of imputation, defined by grouping of assays, and performance computed per chemical. Work is ongoing to validate the prediction model results against literature data, develop confidence scores for the imputation modeling, and map assay data to the key characteristics of carcinogens.

PBPK modeling to predict chemical distribution in brain and adipose tissues

PBPK modeling is used to facilitate decision-making in drug discovery and risk assessment. PBPK models are based on various assumptions and simplifications to make them computationally tractable. Most existing high-throughput, open-source PBPK models predict chemical concentrations in major body compartments such as the liver, kidney, and gut. However, estimates for additional organs require specialized models. As an example, for neurotoxicity evaluations, chemical concentrations in the brain depend upon the activity of the blood-brain barrier. Incorporating the blood-brain barrier in a PBPK model and evaluating whether a chemical can cross this barrier is an important step in assessing the potential neurotoxicity of the chemical. Another limitation of existing open-source PBPK models is that they often do not include an explicit adipose tissue compartment. Adipose tissue plays a critical role in toxicokinetics by acting as a storage compartment for lipophilic chemicals and a source of continuous internal exposure as the chemical is released.

To better estimate chemical concentrations in these two toxicologically relevant compartments, NIEHS and EPA scientists and collaborators added brain and adipose tissue compartments to the existing generic PBPK model from EPA’s httk R package (v2.2.2). Concentration–time profiles generated by the model for both hydrophilic and lipophilic chemicals were compared with in vivo data and also with predictions from commercial models. The agreement of the model’s predictions with both commercial model predictions and experimental data indicated that the PBPK model is robust and may be applicable to various aspects of drug development. The project is described in an abstract (Unnikrishnan et al.) accepted for a poster presentation at the 2024 annual meeting of the Society of Toxicology.
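
The role of the two added compartments can be illustrated with a toy model. The sketch below is not the httk model. It is a minimal three-compartment system (blood, flow-limited adipose tissue, permeability-limited brain) with first-order clearance, integrated with a fixed-step fourth-order Runge-Kutta scheme. All parameter values, partition coefficients, and names are assumptions chosen for illustration.

```python
def simulate_pbpk(dose_mg, p_fat, p_brain, hours=24.0, dt=0.01):
    """Return chemical amounts (mg) per compartment after an IV bolus dose.

    p_fat and p_brain are tissue:blood partition coefficients.
    """
    # Illustrative (assumed) parameters, not values taken from httk.
    v_blood, v_fat, v_brain = 5.0, 15.0, 1.4  # compartment volumes, L
    q_fat = 15.0      # blood flow to adipose tissue, L/h
    ps_bbb = 5.0      # blood-brain barrier permeability-surface product, L/h
    clearance = 10.0  # systemic clearance from blood, L/h

    def derivatives(state):
        a_blood, a_fat, a_brain, _ = state
        c_blood = a_blood / v_blood
        # Net transfer is driven by the gap between blood concentration
        # and the tissue concentration scaled by its partition coefficient.
        to_fat = q_fat * (c_blood - a_fat / v_fat / p_fat)
        to_brain = ps_bbb * (c_blood - a_brain / v_brain / p_brain)
        eliminated = clearance * c_blood
        return (-(to_fat + to_brain + eliminated), to_fat, to_brain, eliminated)

    def shifted(state, slope, step):
        return tuple(s + step * k for s, k in zip(state, slope))

    state = (dose_mg, 0.0, 0.0, 0.0)  # blood, fat, brain, eliminated
    for _ in range(round(hours / dt)):
        k1 = derivatives(state)
        k2 = derivatives(shifted(state, k1, dt / 2))
        k3 = derivatives(shifted(state, k2, dt / 2))
        k4 = derivatives(shifted(state, k3, dt))
        state = tuple(
            s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return dict(zip(("blood", "fat", "brain", "eliminated"), state))


# Hypothetical lipophilic and hydrophilic chemicals given the same 100 mg dose.
lipophilic = simulate_pbpk(100.0, p_fat=100.0, p_brain=5.0)
hydrophilic = simulate_pbpk(100.0, p_fat=0.2, p_brain=0.5)
```

With these assumed parameters, the lipophilic chemical is still stored in adipose tissue and present in blood after 24 hours, while the hydrophilic chemical is almost completely eliminated. This mirrors the storage-and-release behavior described above.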

Interpretable chemical grouping using an automated KNIME workflow

With the increased availability of chemical data in public databases, innovative techniques and algorithms have emerged for the analysis, exploration, visualization, and extraction of information from these data. One such technique is chemical grouping, where chemicals with common characteristics are categorized into distinct groups based on physicochemical properties, use, biological activity, or a combination. However, existing tools for chemical grouping often require specialized programming skills or the use of commercial software packages. To address these challenges, NIEHS scientists developed a user-friendly chemical grouping workflow implemented in KNIME, a free, open-source, low-code/no-code data analytics platform. The workflow incorporates a range of processes, including molecular descriptor calculation, feature selection, dimensionality reduction, hyperparameter search, and supervised and unsupervised machine-learning methods, enabling effective chemical grouping and visualization of results. The workflow also has tools for interpretation, identifying key molecular descriptors for the chemical groups, and using natural language summaries to clarify the rationale behind these groupings. The workflow was designed to run seamlessly in both the KNIME local desktop version and KNIME Server WebPortal as a web application. It incorporates interactive interfaces and guides to assist users in a step-by-step manner. The workflow is being implemented as part of the Modeling and Visualization (MoVIZ) Pipeline, which is described in an abstract (Moreira-Filho et al.) accepted for a poster presentation at the 2024 Society of Toxicology meeting.
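
For readers who want a sense of what such a workflow does internally, the sketch below reproduces two of its steps in plain Python: unsupervised grouping of chemicals on standardized descriptors and a crude indication of which descriptor characterizes each group. It is not the KNIME workflow. The k-means algorithm, the farthest-first initialization, the descriptor names, and the data are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Scale each descriptor column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [pstdev(col) or 1.0 for col in columns]  # guard constant columns
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iters=50):
    """Plain k-means with deterministic farthest-first initialization."""
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(sq_dist(r, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda i: sq_dist(r, centers[i])) for r in rows]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for i in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == i]
            if members:
                centers[i] = [mean(col) for col in zip(*members)]
    return labels, centers


def key_descriptor(center, names):
    """Name of the descriptor whose group mean lies farthest from the overall mean."""
    return max(zip(names, center), key=lambda pair: abs(pair[1]))[0]


# Hypothetical descriptors for six chemicals.
names = ["logP", "mol_weight", "h_bond_donors"]
descriptors = [
    [5.1, 350, 0], [5.4, 360, 1], [4.9, 340, 0],
    [0.2, 150, 3], [0.5, 160, 4], [-0.1, 140, 3],
]
labels, centers = kmeans(standardize(descriptors), k=2)
summary = {group: key_descriptor(center, names) for group, center in enumerate(centers)}
```

The group centers are expressed in standardized units, so the largest absolute coordinate gives a rough, model-free hint about which descriptor separates a group from the rest. A full workflow would typically use more robust feature-importance measures.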

Derivation of an adverse outcome pathway linking VEGF and cardiotoxicity

Dysregulation of VEGF and its receptor VEGFR contributes to the development of atherosclerosis and cardiovascular disease. This makes the VEGF pathway a potential target for cardiovascular risk assessment of pharmaceuticals and environmental chemicals. AOPs represent a logical sequence of biological responses that contribute to toxicity phenomena and are useful in informing chemical risk assessments. The advent of high-throughput screening (HTS) has made available large-scale in vitro bioassay data that provides mechanistic information that can help assess chemical toxicity and identify AOP molecular initiating events. This in turn can enable the development of human-relevant new approach methodologies (NAMs) for assessing toxicity without the need for extensive animal experimentation. NIEHS scientists applied AOP frameworks to gain a better understanding of the relationship between VEGFR signaling and the development of atherosclerosis. A data-driven approach was developed to find environmental chemicals linked to the bioactivity of the VEGF signaling pathway, and to investigate their links to other regulatory proteins like estrogen receptor alpha and endpoints like atherosclerosis. ToxCast, Tox21, and PubChem data were evaluated to obtain bioprofiles of 4,165 compounds with bioactivity in assays targeting different VEGFR. An AOP hypothesis was developed by coupling the mechanistic relationships highlighted by HTS data with literature review findings. These linked estrogen, serotonin, and vasopressin receptor targets with VEGFR activity mediated by several endocrine-disrupting chemicals, such as bisphenols, triclosan, dichlorodiphenyltrichloroethane, and polychlorinated biphenyls. Structure-based clustering was performed on relevant bioactive chemicals to evaluate potential molecular initiating events and analyze associations with use-case classes. 
Computational toxicology profiling of in vitro HTS bioassay data facilitates the development of mechanism-driven AOPs and the identification of associated chemical perturbants to better understand the link between environmental chemical exposures and potential adverse cardiovascular outcomes. A paper describing this project is in preparation for submission in 2024.

OrbiTox: a computational translational discovery platform for data mining and read-across

Visualization tools to navigate chemical space have become more important due to the increasing size and diversity of publicly accessible compendiums of high-throughput screening (HTS) and other effects data. OrbiTox uniquely addresses this need by offering an interactive and immersive 3D environment for visualization of millions of chemicals and their known or predicted activity against gene targets along with available animal study data. By organizing activity in “data domains” as concentric circles, the tool facilitates translational discovery by inferring knowledge from connections across multiple data domains. OrbiTox has a rich, user-friendly interface, offering almost instantly refreshing visualizations along with extensive gap-filling capabilities. The repository contains 37 QSAR models for Tox21 assays at 10 µM and 100 µM thresholds, 13 Ames mutagenicity models (including ten bacterial strain-specific assays), and five cardiotoxicity assays. These models use Saagar molecular fingerprints (Sedykh et al. 2021), which provide chemistry-backed reasoning for each prediction. Based on the structural features and motifs responsible for a prediction, the user can hypothesize mechanistic steps for a given chemical or identify favorable or unfavorable chemotypes for a desired property profile and/or prioritize experimental testing. The January 2024 release will include major architecture and feature updates, with QSAR reports generated as PDFs, an additional 29 Saagar features, and the addition of QSAR models for Ames mutagenicity and cardiotoxicity. The June 2024 release will add 304 ToxCast assays, six cardiotoxicity models, carcinogenicity models, and metabolic similarity evaluations for read-across.