https://ntp.niehs.nih.gov/go/n463671

Interpretable chemical grouping using an automated KNIME workflow

With the increased availability of chemical data in public databases, innovative techniques and algorithms have emerged for analyzing, exploring, visualizing, and extracting information from these data. One such technique is chemical grouping, in which chemicals with common characteristics are sorted into distinct groups based on physicochemical properties, use, biological activity, or a combination of these. However, existing tools for chemical grouping often require specialized programming skills or commercial software packages. To address these challenges, NIEHS scientists developed a user-friendly chemical grouping workflow implemented in KNIME, a free, open-source, low-code/no-code data analytics platform. The workflow combines molecular descriptor calculation, feature selection, dimensionality reduction, and hyperparameter search with supervised and unsupervised machine learning methods to group chemicals and visualize the results. It also includes interpretation tools that identify the key molecular descriptors for each chemical group and generate natural language summaries explaining the rationale behind the groupings. The workflow was designed to run in both the local KNIME desktop application and, as a web application, the KNIME Server WebPortal. Interactive interfaces and guides walk users through each step. The workflow is being implemented as part of the Modeling and Visualization (MoVIZ) Pipeline, which is described in an abstract (Moreira-Filho et al.) accepted for a poster presentation at the 2024 Society of Toxicology meeting.
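
To make the grouping steps concrete, the following is a minimal Python sketch of the general idea. It is not the NIEHS KNIME workflow and does not reproduce its nodes or settings. It assumes a hypothetical six-chemical table with approximate, illustrative descriptor values, uses zero-variance filtering as a stand-in for feature selection and plain k-means as the unsupervised method, and omits dimensionality reduction, hyperparameter search, and supervised learning. All function and variable names are invented for this example.

```python
from statistics import mean, pstdev

# Hypothetical toy data: approximate descriptor values, for illustration only.
DESCRIPTORS = ["mol_weight", "logP", "h_donors", "constant_flag"]
CHEMICALS = {
    "ethanol":  [46.07, -0.31, 1, 0],
    "methanol": [32.04, -0.77, 1, 0],
    "propanol": [60.10, 0.25, 1, 0],
    "benzene":  [78.11, 2.13, 0, 0],
    "toluene":  [92.14, 2.73, 0, 0],
    "xylene":   [106.17, 3.15, 0, 0],
}

def select_features(rows, names):
    """Drop descriptors with zero variance (a very simple feature selection)."""
    keep = [j for j in range(len(names)) if pstdev([r[j] for r in rows]) > 0]
    return [[r[j] for j in keep] for r in rows], [names[j] for j in keep]

def standardize(rows):
    """Z-score each column so descriptors are on a comparable scale."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, max_iter=100):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(sq_dist(r, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda i: sq_dist(r, centers[i])) for r in rows]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for i in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == i]
            if members:
                centers[i] = [mean(c) for c in zip(*members)]
    return labels, centers

def summarize(names, chem_names, labels, centers):
    """Name the descriptor whose group mean deviates most from the dataset mean."""
    lines = []
    for i, center in enumerate(centers):
        j = max(range(len(center)), key=lambda idx: abs(center[idx]))
        direction = "above" if center[j] > 0 else "below"
        members = [n for n, lab in zip(chem_names, labels) if lab == i]
        lines.append(
            f"Group {i} ({', '.join(members)}): {names[j]} is {direction} the dataset average"
        )
    return lines

chem_names = list(CHEMICALS)
selected, kept = select_features(list(CHEMICALS.values()), DESCRIPTORS)
labels, centers = kmeans(standardize(selected), k=2)
summary = summarize(kept, chem_names, labels, centers)
for line in summary:
    print(line)
```

On this toy table the sketch separates the three alcohols from the three aromatic hydrocarbons and reports the hydrogen-bond donor count as the descriptor that best distinguishes each group, a simple analogue of the key-descriptor and plain-language summary steps described above.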