How Evidence-Based Methodologies Can Help Identify and Reduce Uncertainty in Chemical Risk Assessment

Evidence-based methodology, in particular systematic review, is increasingly being applied in environmental, public, and occupational health to increase the transparency, comprehensiveness, and objectivity of the processes by which existing evidence is gathered, assessed, and synthesized in answering research questions. This development is also changing risk assessment practices and will impact the assessment of uncertainties in the evidence for risks to human health that are posed by exposure to chemicals. The potential of evidence-based methodology for characterizing uncertainties in risk assessment has been widely recognized, while its contribution to uncertainty reduction is yet to be fully elucidated. We therefore present some key aspects of the evidence-based approach to risk assessment, showing how they can contribute to the identification and the assessment of uncertainties. We focus on the pre-specification of an assessment methodology in a protocol, comprehensive search strategies, study selection using predefined eligibility criteria, critical appraisal of individual studies, and an evidence integration and uncertainty characterization process based on certainty of evidence frameworks that are well-established in health care research. We also provide examples of uncertainty in risk assessment and discuss how evidence-based methodology could address those. This perspective, which claims to be neither comprehensive nor complete, is intended to stimulate discussion of the topic and to motivate detailed exploration of how evidence-based methodology contributes to characterization of uncertainties, and how it will lead to uncertainty reduction in the conduct of health risk assessment.

Introduction
Risk assessment of chemicals is increasingly stimulating scientific debate and subsequently public interest, especially when discussed controversially. A lack of transparency has been identified as a major reason for such controversies (Schreider et al., 2010). In addition, increased objectivity and scientific rigor in risk assessment methodology have been called for (NRC, 2014). The potential of evidence-based approaches, as exemplified in systematic review methods, to address those shortcomings has been recognized by various institutions, including governmental agencies such as the World Health Organization, the European Food Safety Authority (EFSA), the US Environmental Protection Agency, and other research organizations, e.g., the US National Institute of Environmental Health Sciences and the Evidence-based Toxicology Collaboration. Such approaches have been developed to provide a critical summary of what often complex evidence is actually saying in response to a research question, to render processes more transparent, objective and reproducible, and to move away from tradition-based, subjective and sometimes poorly justified decisions, most of which are associated with an unknown level of uncertainty (Eddy, 2005). Evidence-based methodologies are well established in healthcare, but also in other areas such as social, education and environmental sciences, where they serve as the foundation of evidence-based practice and policy decisions. Application and adaptation of evidence-based methodology to the assessment of health risks of chemicals is being pioneered by a steadily growing community (EFSA, 2010; Hoffmann et al., 2017; Rooney et al., 2014; Stephens et al., 2013, 2016; Thayer et al., 2014; Whaley et al., 2016; Woodruff and Sutton, 2014).
Consideration of uncertainties is an integral element of such assessments. By "uncertainty" we mean a lack of confidence in a knowledge claim, primarily due to other informational deficiencies, which are context- and time-dependent (EFSA Scientific Committee, 2018b).
In 2018, EFSA provided a general and broad list of uncertainty types relevant to risk assessment, to which we refer here to illustrate our perspective (EFSA Scientific Committee, 2018a). The EFSA guidance identified and grouped common sources of uncertainty into those associated with assessment input and those associated with assessment methodology.
Of the eight specific types of uncertainty related to the inputs (EFSA Scientific Committee, 2018a), ambiguity (i.e., the quality of allowing more than one data interpretation), measurement accuracy and precision, missing studies, missing data within studies, and extrapolation uncertainty are addressed here. Sampling uncertainty, assumptions about inputs, and statistical estimates appear less relevant in our context. Extrapolation uncertainty, which subsumes sources of uncertainty such as inter- and intra-species extrapolation, exposure route, and exposure duration, is integral to chemical risk assessment. Depending on the available evidence, the respective uncertainty is typically accounted for in a chemical-specific approach or by default uncertainty (or assessment) factors to determine toxicity reference values of a chemical to be compared to exposure (Dankovic et al., 2015; NRC, 2014).
Of the ten specific uncertainties that EFSA associated with assessment methodology, we address only uncertainties introduced by the methods for processing evidence from the literature and uncertainties associated with expert judgement, especially when integrating frequently heterogeneous and inconsistent evidence (EFSA Scientific Committee, 2018a). The latter type may affect the risk assessment result, such as when it is biased by disregarded evidence, which potentially undermines its acceptance.
In addition to the uncertainties related to evidence and assessment methods, uncertainties related to a specific assessment context need to be considered. For example, a lack of transparency and objectivity or the perception of potential conflicts of interest may create further potential uncertainties and impact on stakeholder trust in the assessment process. This occurred, for example, in the complex controversies surrounding the carcinogenicity assessment of glyphosate (see, e.g., Kogevinas, 2019, including the rapid response by Kabat, 2019, and analysis by Robinson et al., 2020) and the risk assessment of aspartame (Kass and Lodi, 2020; Millstone and Dawson, 2019), which both provide ample illustration of stakeholder concerns in this area.
The assessment of uncertainties, i.e., their identification, prioritization and qualification or quantification, can be conducted in various manners. Uncertainties related to the evidence are often based on generic assumptions, e.g., as specified in (prescriptive) guidance documents. More chemical-specific uncertainties are being explored by employing comprehensive and systematic approaches that also consider methodological aspects (e.g., Schenk and Johanson, 2011; Beck et al., 2016; Bhat et al., 2017; Wikoff et al., 2020a). The potential of evidence-based approaches for the assessment and reduction of uncertainty has been recognized (EFSA Scientific Committee, 2018b; Hoffmann et al., 2017; Wikoff et al., 2020b; Wolffe et al., 2019). For example, in their proposal of an evidence-based risk assessment framework, Wikoff et al. (2020b) reflected upon input-related uncertainty and how evidence-based methods can inform its characterization.
We intend here to further stimulate exploration of how evidence-based methodology, focusing particularly on the use of systematic review methods as a component of evidence-based toxicology (EBT) approaches, can contribute to uncertainty assessment. Because uncertainties in risk assessment are manifold, we select those that in our opinion are particularly interesting or instructive and do not claim to present a comprehensive or complete view of whether or how EBT can resolve all uncertainties.

Evidence-based approaches and uncertainty
Guided by the systematic review steps described by Hoffmann et al. (2017), we discuss how evidence-based approaches to specific risk assessment tasks can help identify and make apparent various sources of uncertainty. This may eventually lead to the reduction of important uncertainties relating to risk assessment, and improved qualification and characterization of the confidence in the evidence that supports risk assessment conclusions and subsequent risk management intervention.

Protocol
The pre-specification of the methodology to conduct research is pivotal to ensure transparency of the process and to reduce bias (Whaley et al., 2020a). This is a required step in the systematic review process. In chemical risk assessment, comprehensive documentation of a thorough planning process, issued in the form of a protocol, allows for decisions to be planned in advance of seeing the evidence in detail. This helps reassure stakeholders that the results of a systematic review (SR) are derived from pre-specified methods, rather than from methods determined ad hoc, which are at risk of being fitted around expectations and opinions. Since some foreknowledge of data is inevitable on the part of domain experts involved in the review, protocols can minimize the risk that bias is introduced, e.g., by approaches favoring conclusions in line with the experts' expectations.
How does this work? An openly accessible protocol that is registered in advance of study conduct allows comparison of the applied methodology with what was originally planned. Examples of systematic review protocols for chemical risk assessment can increasingly be found in the peer-reviewed literature (e.g., Matta et al., 2019; van Luijk et al., 2019). The use of such protocols discourages ad hoc decision-making that may lead to bias in a review, for example the omission of studies based on their results or changing the data analysis plan in such a way that it generates more favorable or interesting results. Conducting risk assessment using a protocol helps researchers to reflect on the potential bias introduced by a deviation and prompts them to provide a justification for amendments. In this way, uncertainty related to the review methodology can be made transparent, assessed, and possibly be reduced. A lack of transparency in inhalation toxicity reference values has, for example, been identified for chemicals regulated in the EU, leaving sometimes substantial differences in values derived by governmental agencies and industry unexplained (Schenk et al., 2015). More clarity on the study methods prior to conduct of the research may help address this.
Registration and publication of the protocol also allows external review of methods to be applied, helping to prevent critical errors in methods before a study is conducted. It informs stakeholders and enables them to provide feedback on methodology, potentially increasing the trust in the evidence and the decisions made based on it, by those affected by its findings. This early involvement of stakeholders potentially increases their engagement and consent with the risk assessment approach and subsequently its results (EFSA, 2015). In this way, the likelihood of post-hoc criticism, which may make the risk assessment appear less trustworthy, can potentially be lowered. In terms of the EFSA uncertainties, the development and publication of protocols prior to conduct of assessment has most impact on the assessment methodology uncertainties related to ambiguity, to the process for dealing with evidence from the literature, and to expert judgement. This is achieved by defining the processes for dealing with evidence, including the use of expert opinion, so that they are made transparent, justified, and improved upon in advance of conducting the assessment (e.g., EFSA et al., 2017).

Search for the evidence
A common criticism of chemical risk assessments is that not all available and pertinent evidence has been identified (Chvátalová, 2019; Deveau et al., 2015; Rudén, 2001; Schenk, 2010). The main reason for missing critical evidence is the way it is searched for, which may be influenced by subjectivity, as well as time and resources available for conducting the risk assessment. In addition, relevant evidence may be hard to find, particularly when not available in readily searchable databases. Such sources are commonly referred to as "gray literature" and include, inter alia, government reports, regulatory databases and conference proceedings. Efforts are underway to bring this evidence into the public domain and incorporate it into systematic reviews, building upon experience with gray literature in healthcare research (Mahood et al., 2014). EFSA considers the uncertainty associated with missing evidence as a methodology issue (EFSA Scientific Committee, 2018b). The potential for it can be limited by the use of systematic search strategies that comprehensively cover relevant literature sources and terms used for retrieving literature from those sources (Whaley et al., 2020b). While literature databases such as PubMed, Web of Science, Embase and BIOSIS Previews are standard sources, the identification of relevant sources of gray literature usually requires domain and information retrieval expertise. In this way, the risk of missing relevant evidence, either due to the search approach or, worse, by selective choices, which possibly results in an evidence base that can be biased in either direction, can be reduced (Gusenbauer and Haddaway, 2021). While it can be challenging to implement a comprehensive search, e.g., due to the amount of evidence retrieved, demonstrating that (almost) all potentially relevant evidence has been identified and considered should increase trust in the findings of the risk assessment.
Comprehensive search strategies are fundamental for evidence-based approaches and systematic reviews (Hoffmann et al., 2017), which have the additional advantage of being readily reproducible, amendable, and updateable. Such searches help address the input-related uncertainties of missing studies.

Selection of the evidence
Any search approach that is more comprehensive and objective than an expert-driven selection of to-be-considered studies will most likely retrieve studies that are not relevant for the risk assessment. If a comprehensive search strategy is conducted, the number of studies ultimately excluded will in most, if not all, cases be substantially higher than the number included (see, e.g., Dorman et al., 2018; Lam et al., 2014). Therefore, a selection process is needed. In systematic reviews, an evidence-based, i.e., transparent, a priori defined and reproducible, two-tier approach using eligibility criteria for inclusion and exclusion of studies is applied. The predefined criteria describe primarily the population and intervention/exposure of the underlying risk assessment question and are applied in a consistent manner. Text mining and machine learning tools are becoming increasingly available to assist the evidence selection (e.g., Marshall and Wallace, 2019; Howard et al., 2020).
In the first tier, studies retrieved through the search are reduced by excluding those that can be unambiguously identified as irrelevant based on their title and/or abstract. In the second tier, the remaining studies are assessed in detail for their relevance based on the full texts. The selection process, often using appropriate web-based software tools, is usually summarized using the PRISMA statement flow chart (Moher et al., 2015), while the details, e.g., the reasons for exclusion, are also documented, so that they are available if needed. The transparency established in this way reduces uncertainty caused by excluding relevant or including irrelevant evidence based on selective choices, which is one of the issues in the glyphosate controversy, where the studies/evidence considered in the three evaluations (IARC, Monsanto, EPA) differed (Benbrook, 2019), resulting in opposing conclusions being drawn. In particular, it provides a rationale why evidence that others might consider relevant was excluded. This opens the opportunity to systematically explore the effect of adjustments to the eligibility criteria on the risk assessment outcome. Ultimately, a transparent selection process increases trust because it insulates an assessment from criticism for being selective (see, e.g., Robinson et al., 2020). In this way, the definition of explicit eligibility criteria, and the screening of potentially relevant studies against those criteria, helps address uncertainties associated with missing studies and the process for dealing with evidence from the literature.

Critical appraisal of the individual studies
Once the relevant evidence has been selected, the next step in a chemical risk assessment is to understand the uncertainty inherent in the individual studies to determine how informative each study is with regard to developing the assessment findings. There are several sources of such uncertainty, comprising the level of detail in the reporting of the research, the internal validity of the study (i.e., the extent to which a study minimizes systematic errors or biases), the external validity of the study (i.e., the extent to which the findings of a study can be generalized to other circumstances) as well as other aspects, such as the appropriateness of statistics, conflict of interest, and study sensitivity (Cooper et al., 2016; Rooney et al., 2016; Samuel et al., 2016). The most fundamental issue is the reporting quality of studies. If the reporting of studies is incomplete or unclear, it becomes difficult to assess the other uncertainty sources (Schenk et al., 2015). Poor reporting especially affects the assessment of internal study validity, not least as authors, but also reviewers, are often not aware of study design, conduct and analysis details that relate to internal validity. Contacting authors for unreported information may resolve the issue but is likely to fail in many cases for several reasons, including failing to obtain contact details, poor response rates, time-constraints of the assessment, and incomplete or inappropriate record keeping (see, e.g., supplementary data of Li et al., 2020). In the recent WHO/ILO (International Labour Organization) systematic reviews to support the estimation of global burden of disease from occupational environmental exposures, one third of requests for missing data were not met (Pega et al., 2021).
The internal validity of a study concerns the extent to which its results are subject to systematic error, either over- or underestimating the true effect. The concept was originally developed for randomized controlled trials (RCT) of health care interventions, for which empirical evidence demonstrating the effects of biases is available (Berkman et al., 2014; Schulz et al., 1995). While it is reasonable to assume that some biases relevant for RCT may also be relevant for the type of studies used in chemical risk assessment, empirical evidence supporting the adaptation is still scarce, as, e.g., observed by Bero et al. (2018) for observational exposure studies. Assessing internal study validity is further complicated by the fact that biases to be considered depend on the types of studies encountered for chemical risk assessment, comprising observational, in vivo and in vitro studies, and by the fact that broader agreement on approaches and tools still needs to emerge (Lynch et al., 2016; Samuel et al., 2016).
While systematic review methodology cannot on its own fix experimental issues with studies, a pre-defined and systematic approach to critical study appraisal results in a consistent and traceable assessment of each included study. In this way, it reduces uncertainty by providing a thorough and transparent quality assessment of all included evidence pieces that supports their interpretation and integration, thus avoiding subjective or inappropriate approaches. In addition, it reduces the likelihood of misleading the risk assessment by failing to identify biases in studies, e.g., when reasons for possible over- or under-estimation in studies go unnoticed. For example, Wikoff et al. (2018) found in a case study on trichloroethylene that the only oral exposure study that showed an effect also had the strongest risk of bias concern. They further stated that this study was considered in the regulatory setting of reference values. Assuming that the biases in that study resulted in an overestimation of the effect, the confidence in the risk assessment outcome is decreased by uncertainties introduced by a flawed assessment of the individual study quality. While the interpretation of the evidence in this case study is certainly a matter of expert debate, the systematic assessment approach permits such debate by providing a well-defined and transparent basis for it.
Assessing the external validity of the pieces of evidence supports the elucidation of their relevance to the research question. Depending on the study quality assessment approach and tool, this aspect may be analyzed on the level of the individual study (NTP, 2015; see, e.g., Wikoff et al., 2019) or when considering the body of evidence. Examples of aspects to be considered for external validity are the relevance of various administration routes in experimental animal studies for the human exposure covered by the risk assessment, or whether the mode of action underlying the observed toxicological effects is relevant for humans, e.g., in terms of species differences. In many chemical risk assessment frameworks, such considerations are addressed in a weight-of-evidence (WoE) process when assessing all evidence, as elaborated below.
Critical appraisal of individual studies can help address uncertainty in assessment inputs, especially accuracy and precision of measures and missing data within studies, as well as with assessment methodology-related uncertainty of expert judgement.

Evidence integration
A broad spectrum of additional uncertainties becomes apparent when the evidence for a chemical risk assessment is considered across studies. At this stage, inconsistencies between individual studies, strains, species or cell types, the human relevance of non-human studies, and the characterization of effects in terms of size, precision, severity/adversity, reversibility, and plausibility are assessed (see, e.g., Rhomberg et al., 2013). This process is often termed WoE and used in many chemical risk assessment contexts and regulations. As definitions and descriptions of WoE differ, so do the methodology applied and the processes (Martin et al., 2018; OECD, 2019). While some WoE approaches are well-characterized, others are less specific and allow for flexibility, which may lead to substantial differences in chemical risk assessments conducted in different regulatory contexts, even when based on the same evidence (Hassauer and Roosen, 2020).
The adoption and adaptation of evidence-based methodology to chemical risk assessments has, at the level of the body of evidence to be integrated, led to exploring the application of the Grading of Recommendations Assessment, Development and Evaluation (GRADE) working group approach (Cano-Sancho et al., 2019; Koustas et al., 2014; NTP, 2016; Yost et al., 2019). The GRADE working group has developed an approach to rating certainty in the body of evidence, originally developed for health care questions, that considers several quality-determining factors: five potentially downgrading (risk of bias, inconsistency, indirectness, imprecision, and publication bias) and three potentially upgrading (large effect size, plausible confounders likely to reduce effect size, and dose-response gradient) (Guyatt et al., 2008; Schünemann et al., 2013). A pre-defined level of certainty in a body of evidence is then modified by applying each of the factors, resulting in an overall certainty rating that can be considered as a categorical integration of the uncertainties addressed by the various factors.
It has been suggested that these factors also should be applicable to the type of evidence encountered for chemical risk assessments, while acknowledging that additional guidance still needs to be developed (Morgan et al., 2016). Some GRADE factors can be associated with uncertainty sources related to the evidence, e.g., interspecies differences would fall under indirectness, unexplained intra-species differences would be covered by inconsistency, and study quality limitations would at least partly be addressed under risk-of-bias assessment. Others, especially publication bias, i.e., systematic differences between the findings of published and unpublished research, and possibly related biases, such as multiple publication and citation bias, are typically not considered in chemical risk assessment, so that their roles and impact remain largely to be explored. Likewise, it needs to be explored if all sources of uncertainty considered when integrating the evidence in a chemical risk assessment can be accommodated in the GRADE framework.
Some of these challenges are addressed, for example, in the GRADE Environmental Health Project Group (Morgan et al., 2019), which explores how the concept of biological plausibility links to external validity and indirectness and how GRADE applies in the context of development of adverse outcome pathways (AOP) (De Vries et al., 2021; Whaley et al., 2022), a pragmatic, chemical-agnostic framework to describe our knowledge of causally linked events at different levels of biological organization that lead to adverse health effects. As a WoE approach to assess the causality in AOP, and also chemical-specific mode of action, is often based on modified Bradford Hill considerations (Meek et al., 2014), a discussion is taking place on how the two approaches relate to each other (De Vries et al., 2021; Hoffmann et al., 2022). It is important to note that the GRADE framework itself has roots in Bradford Hill criteria and has been further developed, operationalized, and validated for healthcare research over the last decades (Schünemann et al., 2011).
Incorporation of a GRADE-like approach in chemical risk assessment would facilitate consistent, transparent and comprehensive assessment of various important and frequently encountered sources of uncertainty. In addition, it could contribute to reducing methodological and contextual uncertainties due to different risk assessment outcomes when considering the same evidence, as, e.g., contributed to differences in the setting of an occupational exposure level for n-methyl-2-pyrrolidone in the European Union (RAC and SCOEL, 2016). While the use of GRADE does not eliminate uncertainty, it can help considerably in the characterization of a range of uncertainties across assessment inputs and methodology, especially the accuracy and precision of measured data, missing studies, extrapolation uncertainty, the process of dealing with evidence from the literature, and expert judgement.

Conclusions
Application and adaptation of evidence-based methodology, culminating in systematic reviews, to chemical risk assessment is being pioneered by a steadily growing community. This is starting to change risk assessment practices, including the assessment of uncertainties related to the data. While this impact on uncertainty assessment has been acknowledged (EFSA Scientific Committee, 2018b; Hoffmann et al., 2017; Wikoff et al., 2020b; Wolffe et al., 2019), detailed evaluations of this connection are still to be conducted. As a first step to further instruct such evaluations, we here present some considerations of how evidence-based methodology, in the form of systematic review methods, links and can contribute to the assessment of uncertainties related to the evidence, the methodology, and the context of a chemical risk assessment. We also provide examples of issues encountered in chemical risk assessment due to uncertainty and lack of confidence and explain how evidence-based methodology can make the sources of uncertainty apparent and reduce, and possibly avoid, the associated uncertainty. Evidence-based methodology holds potential to inform all three types of uncertainty, while methodological and contextual uncertainty can be reduced via the pre-definition of the assessment methodology in a protocol, a comprehensive search strategy, study selection using predefined eligibility criteria, critical appraisal of individual studies, and an evidence integration process adopting and adapting an operationalized framework (Tab. 1).
We note that our perspective is not complete. For example, we did not address the potential role that the framing of the risk assessment questions, in particular following the PECO (population, exposure, comparator, outcome) framework, could play (EFSA, 2010; Morgan et al., 2018); rather, this commentary is intended as a starting point to explore the topic in greater detail to identify the opportunities it presents and also the limitations that it may entail. As the community of early adopters continues to work on the development of evidence-based methodology for chemical risk assessment questions, the Evidence-Based Toxicology Collaboration (EBTC) provides a platform for exchange and collaboration for all stakeholders to harmonize approaches and increase impact.