https://ntp.niehs.nih.gov/go/n463766

Perspectives on variability and reproducibility of in vivo toxicology studies

Understanding the variability and reproducibility of reference animal data, and how they may affect the evaluation of new approach methodologies (NAMs), is essential to the development, integration, and implementation of NAMs in regulatory decision-making. To better understand these factors, NICEATM and EPA have conducted multiple retrospective evaluations that have shown substantial variability for several standardized in vivo toxicology test methods, including both single-dose (e.g., Karmaus et al. 2022) and repeat-dose (e.g., Pham et al. 2020) study designs.

NICEATM has undertaken a broader assessment of these evaluations to provide a more realistic context for existing data streams and to help set appropriate expectations for the overall performance of NAMs relative to existing in vivo reference data. Additional assessments of the validation status of multiple in vivo guideline studies have also been undertaken. A lack of validation can compromise the robustness and reproducibility of a method and thus contribute to variability within the method. This work was presented in a poster (Oyetade et al.) at the 12th World Congress on Alternatives and Animal Use in the Life Sciences in 2023, and a paper will be submitted for publication in 2024.

An EPA study estimated benchmarks for NAM performance in predicting organ-level effects in repeat-dose studies of adult animals, based on variability in replicate animal studies (Paul Friedman et al. 2023). The study used treatment-related effect values from the Toxicity Reference Database (v2.1) for weight, gross, or histopathological changes in the adrenal gland, liver, kidney, spleen, stomach, and thyroid. In brief, the findings suggest the following:

  • Variance explained by study metadata was similar for organ and study findings.
  • Organ effects were unlikely in a chronic study if no organ findings were observed in a subchronic study.
  • Mean differences in lowest-effect level by exposure duration were similar in size to replicate study variance.
  • For most chemicals, administered equivalent doses derived from in vitro methods were within an order of magnitude of the lowest-effect levels for liver and kidney effects observed in in vivo studies, with larger differences (up to three orders of magnitude) for a smaller number of chemicals.
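
As a rough illustration of what "within an order of magnitude" means in the last finding, the comparison can be expressed as a log10 ratio between an administered equivalent dose (AED) and an in vivo lowest-effect level (LEL). The sketch below is not the method used in the EPA study; the function names and the dose values are hypothetical and chosen only to show the arithmetic, assuming both doses are expressed in the same units (for example, mg/kg/day).

```python
import math


def log10_fold_difference(aed, lel):
    """Return log10(AED / LEL).

    0 means the two doses are identical; +1 or -1 means they differ tenfold.
    Both doses are assumed to be positive and in the same units.
    """
    if aed <= 0 or lel <= 0:
        raise ValueError("doses must be positive")
    return math.log10(aed / lel)


def within_one_order_of_magnitude(aed, lel):
    """True if the AED and the LEL differ by no more than a factor of 10."""
    return abs(log10_fold_difference(aed, lel)) <= 1.0


# Hypothetical (AED, LEL) pairs for illustration only; not data from the study.
examples = {
    "chemical_A": (5.0, 20.0),   # fourfold apart
    "chemical_B": (0.03, 30.0),  # about a thousandfold apart
}

for name, (aed, lel) in examples.items():
    diff = log10_fold_difference(aed, lel)
    flag = within_one_order_of_magnitude(aed, lel)
    print(f"{name}: log10(AED/LEL) = {diff:.2f}, within one order of magnitude: {flag}")
```

On this scale, the hypothetical chemical_A falls inside the one-order-of-magnitude band, while chemical_B corresponds to the larger, roughly three-orders-of-magnitude differences described for a smaller number of chemicals.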