Contains Nonbinding Recommendations

Assessing the Credibility of Computational Modeling and Simulation in Medical Device Submissions

Guidance for Industry and Food and Drug Administration Staff

Document issued on November 17, 2023. The draft of this document was issued on December 23, 2021.

For questions about this document, contact the Office of Science and Engineering Laboratories (OSEL) by email at OSEL_CDRH@fda.hhs.gov or at (301) 796-2530, or Pras Pathmanathan at (301) 796-3490 or by email at pras.pathmanathan@fda.hhs.gov.

U.S. Department of Health and Human Services
Food and Drug Administration
Center for Devices and Radiological Health

Preface

Public Comment

You may submit electronic comments and suggestions at any time for Agency consideration to https://www.regulations.gov. Submit written comments to the Dockets Management Staff, Food and Drug Administration, 5630 Fishers Lane, Room 1061, (HFA-305), Rockville, MD 20852-1740. Identify all comments with the docket number FDA-2021-D-0980. Comments may not be acted upon by the Agency until the document is next revised or updated.

Additional Copies

Additional copies are available from the Internet. You may also send an email request to CDRH-Guidance@fda.hhs.gov to receive a copy of the guidance. Please include the document number GUI01500056 and the complete title of the guidance in the request.

Table of Contents

I. Introduction
II. Background
III. Scope
IV. Definitions
V. Generalized Framework for Assessing Credibility of Computational Modeling in a Regulatory Submission
VI. Key Concepts for Assessing Credibility of Computational Modeling in a Regulatory Submission
   A. Preliminary steps
      (1) Question of Interest
      (2) Context of use (COU)
      (3) Model risk
   B. Credibility Evidence
      (1) Code verification results
      (2) Model calibration evidence
      (3) Bench test validation results
      (4) In vivo validation results
      (5) Population-based validation results
      (6) Emergent model behavior
      (7) Model plausibility
      (8) Calculation verification/UQ results using COU simulations
   C. Credibility Factors and Credibility Goals
   D. Adequacy Assessment
Appendix 1. Considerations for Each Credibility Evidence Category
Appendix 2. Reporting Recommendations for CM&S Credibility Assessment in Medical Device Submissions

Assessing the Credibility of Computational Modeling and Simulation in Medical Device Submissions

Guidance for Industry and Food and Drug Administration Staff

This guidance represents the current thinking of the Food and Drug Administration (FDA or Agency) on this topic. It does not establish any rights for any person and is not binding on FDA or the public. You can use an alternative approach if it satisfies the requirements of the applicable statutes and regulations. To discuss an alternative approach, contact the FDA staff or Office responsible for this guidance as listed on the title page.

I. Introduction

FDA has developed this guidance document to assist industry and FDA staff in assessing the credibility of computational modeling used to support medical device premarket submissions (i.e., Premarket Approval (PMA) Applications, Humanitarian Device Exemption (HDE) Applications, Investigational Device Exemption (IDE) Applications, Premarket Notifications (510(k)s), and De Novo classification requests) or qualification of Medical Device Development Tools (MDDTs). In the context of this guidance, credibility is defined as the trust in the predictive capability of a computational model. Computational models can be used in a variety of ways in medical device regulatory submissions, including to perform 'in silico' device testing or to influence algorithms within software embedded in a device. Regulatory submissions often lack a clear rationale for why models can be considered credible for the context of use (COU). This guidance provides a general risk-informed framework that can be used in the credibility assessment of computational modeling and simulation (CM&S) used in medical device regulatory submissions. For the purposes of this guidance, CM&S refers to first principles-based (e.g., physics-based or mechanistic) computational models, and not statistical or data-driven (e.g., machine learning or artificial intelligence-based) models.
This guidance is intended to help improve the consistency and transparency of the review of CM&S, to increase confidence in the use of CM&S in regulatory submissions, and to facilitate improved interpretation of CM&S credibility evidence submitted in regulatory submissions reviewed by FDA staff. Throughout this guidance, the terms "FDA," "the Agency," "we," and "us" refer to the Food and Drug Administration, and the terms "you" and "yours" refer to medical device manufacturers.

For the current edition of the FDA-recognized consensus standard(s) referenced in this document, see the FDA Recognized Consensus Standards Database.1 For more information regarding use of consensus standards in regulatory submissions, please refer to the FDA guidance titled "Appropriate Use of Voluntary Consensus Standards in Premarket Submissions for Medical Devices."2

1 https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfStandards/search.cfm
2 https://www.fda.gov/regulatory-information/search-fda-guidance-documents/appropriate-use-voluntary-consensus-standards-premarket-submissions-medical-devices

In general, FDA's guidance documents do not establish legally enforceable responsibilities. Instead, guidances describe the Agency's current thinking on a topic and should be viewed only as recommendations, unless specific regulatory or statutory requirements are cited. The use of the word "should" in Agency guidances means that something is suggested or recommended, but not required.

II. Background

The use of CM&S (also referred to as in silico methods) in regulatory submissions is well-established and increasing.3 CM&S of medical devices can streamline development and reduce burdens associated with premarket device evaluation. It can also reveal important information not available from traditional in vivo or in vitro assessments, such as serious and unexpected adverse events that are undetectable within a study sample but occur frequently enough within the intended population to be of concern. As interest in medical device-related CM&S grows, it will be important to both monitor current usage and identify areas where CM&S might be more broadly leveraged to enhance public health. Use of CM&S to support regulatory submissions necessitates the development of processes and approaches that promote consistency and transparency in the way CM&S is conducted and reviewed.

3 Morrison T, Pathmanathan P, Adwan M, and Margerrison E. Advancing Regulatory Science With Computational Modeling for Medical Devices at the FDA's Office of Science and Engineering Laboratories. Frontiers in Medicine, vol. 5, p. 241, 2018.

There are several ways that CM&S can potentially be used to support a regulatory submission, including but not limited to:

1. In silico device testing. Computational models that simulate medical devices can be used to generate information supporting device safety and/or effectiveness (e.g., in silico durability assessment of an implantable stent). Computational models of the device can also be coupled to computational patient models to simulate device performance under representative in vivo conditions (e.g., computational electromagnetic models to predict energy absorption of metallic implants). Another possibility is that the physical device itself is tested on an in silico patient model, for example hardware-in-the-loop testing of a physiological closed loop control device, where the therapy actuated by the controller is converted into an input to the patient model, and the patient model response is converted into a signal passed back to the controller.4

4 Parvinian B, Scully C, Wiyor H, Kumar A, and Weininger S. Regulatory Considerations for Physiological Closed-Loop Controlled Medical Devices Used for Automated Critical Care: Food and Drug Administration Workshop Discussion Topics. Anesth Analg., vol. 126(6), p. 1, 2018.
2. CM&S used within medical device software. Computational modeling may be implemented as device software functions,5 for example, device software functions that use patient data as inputs to a physics-based computational model to estimate clinical biomarkers such as fractional flow reserve, or device software functions that simulate patient response during surgery for preoperative planning.

5 A device software function is a software function that meets the definition of device in section 201(h) of the Federal Food, Drug, and Cosmetic Act. See also https://www.fda.gov/medical-devices/digital-health-center-excellence/device-software-functions-including-mobile-medical-applications

3. In silico clinical trials. In silico clinical trials are an emerging application of CM&S in which device safety and/or effectiveness is evaluated using a 'virtual cohort' of simulated patients with anatomical and physiological variability representing the indicated patient population. In silico clinical trials have a range of possible applications, including but not limited to: augmenting or reducing the size of a real-world clinical trial,6 providing improved inclusion-exclusion criteria, or investigating a device safety concern for which a real-world clinical trial would be unethical.

6 Haddad T, Himes A, Thompson L, Irony T, Nair R, and MDIC Working Group Participants. Incorporation of stochastic engineering models as prior information in Bayesian medical device trials. J. Biopharm Stat, vol. 27(6), pp. 1089-1103, 2017.

4. CM&S-based qualified tools. CM&S-based tools for developing or evaluating a medical device can be submitted to CDRH as a proposal to be considered by FDA for the Medical Device Development Tools (MDDT) Program7 as a non-clinical assessment model (NAM) for predicting device safety, effectiveness, or performance (refer to FDA's guidance titled "Qualification of Medical Device Development Tools"8).

7 https://www.fda.gov/medical-devices/science-and-research-medical-devices/medical-device-development-tools-mddt
8 https://www.fda.gov/regulatory-information/search-fda-guidance-documents/qualification-medical-device-development-tools

In all cases, there is a need to demonstrate that the CM&S is credible. For in silico device testing and in silico clinical trials, final simulation results should be submitted to FDA with supporting credibility evidence so that FDA can assess the credibility of those simulation results. For CM&S in medical device software and MDDTs, example simulation results should be submitted to FDA with supporting credibility evidence so that FDA can assess whether future simulations (to be performed post-market or post-tool qualification) are expected to be credible.

Methodologies for model credibility assessment have been established in the scientific literature9,10 and continue to evolve. Demonstrating model credibility involves various activities that include verification, validation, uncertainty quantification, applicability analysis, as well as adequacy assessment (see Section IV for definitions).

9 Oberkampf WL and Roy CJ. Verification and Validation in Scientific Computing. Cambridge University Press, 2010.
10 Roache PJ. Fundamentals of Verification and Validation. Hermosa Publishers, 2009.
The FDA-recognized standard American Society of Mechanical Engineers (ASME) V&V 40 Assessing Credibility of Computational Modeling through Verification and Validation: Application to Medical Devices provides a risk-informed framework for assessing verification, validation, and uncertainty quantification (VVUQ) activities for computational modeling of medical devices. However, most of the validation activities defined in ASME V&V 40 assume the ability to perform well-controlled bench testing to provide data against which simulation results are evaluated, henceforth referred to as 'traditional validation evidence.' The possibility of using other, non-traditional sources of evidence (e.g., clinical studies, robust model calibration results, or population-based validation results), which may be less controlled but closer to the model context of use, is not explicitly covered in ASME V&V 40-2018, although recent work has considered how to apply ASME V&V 40 to patient-specific computational models.11 This guidance uses key concepts of ASME V&V 40-2018 but provides a more general framework for demonstrating CM&S credibility in medical device regulatory submissions that incorporates different categories of credibility evidence.

11 Galappaththige S, Gray R, Costa C, Niederer S, and Pathmanathan P. Credibility Assessment of Patient-Specific Computational Modeling using Patient-Specific Cardiac Models as an Exemplar. PLOS Computational Biology, 2022.

III. Scope

The purpose of this guidance document is to provide a general risk-informed framework for assessing CM&S credibility in medical device regulatory submissions that incorporate traditional validation evidence, other types of supporting data, or both. This guidance document is applicable to first principles-based models (e.g., physics-based or mechanistic models), such as models commonly used in electromagnetics, optics, fluid dynamics, heat and mass transfer, solid mechanics, acoustics, and ultrasonics, as well as mechanistic models of physiological processes. This guidance is not intended to apply to standalone statistical or data-driven models such as standalone regression, machine learning, or artificial intelligence-based models. We recognize that there is no clear delineation between first principles and statistical/data-driven models, and that hybrid models using both methods are possible. For hybrid models, this guidance is intended to apply to the first-principles aspects of the hybrid model only; additional considerations for evaluating statistical/data-driven model aspects are not addressed in this guidance. For information on appropriate evidence to submit for a statistical/data-driven model, including machine learning or artificial intelligence-based models, we recommend manufacturers seek feedback on their specific device through the Q-submission process (refer to FDA's guidance titled "Requests for Feedback and Meetings for Medical Device Submissions: The Q-Submission Program"12). Models that do not involve any simulation, such as purely anatomical models, are not in scope of this guidance.

12 https://www.fda.gov/regulatory-information/search-fda-guidance-documents/requests-feedback-and-meetings-medical-device-submissions-q-submission-program

This guidance document provides recommendations for both planning and reporting model credibility assessment activities.
This guidance document does not address methodologies for how to perform modeling studies or technical details for how to gather evidence to support credibility assessment, nor does it provide recommendations concerning the specific level of credibility needed to support regulatory submissions. This guidance is not intended to provide a comprehensive checklist for all CM&S information to support a regulatory submission. Instead, this guidance provides a general framework for how to assess CM&S credibility to support regulatory submissions, and identifies factors a manufacturer should consider when submitting CM&S credibility evidence. We recommend that manufacturers seek feedback on their specific use of CM&S through the Q-submission process. Where applicable, other device-specific guidance documents and FDA-recognized standards that include CM&S recommendations may be used in combination with this guidance document. Also, while the general framework is expected to be applicable to in silico clinical trials, this is an emerging methodology for which best practices are still being developed, and this guidance does not provide specific recommendations for generating virtual cohorts or executing an in silico clinical trial.

IV. Definitions

The definitions listed here are for the purposes of this guidance document and are intended for use in the context of assessing CM&S credibility.

Adequacy assessment: for a given context of use (COU), the process of evaluating the credibility evidence in support of a computational model, together with any other relevant information, possibly including results from the COU simulations, and making a determination on whether the evidence is sufficient considering the model risk. See also prospective adequacy assessment and post-study adequacy assessment.

Applicability: "the relevance of the validation activities to support the use of the computational model for a context of use"13

Calculation verification (also called solution verification): "the process of determining the solution accuracy of a calculation"14

Code verification: "the process of identifying errors in the numerical algorithms of a computer code"15

Comparator: the test data that are used for validation, which may include data from bench testing,16 in vivo studies, or other empirical data17

13 Reprinted with permission of The American Society of Mechanical Engineers from ASME V&V 40-2018 Assessing Credibility of Computational Modeling through Verification and Validation: Application to Medical Devices, copyright ASME, Two Park Avenue, New York, NY 10016-5990. All rights reserved. No further copies can be made without written permission from ASME. Permission is for this edition only. A copy of the complete standard may be obtained from ASME, www.asme.org.
14 Reprinted with permission from ASME V&V 40-2018.
15 Reprinted with permission from ASME V&V 40-2018.
16 https://www.fda.gov/regulatory-information/search-fda-guidance-documents/recommended-content-and-format-non-clinical-bench-performance-testing-information-premarket
17 Adapted with permission from ASME V&V 40-2018.
Computational model: "the numerical implementation of the mathematical model performed by means of a computer"18

Context of use (COU): "a statement that defines the specific role and scope of the computational model used to address the question of interest"19

COU simulations: simulations performed to address the question of interest

Credibility: "the trust, established through the collection of evidence, in the predictive capability of a computational model for a context of use"20

Credibility evidence: any evidence that could support the credibility of a computational model

Credibility factors: "elements of the process used to establish the credibility of the computational model for a COU"21

Decision consequence: the significance of an adverse outcome resulting from an incorrect decision concerning the question of interest22

Mathematical model: "the mathematical equations, boundary conditions, initial conditions, and modeling data needed to describe a conceptual model"23

Model influence: the contribution of the computational model relative to other contributing evidence in addressing the question of interest (e.g., data from bench testing)24

Model risk: "the possibility that the computational model and the simulation results may lead to an incorrect decision that would lead to an adverse outcome"25

Post-study adequacy assessment: adequacy assessment performed after executing planned credibility assessment activities, and potentially also after conducting the COU simulations, using results from these activities and any other relevant information

Prospective adequacy assessment: adequacy assessment performed before executing planned credibility assessment activities, using selected credibility goals and any other relevant information

Quantity of interest: "the calculated or measured result from a computational model or comparator, respectively"26

Question of interest: "the specific question, decision, or concern that is being addressed"27

Solution verification: see calculation verification

Uncertainty quantification: the process of generating and applying mathematical models to provide a measure of uncertainty in the empirical data or simulation results28

Validation: "the process of determining the degree to which a model or a simulation is an accurate representation of the real world"29

Verification: "the process of determining that a computational model accurately represents the underlying mathematical model and its solution from the perspective of the intended uses of modeling and simulation."30 Code verification and calculation verification are two elements of verification.

Note that the terms 'verification' and 'validation' have a variety of meanings in the context of medical device regulation. The above definitions specifically refer to verification and validation of a computational model.

18 Reprinted with permission from ASME V&V 40-2018.
19 Reprinted with permission from ASME V&V 40-2018.
20 Reprinted with permission from ASME V&V 40-2018.
21 Reprinted with permission from ASME V&V 40-2018.
22 Adapted with permission from ASME V&V 40-2018.
23 Reprinted with permission from ASME V&V 40-2018.
24 Adapted with permission from ASME V&V 40-2018.
25 Reprinted with permission from ASME V&V 40-2018.
26 Reprinted with permission from ASME V&V 40-2018.
27 Reprinted with permission from ASME V&V 40-2018.
28 Reprinted with permission of The American Society of Mechanical Engineers from ASME VVUQ1-2022 Verification, Validation, and Uncertainty Quantification Terminology in Computational Modeling and Simulation, copyright ASME, Two Park Avenue, New York, NY 10016-5990. All rights reserved. No further copies can be made without written permission from ASME. Permission is for this edition only. A copy of the complete standard may be obtained from ASME, www.asme.org.
29 Reprinted with permission from ASME V&V 40-2018.
30 Reprinted with permission from ASME V&V 40-2018.
V. Generalized Framework for Assessing Credibility of Computational Modeling in a Regulatory Submission

FDA recommends the following process when developing and assessing the credibility of computational modeling used in a medical device regulatory submission. Detailed information on the key concepts in the framework below is provided in subsequent sections. See Figure 1 for an illustration of the framework using a hypothetical example.

1. Describe the question(s) of interest to be addressed in the regulatory submission that will be informed by the computational model. See Section VI.A.(1) for details.

2. Define the context of use (COU) of the computational model. See Section VI.A.(2) for details.

3. Determine the model risk. See Section VI.A.(3) for details.

4. Identify and categorize the credibility evidence, either previously generated or planned, which will support credibility of the computational model for the COU. See Section VI.B for a categorization of different types of credibility evidence.

5. Define credibility factors for the proposed credibility evidence. For prospectively planned activities, set prospective credibility goals for each credibility factor, with a plan to achieve these goals. For previously generated data, assess the credibility levels achieved. See Section VI.C for a discussion of credibility factors, levels, and goals.

6. Perform prospective adequacy assessment: if the credibility goals are achieved, will the credibility evidence be sufficient to support using the model for the COU given the risk assessment? See Section VI.D for a discussion of adequacy assessment.

   a. If YES: continue to Step 7. Before proceeding, however, you may wish to utilize the Q-submission process (refer to FDA's guidance titled "Requests for Feedback and Meetings for Medical Device Submissions: The Q-Submission Program"31) to receive FDA feedback on the computational model, proposed credibility evidence, plan for generating this evidence, and prospective adequacy assessment. See Appendix 2.

   b. If NO: you may need to modify the model, reduce the model influence, modify the COU, or revise the plan to generate credibility evidence. See ASME V&V 40 for a discussion of options. If any changes are made at this stage, go back to Step 2.

31 https://www.fda.gov/regulatory-information/search-fda-guidance-documents/requests-feedback-and-meetings-medical-device-submissions-q-submission-program

7. Generate the credibility evidence by executing the proposed study(ies) and/or analyzing previously generated data.

8. Determine if credibility goals were met and perform post-study adequacy assessment: does the credibility evidence support using the model for the COU given the risk assessment? See Section VI.D for a discussion of adequacy assessment.

   a. If YES: continue to Step 9.

   b. If NO: you may wish to modify the model, reduce the model influence, modify the COU, or collect additional evidence. See ASME V&V 40 for a more detailed discussion of the various options. If any changes are made at this stage, go back to Step 2.

9. Prepare a CM&S credibility assessment report for inclusion in the regulatory submission. See Appendix 2 for reporting recommendations.

FDA recommends this generalized framework, but you can choose to use an alternative approach to demonstrate the credibility of your computational model. If an alternative approach is used, we recommend that you clearly identify the model's COU within the regulatory submission and provide a detailed rationale for why the model can be considered credible for its specific COU. If an alternative approach is planned, we recommend using the Q-submission process to receive FDA feedback on the planned approach and activities, as outlined in Step 6a above.
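To illustrate how the elements defined in Steps 1 through 5 relate to one another, the following minimal sketch organizes them as a simple record. This is purely illustrative and is not prescribed by this guidance or by ASME V&V 40; all field names and example values (taken loosely from the hypothetical fatigue example in Figure 1) are hypothetical.

```python
# Purely illustrative sketch: a minimal record organizing the elements of
# Steps 1-5 of the framework. All field names and example values are
# hypothetical and are not prescribed by this guidance or by ASME V&V 40.
from dataclasses import dataclass, field

@dataclass
class CredibilityGoal:
    factor: str         # credibility factor, e.g., "discretization error"
    goal: str           # prospective credibility goal for this factor (Step 5)
    achieved: str = ""  # level achieved, recorded after evidence is generated (Step 7)

@dataclass
class CredibilityPlan:
    question_of_interest: str                                   # Step 1
    context_of_use: str                                         # Step 2
    model_influence: str                                        # Step 3
    decision_consequence: str                                   # Step 3
    model_risk: str                                             # Step 3 (influence combined with consequence)
    evidence_categories: list[str] = field(default_factory=list)  # Step 4 (Cat. 1-8)
    goals: list[CredibilityGoal] = field(default_factory=list)    # Step 5

# Example values loosely based on the hypothetical fatigue example in Figure 1.
plan = CredibilityPlan(
    question_of_interest="Is the device family resistant to fatigue fracture "
                         "under anticipated worst-case radial loading conditions?",
    context_of_use="FEA identifies worst-case device sizes, which are then "
                   "fatigue tested on the bench.",
    model_influence="medium",
    decision_consequence="high",
    model_risk="medium-high",
    evidence_categories=["Cat. 1 code verification", "Cat. 3 bench validation",
                         "Cat. 8 calculation verification using COU simulations"],
    goals=[CredibilityGoal("discretization error", "quantified by mesh refinement")],
)
print(plan.model_risk)
```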
Relationship between this framework and ASME V&V 40: The framework above is intended to be consistent with ASME V&V 40. Table 1 describes the relationship between each step of the framework and ASME V&V 40. If you plan to perform model validation using well-controlled bench test or in vivo experimental data, this guidance framework is fully consistent with ASME V&V 40 but includes additional recommendations in Sections VI.A and VI.D. For cases not within the scope of ASME V&V 40 (i.e., use of other categories of credibility evidence or multiple sources of validation evidence), the framework enables systematic credibility assessment based on the approach of ASME V&V 40.

Table 1: Relationship between guidance framework and ASME V&V 40.

Step 1 (Question of interest). Relevant section of ASME V&V 40-2018: Section 2.4. Comments: Section VI.A.(1) provides additional recommendations for medical device regulatory submissions.

Step 2 (Context of use). Relevant section: Section 3. Comments: Section VI.A.(2) provides additional recommendations for medical device regulatory submissions.

Step 3 (Model risk). Relevant section: Section 4. Comments: Section VI.A.(3) provides additional recommendations for medical device regulatory submissions.

Step 4 (Credibility evidence). Relevant section: N/A. Comments: The term "credibility evidence" does not appear in ASME V&V 40. The categorization of credibility evidence in Section VI.B is unique to this guidance. ASME V&V 40 effectively assumes the following will be generated: code verification results (Cat. 1) with either bench test validation (Cat. 3) or in vivo validation (Cat. 4).

Step 5 (Credibility factors, gradations, and goals). Relevant section: Section 5. Comments: ASME V&V 40 defines credibility factors and provides example gradations. These may be used for the cases described in the row above. For other cases (other categories of credibility evidence or multiple sources of validation evidence), users should define appropriate credibility factors. See Section VI.C and Appendix 1 for recommendations.

Step 6 (Prospective adequacy assessment). Relevant section: Section 6. Comments: The term "adequacy assessment" is not explicitly used in ASME V&V 40. Prospective adequacy assessment, as defined in this guidance, overlaps with Section 6 of ASME V&V 40. Recommendations for prospective adequacy assessment are provided in Section VI.D.

Step 7 (Generate credibility evidence). Relevant section: N/A. Comments: ASME V&V 40 does not address how to perform credibility activities but similarly incorporates evidence generation as part of the overall credibility assessment framework (e.g., "Execute plan" in Figure 2.4-1 of ASME V&V 40-2018).
Step 8 (Post-study adequacy assessment). Relevant section: Section 7. Comments: The term "adequacy assessment" is not explicitly used in ASME V&V 40. Post-study adequacy assessment, as defined in this guidance, overlaps with Section 7 of ASME V&V 40. Detailed recommendations for post-study adequacy assessment are provided in Section VI.D.

Step 9 (Credibility assessment report). Relevant section: Section 8. Comments: See Appendix 2 for specific recommendations for information to include in a medical device regulatory submission.

Figure 1: Overview of generalized framework for assessing model credibility, with an example for each step. Asterisks (*) indicate credibility factors that are defined by the user in this hypothetical example, as they are not defined in ASME V&V 40. 'Cat.' (in Step 4) denotes credibility evidence category, as discussed in Section VI.B. Blue boxes are initial steps, yellow boxes are credibility assessment planning steps, red boxes are adequacy assessment steps, grey boxes are steps related to FDA interaction, and the green box is study execution.

[Figure 1 is a flowchart of Steps 1 through 9 of the framework, illustrated with a hypothetical device fatigue example: the question of interest (Step 1) is whether the device family is resistant to fatigue fracture under anticipated worst-case radial loading conditions, and the COU (Step 2) is finite element analysis to identify worst-case device sizes for fatigue fracture, which are then tested on the bench. The remaining boxes illustrate the model risk assessment, the credibility evidence categories and credibility factors with their gradations and goals, the prospective and post-study adequacy assessments, optional FDA pre-submission feedback, and the final credibility assessment report, as described in Sections VI.A through VI.D.]
VI. Key Concepts for Assessing Credibility of Computational Modeling in a Regulatory Submission

This section describes and discusses the key concepts used in the framework provided above in Section V.

A. Preliminary steps

(1) Question of Interest

Step 1 in the framework is to describe the question(s) of interest to be addressed in the regulatory submission that will be informed by the computational model. We recommend describing the question of interest following the recommendations of ASME V&V 40, together with the clarification points below and specific recommendations for medical device regulatory submissions.

The question of interest concerns the decision to be made with input from the computational model and potentially other sources of information. The question of interest should not be confined to the computational model, nor should it be about the computational model. We recommend that the scope of the question of interest describe the question, decision, or concern that is being addressed using the computational model and potentially other sources of information, but nothing more. Therefore, you should avoid overly broad questions of interest such as, "Is the device safe and effective?" For example, a possible question of interest regarding device durability could be, "Is the device resistant to fatigue fracture under anticipated worst-case radial loading conditions?", which might be addressed using a combination of computational modeling and bench testing. To assist in evaluating the decision consequence when assessing the model risk in Section VI.A.(3), it can be helpful to formulate the question of interest in terms of the decision that is to be made and the stakeholder(s) making the decision.

For models used for in silico device testing or in silico clinical trials:

• The question of interest should describe the specific question, decision, or concern being addressed about the device, such as in the device durability example stated in the preceding paragraph and in Figure 1.

For models used within device software:

• The question of interest should cover the specific device functionality(ies) that use the model predictions. For example, for a device that performs a simulation of a patient as part of a diagnostic function, the question of interest may be posed around the clinical decision that is to be made, such as whether or not to treat a patient or diagnose the presence of a disease condition.

For models submitted for MDDT qualification:

• The question of interest should describe the specific question, decision, or concern about the range of devices relevant to the proposed MDDT.
For example, "For an active implantable medical device, what is the in vivo deposited power during a 1.5T magnetic resonance (MR) scanning procedure and is it below an acceptable threshold?" (2) Context of use (COU) Step 2 of the framework is to define the context of use (COU) of the computational model. We recommend defining the context of use following the recommendations of ASME V&V 40 together with the clarification points below and specific recommendations for medical device regulatory submissions. The COU statement should include a detailed description of what will be modeled and how model outputs will be used to answer the question of interest. The COU should also include a statement on whether other information (e.g., bench testing, animal32 or clinical studies) will be used in conjunction with the model results to answer the question of interest. For example, a possible COU regarding device durability could be summarized as "Combine computational modeling predictions and empirical fatigue testing observations to estimate device fatigue safety factors under anticipated worst-case radial loading conditions," with additional details provided to describe the type of modeling used, key model inputs and outputs, and the specific approach used to combine model predictions with experimental data to answer the question of interest. Since many models have a range of possible uses, it is important to note that the COU describes how the model will be used to answer the question of interest, and may be narrower than the overall model capability. For models used for in silico device testing or in silico clinical trials: • The COU should describe how the model will be used in a simulation study to address the question of interest. Note that in this case, the COU differs from the indications for use or intended use of the device, although the COU may involve addressing a safety or effectiveness question related to the device indications for use or intended use. For models used within device software: • The COU should describe how the model will be used within the device. In this case, the COU may be related to the intended use of the device, or a subset thereof, depending on how the device uses the simulation results. For models submitted for MDDT qualification: • The model COU is expected to include the MDDT COU information (refer to Section IV.A of FDA's guidance titled "Qualification of Medical Device Development Tools"33). 32 FDA supports the principles of the "3Rs" to replace, reduce, and/or refine animal use in testing, when feasible. We encourage manufacturers to consult with FDA if they wish to use a non-animal testing method that they believe is suitable, adequate, validated, and feasible. We will consider if a proposed alternative method could be assessed for equivalency to an animal test method. 33 https://www.fda.gov/regulatory-information/search-fda-guidance-documents/qualification-medical-device- development-tools 16 Contains Nonbinding Recommendations (3) Model risk Step 3 of the framework is to determine the model risk. Model risk is assessed because the level of credibility of a model should be commensurate to the risk associated with using the model to address the question of interest. We recommend assessing model risk following ASME V&V 40, which considers model risk as a combination of two factors, model influence and decision consequence. Below are clarification points and specific recommendations for medical device regulatory submissions. 
Model risk should be interpreted as the risk associated with using the model to address the specific question of interest, not risk intrinsic to the model. Decision consequence is generally risk as defined by ISO 14971 Medical devices - Application of risk management to medical devices, related to the question of interest. Therefore, model risk can be viewed as risk related to the question of interest, weighted by the influence of the computational model in addressing the question of interest.

Model influence is the contribution of the computational model relative to other contributing evidence (e.g., bench test results, animal or clinical study results) in addressing the question of interest. For example, evaluating model influence for the aforementioned device durability COU might consider how much influence CM&S results have on the fatigue resistance decision relative to the bench fatigue test results.

Decision consequence is the significance of an adverse outcome resulting from an incorrect decision concerning the question of interest. It is important to note that the decision consequence is the potential outcome of the overall decision that is to be made by answering the question of interest, outside of the scope of the computational model and irrespective of how modeling is used. That is, decision consequence should consider the question of interest, but should not consider the COU of the model. In regulatory submissions, decision consequence will typically involve consideration of potential patient harm. For example, when evaluating decision consequence for the aforementioned device durability example, you should consider the potential patient harm that could result if the implanted device fractures.

In general, we recommend assessing decision consequence by considering both the potential severity of harm and the probability of occurrence of harm, as mentioned in ASME V&V 40. Neglecting probability of occurrence may lead to over-estimating overall model risk and therefore may seem to warrant a higher level of credibility than needed. We recommend following an appropriate risk management procedure (e.g., see ISO 14971 and ISO/TR 24971).34 The risk management procedure used should consider any specific hazards that are related to the question of interest and then identify any possible hazardous situations and the resultant harm that may occur. Reports of adverse events for the same or similar device types can be helpful in identifying potential hazards and harms, and estimating their associated rates of occurrence. The overall decision consequence should be assessed by considering all potential patient harms that may occur due to an incorrect decision, accounting for any risk mitigation procedures in place.

34 ISO/TR 24971 Medical devices - Guidance on the application of ISO 14971.

We acknowledge that for some cases, assessing probability of occurrence may need estimation or subject matter expertise (e.g., for some new devices). See Section V.A.2 of FDA's guidance titled "Factors to Consider When Making Benefit-Risk Determinations for Medical Device Investigational Device Exemptions"35 for approaches to estimate probability of occurrence in these situations.

35 https://www.fda.gov/regulatory-information/search-fda-guidance-documents/factors-consider-when-making-benefit-risk-determinations-medical-device-investigational-device

We note that, while the overall risk of a medical device is a major determinant of the device classification, decision consequence should be based on the specific question of interest and not on the specific device class. Therefore, although the overall clinical risk is greater for a class III device than for a class II device, the decision consequence associated with a specific question of interest in a 510(k) submission could be the same as or even greater than the decision consequence associated with another question of interest in a PMA application, depending on the specific question of interest. Accordingly, the decision consequence should be determined solely by considering the specific question of interest.

For CM&S used to support an IDE application, decision consequence should generally consider the potential harm to trial participants due to making an incorrect decision concerning the question of interest, taking into account the proposed study protocol and including any risk mitigation procedures in place.

Following ASME V&V 40, we recommend using a scheme such as illustrated in Figure 2 to assess model risk considering the combined impact of decision consequence and model influence.

Figure 2: Possible scheme for assessing model risk considering the combined impact of model influence and decision consequence. Alternative schemes may be used instead, for example using a 5x5 or 5x4 grid instead of 3x3.
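The following minimal sketch illustrates the kind of 3x3 scheme shown in Figure 2, mapping a model influence level and a decision consequence level to a model risk level. The specific cell values are illustrative assumptions only; they are not defined by this guidance or by ASME V&V 40, and alternative gradations or grid sizes may be used.

```python
# Hypothetical 3x3 mapping in the spirit of Figure 2. The cell values below are
# illustrative assumptions, not values defined by this guidance or ASME V&V 40.
LEVELS = ("low", "medium", "high")

# RISK_MATRIX[influence][consequence] -> model risk on a 1 (lowest) to 5 (highest) scale
RISK_MATRIX = {
    "low":    {"low": 1, "medium": 2, "high": 3},
    "medium": {"low": 2, "medium": 3, "high": 4},
    "high":   {"low": 3, "medium": 4, "high": 5},
}

def model_risk(model_influence: str, decision_consequence: str) -> int:
    """Combine model influence and decision consequence into a model risk level."""
    if model_influence not in LEVELS or decision_consequence not in LEVELS:
        raise ValueError("influence and consequence must each be one of " + str(LEVELS))
    return RISK_MATRIX[model_influence][decision_consequence]

# Example: the model is a major but not the only source of evidence (medium
# influence) for a question whose incorrect answer could cause serious harm
# (high consequence).
print(model_risk("medium", "high"))  # -> 4
```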
For models used for in silico device testing or in silico clinical trials:

• Model influence will depend on whether other information (e.g., bench or animal test results) is also provided in the regulatory submission to address the question of interest.

• When assessing decision consequence, you should consider device hazards that are related to the specific device safety or effectiveness concern that is being addressed, as stated in the question of interest.

For models used within device software:

• Model influence will depend on whether other information (e.g., additional direct patient measurements, clinical assessments) will be used in answering the question of interest. If the device takes action based solely on simulation results, model influence will be the highest level. If the simulation results are provided to the clinician to inform a decision, model influence will depend on the other information available and on the specific language proposed in the labeling for the device. When determining model influence for a device that provides a simulation-based recommendation to a clinician, which is intended to be used in conjunction with other medical information to make a clinical decision, we recommend you examine whether there is reasonably foreseeable misuse36 related to the degree to which clinicians may rely on the device output without considering additional clinical information that may be available. For example, for a device that provides a simulation-based recommendation to a clinician for adjunctive use, model influence should account for possible misuse where the clinician relies on the model information to a greater degree than intended in the labeling. A model influence of 'zero' or 'negligible' should be well-justified when proposed.

• When assessing decision consequence, device hazards to be considered should be those related to the specific device functionality that the model is used for, as stated in the question of interest.

36 See ISO 14971 Medical devices - Application of risk management to medical devices for the definition of reasonably foreseeable misuse.

For models submitted for MDDT qualification:

• If the MDDT is a computational model only, model influence is expected to be the highest level.

• Decision consequence should be assessed based on the potential risk to patients should the tool, when used as specified in the MDDT COU, provide inaccurate information for the question of interest.
B. Credibility Evidence

Step 4 of the framework is to identify and categorize the credibility evidence, either previously generated or planned, which would support credibility of the computational model for the COU.

Not all evidence that could potentially support the use of a computational model in medical device regulatory submissions comes from traditional validation activities. Because of this, we adopt the more general term of "credibility evidence," which is any evidence that could support the credibility of a computational model. The evidence categories defined below represent results from different VVUQ activities. Definitions for each of these activities were provided in Section IV; some clarification points are provided below.

Verification is focused on the software implementation of a numerical algorithm to solve the underlying mathematical model. It can be broken down into code verification and calculation verification. Code verification is performed to confirm that numerical algorithms and associated code have been correctly implemented without errors that affect numerical accuracy, and involves activities such as software quality assurance and numerical code verification; see ASME V&V 40 for details. The aim of calculation verification is to estimate the numerical error in quantities of interest arising from, for example, the chosen spatial discretization. Calculation verification may be performed any time a simulation is run. For example, calculation verification can be performed using the validation simulations, that is, using model input values corresponding to the validation experiment(s). Alternatively, calculation verification can be performed using the COU simulations, that is, using the COU model inputs.

Validation involves comparison of model predictions with real-world observations, referred to as the comparator. In this guidance, validation is interpreted as comparison against data that are independent of the data used to create the model. Therefore, model calibration, where parameters are tuned or optimized so that the model output matches the real-world observations, is not considered validation. Additionally, comparison of model predictions against predictions from a different model is not considered validation. A related activity to validation is applicability assessment, which is assessment of the relevance of the validation activities to the COU. Differences between how the model is validated and how it is used in the COU may limit the relevance of the validation activities to the COU.

Uncertainty quantification (UQ) involves estimating the uncertainty in model outputs. Model output uncertainty can arise from a range of factors, including uncertainties in model inputs or uncertainty in model form (see ASME V&V 40 for more information on model inputs and model form). Input UQ is related to sensitivity analysis (SA), which aims to estimate and potentially rank the influence of model inputs on model outputs and can be performed locally around fixed input values or globally using input ranges or distributions. SA can support UQ, for example by reducing the number of inputs with which to perform UQ. However, it is ultimately the UQ results (that is, the estimation of the uncertainty in model outputs) that support model credibility. As with calculation verification, UQ and SA can be performed using validation simulations, COU simulations, or both.
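As a purely illustrative sketch of input uncertainty propagation, the following example samples two uncertain model inputs from assumed distributions, evaluates a stand-in model, and summarizes the resulting output uncertainty, along with a simple correlation-based sensitivity indicator. The one-line "model" and all distributions are hypothetical placeholders for a real computational model and its characterized input uncertainties.

```python
# Illustrative sketch of input uncertainty propagation (UQ) by Monte Carlo
# sampling. The one-line "model" and the input distributions are hypothetical
# stand-ins for a real computational model and its characterized input
# uncertainties.
import numpy as np

rng = np.random.default_rng(seed=0)
n = 10_000

# Uncertain model inputs, e.g., a material modulus E (GPa) and an applied load F (N).
E = rng.normal(loc=190.0, scale=5.0, size=n)   # measured mean +/- standard deviation
F = rng.uniform(low=95.0, high=105.0, size=n)  # load known only within a range

def model(E, F):
    """Stand-in for a quantity of interest, e.g., peak strain (arbitrary form)."""
    return F / (E * 0.52)

q = model(E, F)

# Report the uncertainty in the model output (the quantity of interest).
lo, hi = np.percentile(q, [2.5, 97.5])
print(f"output mean = {q.mean():.3f}, 95% interval = [{lo:.3f}, {hi:.3f}]")

# A simple global sensitivity indicator: correlation of each input with the output.
for name, x in [("E", E), ("F", F)]:
    print(name, "correlation with output:", round(np.corrcoef(x, q)[0, 1], 2))
```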
In Table 2 below, eight distinct categories of credibility evidence are provided. The objective of defining these categories is to provide a common framework to characterize the available evidence to support a computational model. It is not to characterize the quality or level of rigor of the evidence; the ordering of the categories does not reflect the strength of the evidence. This categorization is not intended to be exhaustive. In some cases, there may be a need to define new categories if the credibility evidence does not fit into any of the following categories. For many computational models, there will likely be evidence from multiple categories that supports model credibility, all of which can be included in a regulatory submission. Following Table 2, each category is discussed in more detail, with key distinguishing features and examples. Specific considerations for each category are also provided in Appendix 1.

Table 2: Eight categories of credibility evidence. Categories 1, 3, and 4 are explicitly within the scope of ASME V&V 40.

Category 1 - Code verification results: Results showing that a computational model implemented in software is an accurate implementation of the underlying mathematical model.

Category 2 - Model calibration evidence: Comparison of model results with the same data used to calibrate model parameters.

Category 3 - Bench test validation results: Validation results using a bench test comparator. May be supported by calculation verification and/or UQ results using the validation conditions.

Category 4 - In vivo validation results: Same as the previous category except using in vivo data as the comparator.

Category 5 - Population-based validation results: Comparison of population-level data between model predictions and a clinical data set. No individual-level comparisons are made.

Category 6 - Emergent model behavior: Evidence showing that the model reproduces phenomena that are known to occur in the system at the specified conditions but were not pre-specified or explicitly modeled by the governing equations.

Category 7 - Model plausibility evidence: Rationale supporting the choice of governing equations, model assumptions, and/or input parameters only.

Category 8 - Calculation verification/UQ results using COU simulations: Calculation verification and/or UQ results obtained using the COU simulations, that is, the simulations performed to answer the question of interest.

What types of credibility evidence should be included in a regulatory submission? In accordance with ASME V&V 40, the demonstrated credibility of a computational model should be commensurate with the risk associated with using the model. We recognize that the ability to generate credibility evidence may depend upon multiple factors including but not limited to the type of the model, the maturity of the modeling field, and the ability to perform validation. Therefore, this guidance document does not prescribe the specific types of credibility evidence that should be included in a regulatory submission.
However, you should consider providing evidence for each of the following general groups, since these evaluate different aspects of the model:

• code verification (Category 1);
• calculation verification (Category 3, 4, or 8); and
• validation (Category 3, 4, or 5) or other evidence pertaining to the model's ability to reproduce real-world behavior (Category 2, 6, or 7).

You can also submit multiple types of evidence within each group (e.g., submitting both bench and in vivo validation (Categories 3 and 4)) if it is appropriate for overall testing of the model and/or it increases the overall credibility in the model. If you have questions on your planned credibility evidence for your specific model, we recommend that you use the Q-submission process to obtain feedback.

Examples:

• In silico device testing:
  o A model of a device that will be used to reproduce a bench test could be supported by: code verification results (Category 1), bench test validation results (Category 3), and calculation verification results (Category 3 or 8).

• Models used within device software:
  o A patient-specific modeling algorithm implemented in a medical device could be supported by: code verification results (Category 1), in vivo validation results (Category 4), and calculation verification results (Category 4).

• In silico clinical trials:
  o An in silico clinical trial where a device safety/effectiveness question is addressed using a virtual cohort of patient models, generated by sampling parameter values across the patient population, could be supported by: code verification results (Category 1), bench test validation results (to validate the device model; Category 3), in vivo validation results (to validate the baseline patient model; Category 4), calculation verification results (Category 3, 4, or 8), model plausibility evidence (to support the sampled parameters; Category 7), and population-based validation results (Category 5).

We emphasize that these are examples only, and appropriate evidence will depend on multiple factors, as discussed in the preceding paragraph.

(1) Code verification results

Code verification results provide evidence demonstrating that a computational model implemented in software is an accurate implementation of the underlying mathematical model and associated numerical algorithms. Code verification is important to demonstrate that there are no bugs in the software that affect simulation numerical accuracy.37 Comparison of model predictions against real-world observations is not part of code verification and is addressed separately by validation activities.

Example:

• For solid mechanics, fluid dynamics, heat transfer, electromagnetism, and other domains involving partial differential equations: results comparing the computational model against analytical solutions (e.g., generated using the method of manufactured solutions38), including confirmation that the error converges to zero at the expected convergence rate as spatial and temporal discretization size are decreased (see the sketch following this example).

37 Salari K and Knupp P. Code verification by the method of manufactured solutions (No. SAND2000-1444), Sandia National Lab, 2000.
38 Aycock KI, Rebelo N, and Craven BA. Method of manufactured solutions code verification of elastostatic solid mechanics problems in a commercial finite element solver. Computers & Structures, vol. 229, p. 106175, 2020.
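The following minimal sketch illustrates one such convergence check, assuming a known exact (e.g., manufactured) solution is available. The mesh sizes and error values below are hypothetical; in practice they would come from running the solver on successively refined meshes.

```python
# Illustrative sketch of one code verification check: confirm that the observed
# order of accuracy matches the theoretical order using a known exact solution
# (e.g., one constructed via the method of manufactured solutions). The mesh
# sizes and error values are hypothetical placeholders for real solver output.
import math

h = [0.04, 0.02, 0.01]           # mesh sizes from successive uniform refinement
err = [8.1e-3, 2.05e-3, 5.2e-4]  # solver error vs. the exact solution at each h

# Observed order of accuracy between successive refinements:
# p = log(e_coarse / e_fine) / log(h_coarse / h_fine)
for i in range(len(h) - 1):
    p = math.log(err[i] / err[i + 1]) / math.log(h[i] / h[i + 1])
    print(f"observed order between h={h[i]} and h={h[i+1]}: {p:.2f}")

# For a second-order-accurate scheme, the observed order should approach 2,
# and the error should converge toward zero as h is decreased.
```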
(2) Model calibration evidence

Model calibration evidence is the comparison of model results with the same data used to calibrate the model parameters. The evidence is an assessment of the "goodness of fit" of simulation results using calibrated model parameters. This is not validation evidence, because it does not test the final model against data independent of model development; instead, model parameters are calibrated (whether optimized or manually tuned) to minimize the discrepancy between model results and the data. Calibration evidence is weak in comparison to validation evidence. Nevertheless, robust model calibration evidence can still support model credibility. For the same amount of data, this type of evidence is stronger if complex behavior is reproduced after calibrating a small number of parameters in a first principles model; it is weaker if the governing equations were chosen solely based on the data rather than underlying principles, or if many parameters were calibrated.

Calibration evidence can be generated for the overall model or for sub-models within the overall model; examples of both are provided below. When the overall model needs calibration of some of its parameters, the calibration results can provide relevant credibility evidence, generally supplementary to separate validation of the overall model. When a sub-model needs calibration to determine the values of sub-model parameters, the calibration results can be important for justifying the use of those parameter values, and for providing confidence in the predictions when sub-model dependent variables (e.g., strains) will be extrapolated beyond the values used in validation simulations.

Examples of overall model calibration:
• In physiological modeling, demonstrating that a patient-specific model of a patient's heart closely matches the patient's clinically measured pressure-volume (P-V) loop, after tissue parameters have been calibrated based on the same P-V loop data.
• In heat transfer modeling, demonstrating that the first principles-based model accurately reproduces spatio-temporal in vivo tissue heating in different tissues after calibrating the blood-tissue heat transfer coefficient to match the same thermal measurements.

Example of sub-model calibration:
• In solid mechanics, demonstrating that a constitutive model of a material closely matches a test specimen's measured stress-strain behavior across a wide range of strains, after calibrating the constitutive parameters to minimize the discrepancy (a minimal sketch of such a calibration follows below).
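For illustration only, the following is a minimal sketch of the sub-model calibration example above: calibrating a hypothetical two-parameter constitutive form to uniaxial stress-strain data and quantifying the goodness of fit. The data and the constitutive form are invented for illustration and do not represent any particular material model.

import numpy as np
from scipy.optimize import curve_fit

# Hypothetical uniaxial test data for one specimen (strain, stress in MPa).
strain = np.array([0.00, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30])
stress = np.array([0.00, 0.55, 1.25, 2.10, 3.20, 4.60, 6.40])

# Hypothetical two-parameter constitutive form: sigma = a * (exp(b * eps) - 1).
def constitutive(eps, a, b):
    return a * (np.exp(b * eps) - 1.0)

# Calibrate the sub-model parameters against the test data.
params, cov = curve_fit(constitutive, strain, stress, p0=[1.0, 1.0])
a, b = params

# Goodness of fit (R^2) of the calibrated sub-model against the SAME data;
# this is calibration evidence (Category 2), not validation evidence.
residuals = stress - constitutive(strain, a, b)
r_squared = 1.0 - np.sum(residuals**2) / np.sum((stress - stress.mean())**2)
print(f"a = {a:.3f} MPa, b = {b:.3f}, R^2 = {r_squared:.4f}")

# Uncertainty in the fitted parameters (e.g., due to experimental noise) can
# also be reported in support of a 'Goodness of fit' credibility factor.
print("parameter standard errors:", np.sqrt(np.diag(cov)))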
(3) Bench test validation results

This category refers to validation results using experimental data from bench testing, not clinical or animal testing (for the latter, see Category 4 below). Bench tests are typically performed under well-controlled laboratory conditions, making them advantageous for simulation validation. Note that 'bench testing' is a broad term that encompasses in vitro, cadaveric and other types of non-clinical testing.39 Bench test validation results may be supported by calculation verification and/or UQ results using the validation simulations (as opposed to calculation verification and/or UQ results using the COU simulations; see Category 8). For this type of evidence, either the validation simulations or the bench tests can be prospectively planned or previously generated. As shown in Table 3, this leads to three common cases: prospectively planned validation activities, validation against retrospective experimental datasets, and previously generated validation results. In addition, although the validation involves bench testing, the COU itself could be either bench or in vivo. Examples of potential combinations are provided below.

Examples using prospectively planned validation activities:
• In the following example, both the COU and the validation simulations correspond to bench testing: In solid mechanics, a manufacturer of a new family of cardiovascular implants plans to perform bench durability testing to assess fatigue resistance. A computational model of the device family is developed, and simulations of the bench test are used to select worst-case device sizes to minimize the number of bench test articles needed. Validation with supporting calculation verification evidence is generated by performing finite element simulations of the loading for a subset of the devices using multiple finite element mesh resolutions and comparing model-predicted and bench-measured quantities of interest.
• In the following example, the COU corresponds to in vivo conditions, but the validation simulations correspond to bench testing: In electromagnetics, a manufacturer of a new implantable device plans to assess induced power density during MR imaging using a computational model of the device implanted in anatomical models of a set of virtual patients. The computational model predicts energy absorption during MR scanning. For validation, physical experiments using the same device in a gel phantom tank are compared to simulation results using an in silico model of the device in a simulated gel phantom tank.

Example using validation against retrospective datasets:
• In fluid dynamics, a manufacturer uses computational fluid dynamics to assess the performance of a blood-contacting device. The manufacturer compares simulations with classical hydrodynamic laboratory measurements (e.g., flat-plate boundary layer, lift and drag on objects) or other benchmark experiments designed for validation (e.g., a benchmark nozzle or blood pump40). Although the validation dataset is not specific to the COU, the validation exercise provides evidence that the model accurately predicts hydrodynamic behavior that is generally relevant to the COU.

39 See also FDA guidance, "Recommended Content and Format of Non-Clinical Bench Performance Testing Information in Premarket Submissions," available at https://www.fda.gov/regulatory-information/search-fda-guidance-documents/recommended-content-and-format-non-clinical-bench-performance-testing-information-premarket

Examples using previously generated validation results:
• In solid mechanics, a manufacturer previously developed a computational model of a family of peripheral stents, validated the model by comparing predicted and measured force-displacement relationships under radial loading on the bench, and then used the model to identify worst-case stent sizes to reduce the number of samples that underwent durability testing. Subsequently, the manufacturer seeks a new indication for the same stents in different vasculature. A computational model of the stents under the new loading conditions is developed. The previously collected validation results may support the credibility of the model under the new loading conditions associated with the new indication.
• In electromagnetics, a computational model of MR-induced heating near an implantable device was previously developed, validated, and used to generate evidence to support conditions of safe use of the device in 3T MR machines. Subsequently, the same model is used to support conditions of safe use of the device in 7T MR machines. The previous validation results may support the model for this new COU for known transmit coil configurations.

40 Malinauskas RA, Hariharan P, Day SW, Herbertson LH, Buesen M, Steinseifer U, Aycock KI, Good BC, Deutsch S, Manning KB and Craven BA. FDA Benchmark Medical Device Flow Models for CFD Validation. ASAIO J, vol. 63(2), pp. 150-160, 2017.

Table 3: Comparison of three common validation cases, based on whether the validation simulations are prospectively planned or previously generated and whether the comparator data are prospectively planned or previously generated.

Prospectively planned validation simulations, prospectively planned comparator data. Corresponds to prospectively planned validation activities:
• possible to select experiments and simulations to maximize relevance (applicability) to the COU;
• possible to quantify uncertainties in simulation results;
• possible to quantify comparator measurement error and uncertainty; and
• the method of comparison can be chosen.

Prospectively planned validation simulations, previously generated comparator data. Corresponds to validation against retrospective data; the validation simulations need to be planned to match the comparator, with examples including comparison against literature experimental data or benchmark datasets:
• limited control over the relevance (applicability) of the validation activities to the COU; applicability may be low;
• limited control over ASME V&V 40 comparator credibility factors;
• possible to quantify uncertainty in simulation results;
• comparator measurement error and uncertainty may not be available; and
• the method of comparison can be chosen.

Previously generated validation simulations, prospectively planned comparator data. Very uncommon.

Previously generated validation simulations, previously generated comparator data. Usually corresponds to previously generated validation results, for example, validation for a previous COU with a similar model (e.g., in a previous regulatory submission) or general model validation results published in the literature:
• limited ability to select experiments and simulations to maximize relevance (applicability) to the COU; applicability may be low;
• no or limited control over ASME V&V 40 validation credibility factors;
• uncertainties in simulation results may not be available;
• comparator measurement error and uncertainty may not be available; and
• no ability to choose the method of comparison unless raw data are available.

(4) In vivo validation results

This category refers to validation results using in vivo data as the comparator, in the form of either clinical or animal data. This category assumes subject-level comparison between simulation and comparator when data from one or multiple subjects are available (population-level comparison falls under Category 5). This category therefore covers, for example, patient-level validation of a patient-specific computational model, such as in a clinical trial evaluating the performance of a medical device that uses patient-specific computational simulation. The validation results could be supported by calculation verification and/or UQ results using the validation simulations (as opposed to calculation verification and/or UQ results using the COU simulations; see Category 8).
For this type of evidence, either the validation simulations or the in vivo comparator data can be prospectively planned or previously generated. As shown in Table 3, this leads to three common cases: prospectively planned validation activities, validation against retrospective datasets, and previously generated validation results. Some examples are provided below.

Examples using prospectively planned validation activities:
• In fluid dynamics, a clinical software tool, which uses a physics-based patient-specific model of the coronary arteries to predict the fractional flow reserve (FFR), is validated by comparing simulations against invasive clinical FFR measurements in the same patient. A calculation verification study may also be performed to estimate the numerical uncertainty in these simulations.
• A manufacturer develops a computational model-based tool that predicts a quantitative clinical metric with a known correlation to patient outcomes. The manufacturer validates the predictive capability of the tool by performing a clinical trial and computing sensitivity, specificity, positive/negative predictive value, and area under the receiver operating characteristic (ROC) curve (a minimal sketch of these calculations follows at the end of this category).
• In bioheat transfer, a first principles-based thermal model is developed to predict in vivo tissue heating due to a device (e.g., devices based upon delivering ultrasound, laser, or radiofrequency (RF) energy). The model is validated using humans and/or animal models in relevant tissues, for the appropriate spatio-temporal distribution of in vivo power density, by making direct measurements using temperature probes.

Examples using previously generated validation results:
• In solid mechanics, a manufacturer uses a computational model to compute displacements for one device (e.g., shoulder arthroplasty) under simulated in vivo conditions (e.g., rotations), performs a supporting calculation verification study, and validates the predictions against relevant in vivo data. Later, the manufacturer wishes to use a similar model for a different device (e.g., reverse shoulder arthroplasty). The previous validation and calculation verification may support the credibility of the new device model.
• In bioheat transfer, a first principles-based thermal model is developed to predict in vivo tissue heating due to a device (e.g., devices based upon delivering ultrasound, laser, or RF energy). The model was validated using humans and/or animal models in relevant tissues, for the appropriate spatio-temporal distribution of in vivo power density, by making direct measurements using temperature probes. Later, the manufacturer wishes to use the same model for a different device. If the nature of the spatio-temporal temperature distribution (i.e., magnitude and gradients in space and time) is expected to be comparable between the two devices for the full range of device specifications in the tissue of interest, the previous validation evidence may be able to support the credibility of the model for predicting in vivo tissue heating due to the second device.

Another possibility for previously generated validation results arises with general-purpose or multi-application computational models, for which it is common to compare model predictions under a variety of conditions with experimental data. With computational models of physiological systems, it is common to show that the model can reproduce a range of physiological behaviors when publishing or releasing the model. Those validation results could be relevant if the physiological model is later used in a medical device COU. For example:
• In physiological modeling, a model of the cardiovascular system is developed and then validated by comparing model predictions of various hemodynamic variables (e.g., mean arterial blood pressure, cardiac output) against recordings from patients across a range of normal and pathological conditions. A manufacturer of a physiological closed loop control (PCLC) device that uses the model for in silico testing of the control algorithm could potentially utilize the previous validation results to support the model credibility in a PCLC testing COU.
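For illustration only, the following is a minimal sketch of the predictive-performance summary described in the clinical-metric example above. The per-patient predictions, outcomes, and decision threshold are hypothetical, and scikit-learn is assumed to be available for the ROC calculation; a real study would use an adequately sized cohort and report confidence intervals.

import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical per-patient data: model-predicted clinical metric and the
# binary clinical outcome observed in the validation study.
predicted = np.array([0.91, 0.80, 0.62, 0.55, 0.43, 0.70, 0.35, 0.22, 0.15, 0.48])
outcome = np.array([1, 1, 1, 0, 1, 0, 0, 0, 0, 0])

threshold = 0.5  # hypothetical decision threshold for the metric
positive = predicted >= threshold

tp = np.sum(positive & (outcome == 1))   # true positives
fp = np.sum(positive & (outcome == 0))   # false positives
tn = np.sum(~positive & (outcome == 0))  # true negatives
fn = np.sum(~positive & (outcome == 1))  # false negatives

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)  # positive predictive value
npv = tn / (tn + fn)  # negative predictive value
auc = roc_auc_score(outcome, predicted)  # area under the ROC curve

print(f"sens={sensitivity:.2f} spec={specificity:.2f} "
      f"PPV={ppv:.2f} NPV={npv:.2f} AUC={auc:.2f}")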
(5) Population-based validation results

Population-based evidence consists of comparisons of population-level data between model predictions and a clinical data set, or potentially other data such as animal or cadaveric data. A distinguishing feature of this evidence is that multiple subjects are involved, but simulation results and experimental data are not compared for the same subject (i.e., no comparison is made on a patient-level basis; such evidence falls under Category 4). For example, this type of evidence is relevant to validation of 'virtual populations' or 'virtual cohorts,' that is, multiple patient models representing a patient population. Population-based evidence for the credibility of the virtual population/cohort could be generated by comparing the mean and standard deviation of a model output across the virtual population/cohort with the mean and standard deviation from a clinical dataset (a minimal sketch of such a comparison follows the example below). Population-level clinical trial results would be part of this category, whereas patient-level clinical trial results fall under Category 4.

Example:
• In medical imaging, a set of virtual patients is generated by taking an anthropomorphic model of a breast and of lesions and varying key parameters across expected ranges. Comparison of model predictions to individual patient data is not possible, because none of the virtual patients corresponds to any one actual patient. Instead, the results of the computer-simulated trial are statistically compared to clinical outcomes to demonstrate that the predictions are consistent with the comparative trial using human subjects and human image interpreters.41

41 Badano A, Graff CG, Badal A, Sharma D, Zeng R, Samuelson FW, Glick SJ and Myers KJ. Evaluation of Digital Breast Tomosynthesis as Replacement of Full-Field Digital Mammography Using an In Silico Imaging Trial. JAMA Netw Open, vol. 1(7), 2018.
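For illustration only, the following is a minimal sketch of the population-level comparison described above. The virtual-cohort outputs and clinical measurements are synthetic stand-ins, and the two-sample Kolmogorov-Smirnov test is one assumed choice among the appropriate statistical methods discussed in Appendix 1.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic stand-ins: one model output per virtual patient, and the
# corresponding measured quantity from a clinical dataset.
virtual_cohort = rng.normal(loc=82.0, scale=9.0, size=200)
clinical_data = rng.normal(loc=80.0, scale=10.0, size=150)

# Population-level comparison: summary statistics...
print(f"virtual:  mean={virtual_cohort.mean():.1f}, sd={virtual_cohort.std(ddof=1):.1f}")
print(f"clinical: mean={clinical_data.mean():.1f}, sd={clinical_data.std(ddof=1):.1f}")

# ...and a comparison of the full distributions.
ks_stat, p_value = stats.ks_2samp(virtual_cohort, clinical_data)
print(f"KS statistic={ks_stat:.3f}, p={p_value:.3f}")

# Note: no virtual patient is matched to any individual clinical subject;
# only the populations are compared (Category 5, not Category 4).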
(6) Emergent model behavior

Emergent model behavior is evidence demonstrating that the finalized computational model reproduces phenomena that are known to occur in the system at the specified conditions but were not pre-specified or explicitly modeled by the governing equations. A distinguishing feature of this type of evidence is that simulation results are not directly compared to specific data. Instead, simulation results are assessed using scientific knowledge about the system, possibly based on qualitative experimental observations. This type of evidence is especially relevant to models of physiological systems, because physiological systems often exhibit emergent behavior that is not predictable from knowledge of the sub-systems.

Examples:
• In fluid dynamics, a computational model of blood flow through a stenotic vessel is developed, and evidence is collected to confirm that the model correctly predicts the onset of transitional or turbulent flow at conditions where such phenomena are expected. A manufacturer that uses the model to predict clinical metrics related to stenosis severity and ischemia could include this information as credibility evidence.
• In cardiac electrophysiology, a model of electrical activity in the heart and torso is developed. It is demonstrated that each simulated electrocardiogram (ECG) in the standard 12-lead ECG has the same morphology as clinical ECGs, in terms of the relative size and direction of the P-wave, QRS-complex and T-wave. A cardiac device manufacturer that uses this model for in silico testing of its device could include this information as credibility evidence for the cardiac model.

(7) Model plausibility

Model plausibility evidence is the rationale supporting the choice(s) of governing equations, model assumptions, and/or input parameters. A claim of model plausibility is an argument that the model is credible because the governing equations are expected to hold, the assumptions are justified, and the parameters and other quantities that are input into the model have been justified. A distinguishing feature of this category is that simulations do not need to be run to generate this kind of evidence, because the evidence is based on knowledge about the model and not on a comparison of model results to data. Because this evidence does not involve testing or assessing the finalized model (i.e., no verification or validation), model plausibility might be the first step in supporting model credibility, but it is generally a weak form of credibility evidence. In some cases where it is very difficult to obtain any experimental data from the system of interest for validation, this may be a primary form of evidence to support model credibility.

Example:
• In solid mechanics, a finite element model of a simple joint arthroplasty device is developed. For the particular combination of implant design, implant material, and loading conditions considered, deformations are anticipated to be well within the linear-elastic regime. The mechanical behavior of the implant material is also well characterized and has been shown to be approximately isotropic at the length scales of interest. Accordingly, plausibility evidence could support the credibility of the implant material model, i.e., a linear elastic model with an isotropic constitutive law, supported by justification for the specific material parameters used. The credibility of the whole model could be supported using plausibility evidence if valid rationales for the governing equations, model assumptions, and input parameters can be made.

(8) Calculation verification/UQ results using COU simulations

This category refers to standalone calculation verification and/or UQ results obtained using the COU simulations, which are the simulations performed to answer the question of interest under the COU conditions. Direct validation of the COU simulations is not possible, because if comparator data were available for the COU, there would be no need for the model. However, calculation verification and UQ analyses are possible using these simulations. This type of evidence applies to in silico device testing and in silico clinical trials, but not to models in device software or in MDDTs, for which the COU simulations are run after the device is on the market or the MDDT is qualified.
Examples:
• A finite element model of a medical device is developed to identify worst-case configurations related to a device safety concern. For validation, model predictions were compared to bench test data, and a mesh convergence study was performed to confirm that the numerical error due to spatial discretization is acceptably small (Category 3 evidence). However, for the COU, a different quantity of interest will be analyzed than the one considered in the validation study, and there is reason to believe that a finer computational mesh is needed to resolve it. Therefore, a new mesh convergence study is performed for this quantity of interest using the COU conditions.
• In fluid dynamics, a computational model of blood flow through a ventricular assist device is used to assess the influence of a planned change in manufacturing tolerances on hemolysis. Simulations were previously validated using a single well-characterized device at multiple operating conditions, by comparing with measurements of the velocity field and the corresponding flow-induced stress from particle image velocimetry. To address the question of interest, simulations are performed with accompanying UQ to analyze the influence of the planned change in manufacturing tolerances. In the UQ study, the device dimensions are varied within the range of the manufacturing tolerances. This input geometric uncertainty is propagated through the model using Monte Carlo sampling to perform a large number of simulations, quantifying the influence of geometric variation on the predicted flow-induced stress and blood exposure time in the device, which are closely related to hemolysis. Two separate UQ studies are performed for the original and the proposed manufacturing tolerances to justify that the planned change has a negligible influence on the hemolytic potential of the device (a minimal sketch of such tolerance propagation follows these examples).
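For illustration only, the following is a minimal sketch of the Monte Carlo tolerance propagation described in the second example. The closed-form 'shear_stress_model' stands in for the actual CFD simulations (or a surrogate trained on them), and all dimensions, tolerances, and outputs are hypothetical.

import numpy as np

rng = np.random.default_rng(42)

# Stand-in for the expensive CFD simulation (or a surrogate trained on it);
# here, a smaller clearance gap produces higher flow-induced shear stress (Pa).
def shear_stress_model(gap_mm):
    return 150.0 / gap_mm

nominal_gap = 0.50  # mm, nominal design dimension
tolerance = 0.02    # mm, manufacturing tolerance (+/-)

# Sample the dimension within the tolerance band and propagate through the model.
n_samples = 10_000
gaps = rng.uniform(nominal_gap - tolerance, nominal_gap + tolerance, n_samples)
stress = shear_stress_model(gaps)

lo, hi = np.percentile(stress, [2.5, 97.5])
print(f"stress: mean={stress.mean():.1f} Pa, 95% interval=({lo:.1f}, {hi:.1f}) Pa")

# Repeating the study for the proposed tolerance band and comparing the output
# distributions supports the claim that the change has a negligible influence.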
C. Credibility Factors and Credibility Goals

Step 5 in the framework is to define credibility factors for the planned credibility evidence and to set a credibility goal for each credibility factor, with a plan to achieve these goals. See ASME V&V 40 for an introduction to credibility factors. As an example, ASME V&V 40 defines two credibility factors for code verification: 'Software quality assurance' and 'Numerical code verification.' Other credibility factors are similarly defined in ASME V&V 40 for calculation verification, validation and applicability. To establish credibility factors and credibility goals, we recommend the following sub-steps for Step 5. Refer to Figures 1 and 3 for examples.

• Step 5.1: State credibility factors relevant to the type of credibility evidence you plan to gather. When relevant, we recommend using ASME V&V 40 credibility factors. For example, if you plan to gather bench test validation results (Category 3), we recommend using ASME V&V 40 credibility factors related to validation and applicability. For evidence categories that are not explicitly covered by ASME V&V 40 (e.g., model calibration evidence, population-based evidence, or model plausibility; Categories 2, 5 or 7, respectively), we recommend defining new credibility factors. For example, if model calibration results will be used in support of model credibility, you could define a 'Goodness of fit' credibility factor, among others.
  • See also Appendix 1 for specific considerations for each category of credibility evidence, including suggested credibility factors.
  • If there are multiple forms of credibility evidence from different categories, with one set used as the 'primary' source of evidence and other sets as 'secondary' or 'supporting' evidence (e.g., in vivo validation results as primary and bench test validation results as secondary), we recommend using ASME V&V 40 credibility factors when possible for the primary evidence and an appropriately limited set of credibility factors for the supporting evidence. This avoids an excessive total number of credibility factors when results from multiple categories are used to support the overall credibility of the model.
  • Since the relevance of the evidence to support using the model for the COU is especially important, we recommend defining a 'Relevance to the COU' credibility factor(s) for each set of credibility evidence (as emphasized in Appendix 1). For validation evidence, this is termed 'applicability' (see the Definitions section).

• Step 5.2: Following ASME V&V 40, for each credibility factor, define a gradation of activities that describes progressively increasing levels of rigor in investigation. For example, for a 'Goodness of fit' credibility factor for model calibration evidence (Category 2), a possible gradation is:
  a) Qualitative comparison of fit performed.
  b) Quantitative error of fit computed without accounting for any uncertainty.
  c) Uncertainty in fitted parameters (e.g., due to experimental noise) is estimated and accounted for in the quantitative error of fit.

• Step 5.3: Following ASME V&V 40:
  For each credibility factor corresponding to prospectively planned activities:
  • Select a 'credibility goal' from the gradation, considering the model risk as assessed in Step 3. Higher risk questions of interest generally warrant higher-level credibility goals. It is important to note that in this step, a level of credibility is being proposed for each factor that will contribute to the overall credibility of the model. See ASME V&V 40 for examples.
  • If the goal is less than the level commensurate with model risk (see Figure 3), for example due to practical constraints, you should provide a rationale for why the activities are sufficient.
  • Describe a high-level plan to achieve the proposed credibility goal. This should be included in the prospective credibility assessment to justify the level of credibility that is being proposed.
  For each credibility factor corresponding to previously generated data (e.g., ASME V&V 40 'comparator' credibility factors in the case of validation using a retrospective dataset):
  • Identify which level from the gradation the previously performed activities correspond to.
  • If the assessed credibility level is less than the level commensurate with model risk, you should provide a justification for why the activities are sufficient.

Figure 3 presents a hypothetical example of this process. In this example, two types of credibility evidence are planned: code verification results (Category 1) and prospectively planned bench test validation results (Category 3). The Category 3 evidence includes both validation and supporting calculation verification results. Model risk was assessed to be Low-Medium. ASME V&V 40 credibility factors are used, and a five-level gradation was defined to grade each credibility factor. Credibility goals were chosen for each factor as indicated in Figure 3.
For credibility factors for which the goal corresponds to a credibility level that is not commensurate with model risk (i.e., the three credibility factors with level 'low'), a rationale should be provided for why the activities are sufficient.

Figure 3: Hypothetical example of setting credibility factor goals. In this example, all activities are assumed to be prospectively planned.

D. Adequacy Assessment

Steps 6 and 8 of the framework assess the adequacy of the credibility-related activities and results. Step 6 is a prospective adequacy assessment, which asks: if the credibility goals are achieved, will the credibility evidence be sufficient to support using the model for the COU, given the risk assessment? Step 8 is a post-study adequacy assessment, which asks: does the available credibility evidence support using the model for the COU, given the risk assessment? Note that adequacy assessment is different from applicability: as per Section V, applicability refers to the relevance of the validation activities to the COU, whereas adequacy assessment considers the totality of the credibility evidence. Also, in contrast to model accuracy, which is quantifiable through validation, model adequacy warrants a careful decision made using engineering and clinical judgement, based on all available information.42

Performing the prospective adequacy assessment (Step 6) is recommended if you plan to request FDA feedback on planned activities via a pre-submission (as described in Step 6 in Section V), to facilitate the evaluation of your proposed rationale for the credibility of the computational model. If performing a prospective adequacy assessment, we recommend that you consider the planned credibility evidence, the proposed credibility goals for each credibility factor, and any other relevant information. The prospective adequacy assessment should include a rationale for why the planned credibility evidence is expected to be sufficient to support using the model for the COU, given the risk assessment.

42 Oberkampf WL and Roy CJ. Verification and Validation in Scientific Computing. Cambridge University Press, 2010.

When performing the post-study adequacy assessment (Step 8), we recommend that you first re-evaluate the credibility level that was achieved for each credibility factor and whether the credibility goal was met. The post-study adequacy assessment should also include a rationale for why the credibility evidence is sufficient to support using the model for the COU, given the risk assessment. The post-study adequacy assessment can also use the COU simulation results, if available, and related information such as the difference between COU model predictions and safety thresholds (see the example below). We recommend that you take into consideration the following questions and recommendations in the post-study adequacy assessment:

Questions:
• Have all relevant features of the model been adequately tested? That is, do the verification, validation and any other credibility evidence sources cover all features of the model relevant to the COU? For example, for models used within device software, have all model-derived device outputs been evaluated as part of the credibility assessment process?
• Were the credibility goals met?
If the goal was not met for one or more factors, we recommend that you provide a justification for why the impact of the unmet credibility factor(s) on the risk (associated with using the model to address the question of interest) is acceptable.

Recommendations:
• You may wish to pre-specify quantitative accuracy targets for the model validation comparison, such that the model will be considered adequate if the accuracy targets are met. Because quantitative accuracy targets will be application-specific, you should still provide a rationale explaining why this level of accuracy is sufficient to support using the model for the COU. Note that even if pre-specified quantitative accuracy targets for model validation were not met, it may still be possible to use the model for the COU if a valid rationale, for example based on further analysis, can be provided. We also recognize that it is not always possible and/or meaningful to pre-specify precise quantitative accuracy targets. In this case, we recommend that you pre-specify how you intend to assess the level of agreement between the model results and the validation data.
• When the question of interest includes information concerning a decision or safety threshold, then as part of the adequacy assessment we recommend considering the proximity of model predictions to such thresholds. That is, how close is the model prediction to the decision or safety threshold? As part of this assessment, it may also be useful to consider estimates of uncertainty in the COU predictions (e.g., based on uncertainty quantification, calculation verification results, and/or model accuracy from the validation comparison) and, if applicable, uncertainty in the value of the decision or safety threshold. Such considerations could be used to further support the adequacy of the model for addressing the question of interest (a minimal sketch of this kind of comparison appears at the end of this section). For example:
  • For a computational model of MR-induced energy absorption of an implantable metallic device, suppose the COU simulations predict that the power deposited into the surrounding tissue is well within acceptable levels and, moreover, that the uncertainty in the predicted power, based on uncertainty quantification and validation, is small. Overall, the 99% confidence interval for the power deposited into the surrounding tissue is well within acceptable levels. This information could be used to further justify the adequacy of the model credibility assessment activities for addressing the question of interest.
• It is important to explicitly state any limitations of the model and to provide a rationale for why they do not reduce confidence in using the model for the COU, referring to the credibility evidence or other scientific knowledge as appropriate.

If you determine the evidence to be insufficient in either the prospective or post-study adequacy assessment, we recommend that you consider modifying the model, reducing the model influence, modifying the COU, and/or revising the plan to generate credibility evidence (prospective adequacy assessment) or collecting additional evidence (post-study adequacy assessment). See ASME V&V 40 for a discussion of these different options.
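For illustration only, the following is a minimal sketch of the threshold-proximity consideration discussed above: comparing a COU prediction, expanded by its estimated uncertainty, to a safety threshold. All values are hypothetical, and the root-sum-square combination of independent uncertainty sources with a Gaussian 99% expansion factor is one assumed approach; the appropriate combination method is application-specific.

import math

prediction = 1.20        # W/kg, COU model prediction of deposited power
u_numerical = 0.05       # W/kg, numerical uncertainty (calculation verification)
u_input = 0.08           # W/kg, output uncertainty from UQ of model inputs
u_validation = 0.10      # W/kg, model error estimated from the validation comparison
safety_threshold = 2.00  # W/kg, acceptable limit for the question of interest

# Assumed approach: combine independent uncertainty sources in quadrature and
# expand to an approximate two-sided 99% interval (z = 2.576).
u_combined = math.sqrt(u_numerical**2 + u_input**2 + u_validation**2)
upper_99 = prediction + 2.576 * u_combined

print(f"combined standard uncertainty = {u_combined:.3f} W/kg")
print(f"99% upper bound = {upper_99:.2f} W/kg vs threshold {safety_threshold:.2f} W/kg")
if upper_99 < safety_threshold:
    print("Prediction plus uncertainty remains well below the safety threshold.")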
Appendix 1. Considerations for Each Credibility Evidence Category

Below are considerations regarding the generation and/or evaluation of credibility evidence for each category of evidence in Section VI.B. Some of the following considerations may not be applicable, depending on the specific details of the modeling performed.

Category 1: Code verification results
• For Step 5 of the framework, we recommend using the credibility factors for code verification defined in ASME V&V 40.
• For computational models implemented within medical device software, note that software verification and validation and model verification and validation are both important but differ in scope and definition. Testing performed for software verification may include code verification of the computational model, although the latter is typically addressed separately and may need consideration of the specific COU. See the software verification and validation reporting recommendations in FDA's guidance titled "Guidance for the Content of Premarket Submissions for Software Contained in Medical Devices"43 and refer to the appropriate tests when describing model code verification activities.
• For computational models that are not part of the device (e.g., in silico device testing, in silico clinical trials), code verification for the model is unrelated to the device software verification and/or validation and is therefore performed separately from device software verification and validation.
• For computational models that are not part of the device (e.g., in silico device testing, in silico clinical trials), if a commercial software package was used to develop the computational model, we recommend referring to any information provided by the software manufacturer on software quality assurance and code verification, as relevant.

43 https://www.fda.gov/regulatory-information/search-fda-guidance-documents/guidance-content-premarket-submissions-software-contained-medical-devices

Category 2: Model calibration evidence
• For Step 5 of the framework, consider defining credibility factors related to goodness of fit, quality of the comparator data, and relevance of the calibration activities to the COU.
• Be cautious not to present calibration evidence as, or confuse it with, validation evidence, and ensure that the data used for calibration are separate from the data used for validation.
• Consider evaluating whether the final values of all calibrated parameters that have a physical/physiological meaning are within expected physical/physiological ranges.
• Consider quantifying the 'goodness of fit.'
• When reporting calibration results, we recommend that you provide details on the following (if applicable):
  • the calibration procedure, including which parameters were calibrated;
  • the prior distributions for these parameters, if a Bayesian calibration approach was used;
  • details of the simulations run, and the source and details of the experimental/comparator data;
  • any steps taken to ensure the model is not overfitted; and
  • the numerical methods used to obtain the calibrated results.
• As discussed in Section VI.B, model calibration evidence is weaker than validation evidence. Therefore, if model calibration evidence is provided as the primary source of credibility evidence, you should provide a rationale for why validation testing of the model is not possible or warranted, for example, by referring to the assessed model risk.
• If no validation results are available and calibration results are the primary source of evidence for model credibility, consider evaluating the relationship between the calibration conditions and the COU conditions, and between the calibration quantities of interest and the COU quantities of interest.

Category 3: Bench test validation results
• For Step 5 of the framework, we recommend using the credibility factors defined in ASME V&V 40.
• If the COU will involve making in vivo predictions, we recommend paying special attention to the applicability of the bench test validation results to the in vivo COU.
• For prospectively planned validation: if possible, we recommend considering blinding the computational analyst(s) performing the simulations to the bench test validation data, to prevent the potential for bias.44
• For validation against retrospective datasets: we recommend that you pay special attention to the applicability of the validation results to the COU, since the comparator data were not designed for validating the model for the current COU.
• For previously generated validation results: we recommend that you pay special attention to the applicability of the previously generated validation results to the COU, since the previous validation results were not designed to support the model for the current COU. This should include an assessment of any differences, and the impact thereof, between the model used in the previous validation results and the current model.

Category 4: In vivo validation results
• For Step 5 of the framework, if the evidence is traditional validation evidence, we recommend using the credibility factors defined in ASME V&V 40.
• If the evidence takes another form (e.g., clinical trial results), we recommend that you generate and evaluate the evidence using the appropriate best practices and methods (e.g., appropriate statistical techniques; appropriate measures such as sensitivity, specificity, and positive predictive value) and applicable regulatory requirements (e.g., good clinical practice regulations45), and that you define appropriate credibility factors for Step 5 of the framework.
• For prospectively planned validation: if possible, we recommend considering blinding the computational analyst(s) performing the simulations to the validation data, to prevent the potential for bias.46
• For validation against retrospective datasets: we recommend that you pay special attention to the applicability of the validation results to the COU, since the comparator data were not designed for validating the model for the current COU.
• For previously generated validation results: we recommend that you pay special attention to the applicability of the previously generated validation results to the COU, since the previous validation results were not designed to support the model for the current COU. This should include an assessment of any differences, and the impact thereof, between the model used in the previous validation results and the current model.

44 See Section 2.5.1 and Section 11.1.4, Oberkampf WL and Roy CJ. Verification and Validation in Scientific Computing. Cambridge University Press, 2010.
45 See Regulations: Good Clinical Practice and Clinical Trials at https://www.fda.gov/science-research/clinical-trials-and-human-subject-protection/regulations-good-clinical-practice-and-clinical-trials
46 See Section 2.5.1 and Section 11.1.4, Oberkampf WL and Roy CJ. Verification and Validation in Scientific Computing. Cambridge University Press, 2010.

Category 5: Population-based evidence
• Consider quantitatively assessing the closeness of the two populations by comparing means, variances, or full distributions, or by using other appropriate statistical methods.
• We recommend that you provide and compare relevant demographic information, anatomy, pathologies, and co-morbidities of the subjects in: (i) the patient data used to generate the virtual cohort; (ii) the clinical dataset used for validation; and (iii) the intended patient population.
• If the evidence comes from a clinical study without subject-level data, we recommend that you generate and evaluate the evidence using the appropriate best practices and methods (e.g., good clinical practices, appropriate statistical techniques), and that you define appropriate credibility factors for Step 5 of the framework.

Category 6: Emergent model behavior
• As discussed in Section VI.B, compared to model validation, emergent model behavior is generally relatively weak evidence for model credibility, because it does not involve direct comparison with experimental data. Therefore, we generally do not recommend relying on emergent model behavior as a primary source of evidence for model credibility, although it may serve as useful secondary evidence.
• Consider evaluating how important or relevant the emergent behavior is to the COU, and explaining why the model's reproduction of the emergent behavior provides confidence in the model for the COU.
• For Step 5 of the framework, we recommend that you define credibility factors for the relevance of the emergent behavior to the COU, the sensitivity of the emergent behavior to model input uncertainty, and others.

Category 7: Model plausibility
• As discussed in Section VI.B, compared to model validation, model plausibility is generally a relatively weak argument for model credibility, because it does not involve testing the model predictions. Therefore, if model plausibility evidence is the primary credibility evidence presented, you should provide a rationale for why validation testing of the model is not possible or warranted, for example, by referring to the assessed model risk.
• Consider evaluating how any assumptions impact predictions by comparing results using alternative model forms, preferably from higher-fidelity models if possible.
• Consider performing uncertainty quantification and sensitivity analysis for the model parameters.
• For Step 5 of the framework, we recommend using the ASME V&V 40 credibility factors related to model form and model inputs, as appropriate.

Category 8: Calculation verification/UQ results using COU simulations
• For calculation verification results: for Step 5 of the framework, we recommend using the three calculation verification credibility factors defined in ASME V&V 40.
• For UQ results: for Step 5 of the framework, we recommend using the model input credibility factors defined in ASME V&V 40.
• If you generate this type of evidence, we recommend incorporating the calculation verification and/or UQ results when comparing COU predictions with any decision thresholds (as discussed in Section VI.D, 'Adequacy Assessment'), taking into account the estimated numerical uncertainty and/or the output uncertainty from UQ.

Appendix 2. Reporting Recommendations for CM&S Credibility Assessment in Medical Device Submissions

In this Appendix, we provide: (a) recommended information to include when requesting feedback on a CM&S credibility assessment plan in a Q-submission, and (b) recommendations for reporting of a CM&S credibility assessment in medical device regulatory submissions.
Requesting FDA Feedback on a Credibility Assessment Plan

We recognize that the generalized framework for assessing model credibility may necessitate interactive feedback from FDA, in particular concerning the model risk assessment and the prospective adequacy assessment (Steps 3 and 6 in Section V, respectively). Manufacturers who wish to receive feedback from FDA on any aspect of their computational modeling and/or credibility assessment can do so using the Q-submission pathway (refer to FDA's guidance titled "Requests for Feedback and Meetings for Medical Device Submissions: The Q-Submission Program"47). If requesting feedback on a plan for credibility assessment, we recommend that you provide information on the preliminary and prospective steps in the framework outlined in Section V (Steps 1-6). The following provides an example of how the Q-submission could be organized:

Possible Content to Include in a Q-submission on a Credibility Assessment Plan:
1. Purpose: The overall purpose of the Q-submission, including goals for the outcome of the interaction with FDA.
2. Background: e.g., clinical context or other relevant background information for the device.
3. Device Description
4. Proposed Indications for Use
5. Regulatory History
6. Description of Computational Model
7. Credibility Assessment Plan
   a. Summary of overall approach
   b. Question of Interest (see Section VI.A.(1))
   c. COU (see Section VI.A.(2))
   d. Model Risk Assessment (see Section VI.A.(3))
   e. Planned Credibility Evidence. For each type of credibility evidence planned, provide the following:
      i. Categorization of evidence per Section VI.B
      ii. Description of evidence to be collected
      iii. Chosen credibility factors (see Section VI.C). For each factor, provide:
         1. Credibility gradation
         2. Proposed credibility goal (or assessed credibility level for previously generated data)
         3. Brief plans for achieving the credibility goal
   f. Prospective Adequacy Assessment (see Section VI.D)
8. Specific Questions for FDA

47 https://www.fda.gov/regulatory-information/search-fda-guidance-documents/requests-feedback-and-meetings-medical-device-submissions-q-submission-program

Recommendations for a Credibility Assessment Report

A Credibility Assessment Report is a self-contained document that can be included as part of a regulatory submission. The report is intended to provide the evidence and rationale for the credibility of CM&S used in a medical device regulatory submission. Below, we provide an example of how a Credibility Assessment Report could be organized. The outline below applies only to CM&S credibility information; it does not provide a recommended format for information pertaining to the model itself. Moreover, for CM&S used in in silico device testing or in silico clinical trials (see Section II), the outline does not provide recommendations for reporting the results of the simulation study. For CM&S used for in silico device testing or in silico clinical trials, refer to FDA's guidance titled "Reporting of Computational Modeling Studies in Medical Device Submissions"48 (hereafter referred to as the "Computational Modeling Reporting Guidance") for reporting model details and study results. In this situation, we recommend that you provide two reports: one report describing the model and study results using the Computational Modeling Reporting Guidance, and a separate Credibility Assessment Report using the outline described below.

48 https://www.fda.gov/regulatory-information/search-fda-guidance-documents/reporting-computational-modeling-studies-medical-device-submissions
In the first report, we recommend that you reference your Credibility Assessment Report as appropriate to provide any credibility-related information recommended by the Computational Modeling Reporting Guidance (i.e., Section III: Code Verification; Section VIII: System Discretization-Calculation Verification; and Section X: Validation).

FDA recognizes that the level of detail included in a Credibility Assessment Report will vary and will depend on the specific discipline, the type of computational modeling, and the COU of the model, among other factors. Because we expect the level of detail to vary for different types of CM&S, we recommend that your Credibility Assessment Report emphasize the rationale/justification used when generating and assessing your credibility evidence. The following outline may be helpful for organizing the content of your Credibility Assessment Report:

Recommended Content for a Credibility Assessment Report:
1. Executive Summary: Include a brief description of the device, the model, the question of interest that the model is used to address, the model COU, the assessed model risk, a summary of the categories of credibility evidence provided, and a summary of the adequacy assessment with a brief rationale.
2. Background: e.g., clinical context or other relevant background for the device. Either provide here or refer to another section in the regulatory submission.
3. Device Description: Include within the report or refer to another section in the regulatory submission.
4. Proposed Indications for Use: Include within the report or refer to another section in the regulatory submission.
5. Description of Computational Model: If model details are included elsewhere in the regulatory submission, we recommend referencing them accordingly. We recommend providing details on the governing equations, model parameter values, methods used to determine parameter values, numerical methods used for solving the governing equations, and other information that could be relevant in evaluating model credibility.
6. Model Credibility Assessment
   a. Summary of overall approach
   b. Question of Interest (see Section VI.A.(1))
   c. COU (see Section VI.A.(2))
   d. Model Risk Assessment (see Section VI.A.(3))
   e. Credibility Evidence. For each type of credibility evidence provided, include the following:
      i. Categorization of evidence per Section VI.B
      ii. Description of evidence
      iii. Chosen credibility factors (see Section VI.C). For each factor, provide:
         1. Credibility gradation;
         2. Prospective credibility goal (if prospectively planned activities) or assessed credibility level (if previously generated data); and
         3. Achieved credibility level (if prospectively planned activities).
      iv. Methods. Full methods may be provided here, or provided elsewhere (e.g., in an appendix to the Credibility Assessment Report or published in a journal article) and referenced here.
      v. Results. As with the methods, full results may be provided here, or provided elsewhere and referenced here.
   f. Post-study Adequacy Assessment (see Section VI.D)
7. Credibility Assessment Limitations
8. Conclusions
9. References
10. Appendices: Detailed descriptions of credibility assessment study methods and results (if needed).