Introduction

Qualification for this repository ensures that the package functions as intended and that core functions execute as expected at the system level. Unit tests should still be written to test individual pieces of code; qualification tests confirm that the expected behaviors occur correctly. Qualification uses a set of machine-readable documents and associated functions to create a strong documentation structure and a cohesive qualification report. The process will be extended as new assessments are added and should be updated whenever changes affect the workflows covered by qualification tests. Qualification tests are designed to give developers a repeatable process that is easy to update and document.

Process Overview

Each GSM assessment is independently qualified using Specifications and Test Cases, which are then compiled into a Qualification Report.

  • Specifications - the expected behaviors that are being tested.
  • Test Cases - testable pieces of code associated with Specifications.
  • Qualification Report - Summary snapshot of all qualification activity.

Specifications

Specifications should capture the most important use cases for a given function. Each function must have at least one (1) specification, and each specification must have at least one (1) associated test case. Multiple specifications may exist for a function, and multiple test cases may exist for a specification.

Each Specification should include the following components:

  • Description - Outlines the use case for the specification.
  • Risk Assessment - An evaluation of risk for the use case. Includes 2 components:
    • Risk Level - Risk Level can be “Low,” “Medium,” or “High,” corresponding to the risk associated with the specification failing.
    • Risk Impact - Risk Impact can be “Low,” “Medium,” or “High,” corresponding to the severity of the impact associated with the specification failing.
  • Test Cases - A list of test cases associated with the specification.

The specifications (including Description, Risk Level, and Risk Impact) should be documented in qualification_specs.csv, which is rendered by the Qualification Report described in a later section. For example, the specifications in qualification_specs.csv are written as:

Spec ID | Spec Description | Risk | Impact | Associated Test IDs
S1_1 | Given raw participant-level data, all necessary data.frame transformations are made to create input data for all workflows | High | High | T1_1
S2_1 | Given raw participant-level data, a properly specified Workflow for a KRI creates summarized and flagged data | High | High | T2_1
S2_2 | Given raw participant-level data with missingness, a properly specified Workflow for a KRI creates summarized and flagged data | High | High | T2_2
S3_1 | Given pre-processed input data, a properly specified Workflow for a KRI creates summarized and flagged data | High | High | T3_1
S4_1 | Given appropriate metadata (i.e. vThresholds), flagged observations are properly marked in summary data | High | High | T4_1
S4_2 | Given appropriate metadata (i.e. vThresholds), data.frame of bounds can be created | High | High | T4_2
S5_1 | Given appropriate raw participant-level data, flag values can be correctly assigned to records that meet flagging criteria, including custom thresholding. | High | High | T5_1
S5_2 | Given appropriate raw participant-level data, flag values are correctly assigned as NA for sites with low enrollment. | High | High | T5_2
S6_1 | Given appropriate raw participant-level data, an Adverse Event Assessment can be done using the Normal Approximation method. | High | High | T6_1
S6_2 | Adverse Event Assessments can be done correctly using a grouping variable, such as Site or Country for KRIs, and Study for QTLs, when applicable. | High | High | T6_2
S7_1 | Given appropriate raw participant-level data, a Protocol Deviation Assessment can be done using the Normal Approximation method. | High | High | T7_1
S7_2 | Protocol Deviation Assessments can be done correctly using a grouping variable, such as Site or Country for KRIs, and Study for QTLs, when applicable. | High | High | T7_2
S8_1 | Given appropriate raw participant-level data, a Disposition Assessment can be done using the Normal Approximation method. | High | High | T8_1
S8_2 | Disposition Assessments can be done correctly using a grouping variable, such as Site or Country for KRIs, and Study for QTLs, when applicable. | High | High | T8_2
S9_1 | Given appropriate raw participant-level data, a Labs Assessment can be done using the Normal Approximation method. | High | High | T9_1
S9_2 | Labs Assessments can be done correctly using a grouping variable, such as Site or Country for KRIs, and Study for QTLs, when applicable. | High | High | T9_2
S10_1 | Given appropriate raw participant-level data, a Data Change Rate Assessment can be done using the Normal Approximation method. | High | High | T10_1
S10_2 | Data Change Rate Assessments can be done correctly using a grouping variable, such as Site, Country, or Study, when applicable. | High | High | T10_2
S11_1 | Given appropriate raw participant-level data, a Data Entry Lag Assessment can be done using the Normal Approximation method. | High | High | T11_1
S11_2 | Data Entry Lag Assessments can be done correctly using a grouping variable, such as Site, Country, or Study, when applicable. | High | High | T11_2
S12_1 | Given appropriate raw participant-level data, a Query Age Assessment can be done using the Normal Approximation method. | High | High | T12_1
S12_2 | Query Age Assessments can be done correctly using a grouping variable, such as Site, Country, or Study, when applicable. | High | High | T12_2
S13_1 | Given appropriate raw participant-level data, a Query Rate Assessment can be done using the Normal Approximation method. | High | High | T13_1
S13_2 | Query Rate Assessments can be done correctly using a grouping variable, such as Site, Country, or Study, when applicable. | High | High | T13_2
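
The rules for specifications (unique IDs, risk values limited to Low, Medium, or High, and at least one test case per specification) lend themselves to an automated check. The following is a minimal sketch, not the package's own validation: the column names (ID, Description, Risk, Impact, Tests) and the helper check_specs() are assumptions made for illustration, and the data frame is an in-memory stand-in for qualification_specs.csv.

```r
# Hypothetical in-memory stand-in for qualification_specs.csv.
# The column names are assumptions, not the file's confirmed header.
specs <- data.frame(
  ID = c("S1_1", "S2_1", "S2_2"),
  Description = c(
    "Given raw participant-level data, all necessary data.frame transformations are made to create input data for all workflows",
    "Given raw participant-level data, a properly specified Workflow for a KRI creates summarized and flagged data",
    "Given raw participant-level data with missingness, a properly specified Workflow for a KRI creates summarized and flagged data"
  ),
  Risk = c("High", "High", "High"),
  Impact = c("High", "High", "High"),
  Tests = c("T1_1", "T2_1", "T2_2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns a character vector of problems.
# An empty vector means every rule is satisfied.
check_specs <- function(specs) {
  allowed <- c("Low", "Medium", "High")
  problems <- character(0)

  # Each specification needs its own ID.
  if (anyDuplicated(specs$ID) > 0) {
    problems <- c(problems, "Spec IDs must be unique")
  }

  # Risk Level and Risk Impact must be Low, Medium, or High.
  bad_risk <- !(specs$Risk %in% allowed) | !(specs$Impact %in% allowed)
  if (any(bad_risk)) {
    problems <- c(problems, paste("Invalid risk value for", specs$ID[bad_risk]))
  }

  # Each specification must list at least one test case.
  no_tests <- is.na(specs$Tests) | trimws(specs$Tests) == ""
  if (any(no_tests)) {
    problems <- c(problems, paste("No test case listed for", specs$ID[no_tests]))
  }

  problems
}

spec_problems <- check_specs(specs)
```

Returning a list of problems rather than stopping at the first one lets a reviewer see every issue in a single pass.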

Test Cases

Test cases translate specifications into testable scripts that confirm the package functions meet the established requirements. Test cases should be representative of how a user would use the function, to help identify code gaps and support test automation. Test cases are linked to the assessment as shown above.

Test cases are written using the standard testthat workflow and saved in tests/testqualification/qualification. Each test case should be saved as an individual file and named using the convention test_qual_{TestID}.R, where TestID corresponds to the test case number. Test code within these scripts should be written clearly and concisely to facilitate quick execution, review, and interpretation. Test cases should also have an informative description to outline what is being tested.
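
The naming convention can be sketched in a few lines of base R. The helpers qual_test_path() and is_qual_test_file() are hypothetical, and the pattern assumes that test IDs look like T<number>_<number>, which is inferred from IDs such as T1_1 and T13_2 rather than stated by the package.

```r
# Directory for qualification tests, as given in the text above.
qual_dir <- file.path("tests", "testqualification", "qualification")

# Hypothetical helper: build the expected file path for a test ID.
qual_test_path <- function(test_id, dir = qual_dir) {
  file.path(dir, sprintf("test_qual_%s.R", test_id))
}

# Hypothetical helper: check that a file name follows test_qual_{TestID}.R.
# Assumes test IDs have the form T<number>_<number>.
is_qual_test_file <- function(path) {
  grepl("^test_qual_T[0-9]+_[0-9]+\\.R$", basename(path))
}

qual_test_path("T2_1")
# "tests/testqualification/qualification/test_qual_T2_1.R"
```

A check like is_qual_test_file() could be run over the files in the directory to catch misnamed test files before the report is rendered.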

Note that a test case can be associated with multiple specifications. For example, S1_1 includes tests (T1_1, T1_2, T1_3) that check that the AE_Assess function performs properly for the Poisson method. Each of these tests checks whether the Poisson method output is accurate when the data is grouped by a different grouping variable (Site, Study, and Custom, respectively). In addition, the input data for T1_1 and T1_2 are a subset of a larger data frame, so T1_1 and T1_2 also test whether the AE_Assess function performs appropriately when given a subset of the input data, which satisfies spec S1_6.

A simple example test case is shown below. For this test case the file would be called test_qual_T21_1.R and would correspond to T21_1 in the specifications:
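
As a minimal sketch of what a file such as test_qual_T21_1.R might contain, the following uses the standard testthat workflow. The function flag_by_threshold() and the test body are hypothetical stand-ins and are not taken from gsm; a real qualification test would call the package function named in the specification.

```r
library(testthat)

# Hypothetical stand-in for the package function under test.
# Returns -1 or 1 for scores beyond the threshold and 0 otherwise.
flag_by_threshold <- function(scores, threshold) {
  ifelse(abs(scores) > threshold, sign(scores), 0)
}

# The description states what is being tested and names the test ID.
test_that("Given scores and a threshold, values beyond the threshold are flagged (T21_1)", {
  scores <- c(-3.2, -0.5, 0, 1.1, 2.8)
  flags <- flag_by_threshold(scores, threshold = 2)

  expect_equal(flags, c(-1, 0, 0, 0, 1))
  expect_length(flags, length(scores))
})
```

Including the test ID in the description is one way to keep results traceable to the specifications, though the package may link them differently.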

Qualification Report

The Qualification Report is generated to document and display the qualification that the code has been through. The report lives as a Qualification vignette in gsm and is rendered during other workflows. The Qualification Report is also attached to each release and included in the pkgdown site to display the qualification status of gsm. The sections of the Qualification Report are outlined below.

Qualification Testing Results

Using the specifications, test cases, and test code outlined above, the report renders the qualification status of all assessments currently qualified within gsm, with a smaller section for each assessment. Each section names the procedure being qualified, which should correspond to the function used for that procedure. An overview of the specifications is also included, listing the ID, Description, Risk Level, Risk Impact, and associated test cases for each specification. This information is pulled from the specification spreadsheet (qualification_specs.csv) outlined above.

Test Results: Overview

An overview of the qualification test results is presented as a table, with one row for each function that has been tested. The results are presented as a series of columns for the number of tests, number of passing tests, number of failing tests, and number of skipped tests.

Function Name | Number of Tests | Number Passed | Number Failed | Number Skipped
Adverse Event Assessment | 52 | 52 | 0 | 0
Analysis workflow | 22 | 21 | 0 | 0
Analyze_NormalApprox_PredictBounds | 19 | 19 | 0 | 0
Data Change Rate Assessment | 19 | 19 | 0 | 0
Data Entry Lag Assessment | 19 | 19 | 0 | 0
Disposition Assessment | 35 | 35 | 0 | 0
Flag_NormalApprox | 7 | 7 | 0 | 0
Labs Assessment | 19 | 19 | 0 | 0
Mapping workflow | 46 | 46 | 0 | 0
Protocol Deviation Assessment | 30 | 30 | 0 | 0
Query Age Assessment | 35 | 35 | 0 | 0
Query Rate Assessment | 35 | 35 | 0 | 0
Summarize | 16 | 16 | 0 | 0
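
An overview table of this shape can be derived from per-test results with base R. This is a sketch under assumptions: the input data frame, its column names (Function, Result), and the helper summarize_results() are hypothetical, and the report may build its table differently.

```r
# Hypothetical per-test results; in practice these would come from the
# executed qualification tests.
results <- data.frame(
  Function = c("Summarize", "Summarize", "Flag_NormalApprox", "Flag_NormalApprox"),
  Result = c("Pass", "Pass", "Pass", "Fail"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per function with test counts by status.
summarize_results <- function(results) {
  # Fix the levels so every status gets a column even when its count is 0.
  status <- factor(results$Result, levels = c("Pass", "Fail", "Skip"))
  counts <- table(results$Function, status)
  data.frame(
    Function = rownames(counts),
    Tests = as.vector(rowSums(counts)),
    Passed = as.vector(counts[, "Pass"]),
    Failed = as.vector(counts[, "Fail"]),
    Skipped = as.vector(counts[, "Skip"]),
    stringsAsFactors = FALSE
  )
}

overview <- summarize_results(results)
```

Because the total is computed from the three status counts, each row of the result satisfies Tests = Passed + Failed + Skipped, which is a useful consistency check on any overview table.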

Test Results: Detailed

A detailed summary of the qualification test results is also provided in table format in the Qualification Report. This section presents two tables: in the first, each row corresponds to a single specification; in the second, each row corresponds to a single test.

  • One Row Per Specification - Each row corresponds to a specification and gives a general description of the functionality tested, along with the risk level, risk impact, and associated test IDs. In most cases, there are multiple test IDs associated with each specification.
Spec ID | Spec Description | Risk | Impact | Associated Test IDs
S1_1 | Given raw participant-level data, all necessary data.frame transformations are made to create input data for all workflows | High | High | T1_1
S2_1 | Given raw participant-level data, a properly specified Workflow for a KRI creates summarized and flagged data | High | High | T2_1
S2_2 | Given raw participant-level data with missingness, a properly specified Workflow for a KRI creates summarized and flagged data | High | High | T2_2
S3_1 | Given pre-processed input data, a properly specified Workflow for a KRI creates summarized and flagged data | High | High | T3_1
S4_1 | Given appropriate metadata (i.e. vThresholds), flagged observations are properly marked in summary data | High | High | T4_1
S4_2 | Given appropriate metadata (i.e. vThresholds), data.frame of bounds can be created | High | High | T4_2
S5_1 | Given appropriate raw participant-level data, flag values can be correctly assigned to records that meet flagging criteria, including custom thresholding. | High | High | T5_1
S5_2 | Given appropriate raw participant-level data, flag values are correctly assigned as NA for sites with low enrollment. | High | High | T5_2
S6_1 | Given appropriate raw participant-level data, an Adverse Event Assessment can be done using the Normal Approximation method. | High | High | T6_1
S6_2 | Adverse Event Assessments can be done correctly using a grouping variable, such as Site or Country for KRIs, and Study for QTLs, when applicable. | High | High | T6_2
S7_1 | Given appropriate raw participant-level data, a Protocol Deviation Assessment can be done using the Normal Approximation method. | High | High | T7_1
S7_2 | Protocol Deviation Assessments can be done correctly using a grouping variable, such as Site or Country for KRIs, and Study for QTLs, when applicable. | High | High | T7_2
S8_1 | Given appropriate raw participant-level data, a Disposition Assessment can be done using the Normal Approximation method. | High | High | T8_1
S8_2 | Disposition Assessments can be done correctly using a grouping variable, such as Site or Country for KRIs, and Study for QTLs, when applicable. | High | High | T8_2
S9_1 | Given appropriate raw participant-level data, a Labs Assessment can be done using the Normal Approximation method. | High | High | T9_1
S9_2 | Labs Assessments can be done correctly using a grouping variable, such as Site or Country for KRIs, and Study for QTLs, when applicable. | High | High | T9_2
S10_1 | Given appropriate raw participant-level data, a Data Change Rate Assessment can be done using the Normal Approximation method. | High | High | T10_1
S10_2 | Data Change Rate Assessments can be done correctly using a grouping variable, such as Site, Country, or Study, when applicable. | High | High | T10_2
S11_1 | Given appropriate raw participant-level data, a Data Entry Lag Assessment can be done using the Normal Approximation method. | High | High | T11_1
S11_2 | Data Entry Lag Assessments can be done correctly using a grouping variable, such as Site, Country, or Study, when applicable. | High | High | T11_2
S12_1 | Given appropriate raw participant-level data, a Query Age Assessment can be done using the Normal Approximation method. | High | High | T12_1
S12_2 | Query Age Assessments can be done correctly using a grouping variable, such as Site, Country, or Study, when applicable. | High | High | T12_2
S13_1 | Given appropriate raw participant-level data, a Query Rate Assessment can be done using the Normal Approximation method. | High | High | T13_1
S13_2 | Query Rate Assessments can be done correctly using a grouping variable, such as Site, Country, or Study, when applicable. | High | High | T13_2
  • One Row Per Test - Each row corresponds to a single test (Test ID), and each test is presented with the function which is tested, the specification IDs the test satisfies, the detailed test description (including grouping variables and other function arguments that are tested), and the result of the test (Pass/Fail/Skip).
Function | Spec ID | Test ID | Test Description | Test Result
Mapping workflow | S1_1 | T1_1 | mappings now done by individual domain, test that inputs and outputs of priority 1 mappings are completed as expected | Pass
Mapping workflow | S1_1 | T1_1 | mappings now done by individual domain, test that inputs and outputs of priority 2 mappings are completed as expected | Pass
Mapping workflow | S1_1 | T1_1 | mappings now done by individual domain, test that inputs and outputs of priority 3 mappings are completed as expected | Pass
Analysis workflow | S2_1 | T2_1 | Given raw participant-level data, a properly specified Workflow for a KRI creates summarized and flagged data | Pass
Analysis workflow | S2_2 | T2_2 | Given raw participant-level data with missingness, a properly specified Workflow for a KRI creates summarized and flagged data | Pass
Analysis workflow | S3_1 | T3_1 | Given pre-processed input data, a properly specified Workflow for a KRI creates summarized and flagged data | Pass
Flag_NormalApprox | S4_1 | T4_1 | Given appropriate metadata (i.e. vThresholds), flagged observations are properly marked in summary data | Pass
Analyze_NormalApprox_PredictBounds | S4_2 | T4_2 | Given appropriate metadata (i.e. vThresholds), bounds are properly applied to generate flags | Pass
Summarize | S5_1 | T5_1 | Given appropriate raw participant-level data, flag values can be correctly assigned to records that meet flagging criteria, including custom thresholding. | Pass
Summarize | S5_2 | T5_2 | Given appropriate raw participant-level data, flag values are correctly assigned as NA for sites with low enrollment. | Pass
Adverse Event Assessment | S6_1 | T6_1 | Given appropriate raw participant-level data, an Adverse Event Assessment can be done using the Normal Approximation method. | Pass
Adverse Event Assessment | S6_2 | T6_2 | Adverse Event Assessments can be done correctly using a grouping variable, such as Site or Country for KRIs, and Study for QTLs, when applicable. | Pass
Protocol Deviation Assessment | S7_1 | T7_1 | Given appropriate raw participant-level data, a Protocol Deviation Assessment can be done using the Normal Approximation method. | Pass
Protocol Deviation Assessment | S7_2 | T7_2 | Protocol Deviation Assessments can be done correctly using a grouping variable, such as Site or Country for KRIs, and Study for QTLs, when applicable. | Pass
Disposition Assessment | S8_1 | T8_1 | Given appropriate raw participant-level data, a Disposition Assessment can be done using the Normal Approximation method. | Pass
Disposition Assessment | S8_2 | T8_2 | Disposition Assessments can be done correctly using a grouping variable, such as Site or Country for KRIs, and Study for QTLs, when applicable. | Pass
Labs Assessment | S9_1 | T9_1 | Given appropriate raw participant-level data, a Labs Assessment can be done using the Normal Approximation method. | Pass
Labs Assessment | S9_2 | T9_2 | Labs Assessments can be done correctly using a grouping variable, such as Site or Country for KRIs, and Study for QTLs, when applicable. | Pass
Data Change Rate Assessment | S10_1 | T10_1 | Given appropriate raw participant-level data, a Data Change Rate Assessment can be done using the Normal Approximation method. | Pass
Data Change Rate Assessment | S10_2 | T10_2 | Data Change Rate Assessments can be done correctly using a grouping variable, such as Site, Country, or Study, when applicable. | Pass
Data Entry Lag Assessment | S11_1 | T11_1 | Given appropriate raw participant-level data, a Data Entry Lag Assessment can be done using the Normal Approximation method. | Pass
Data Entry Lag Assessment | S11_2 | T11_2 | Data Entry Lag Assessments can be done correctly using a grouping variable, such as Site, Country, or Study, when applicable. | Pass
Query Age Assessment | S12_1 | T12_1 | Given appropriate raw participant-level data, a Query Age Assessment can be done using the Normal Approximation method. | Pass
Query Age Assessment | S12_2 | T12_2 | Query Age Assessments can be done correctly using a grouping variable, such as Site, Country, or Study, when applicable. | Pass
Query Rate Assessment | S13_1 | T13_1 | Given appropriate raw participant-level data, a Query Rate Assessment can be done using the Normal Approximation method. | Pass
Query Rate Assessment | S13_2 | T13_2 | Query Rate Assessments can be done correctly using a grouping variable, such as Site, Country, or Study, when applicable. | Pass
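
The relationship between the two detailed tables can be sketched as an expansion from one row per specification to one row per specification and test pair. The column names, the comma separator for multiple test IDs, the extra test ID T6_3, and the helper expand_spec_tests() are all assumptions made for illustration.

```r
# Hypothetical specifications. Tests holds comma-separated test IDs;
# the separator and the ID T6_3 are assumptions for this example.
specs <- data.frame(
  ID = c("S6_1", "S6_2"),
  Tests = c("T6_1", "T6_2, T6_3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per (specification, test) pair.
expand_spec_tests <- function(specs) {
  test_ids <- strsplit(specs$Tests, ",")
  data.frame(
    # Repeat each spec ID once for every test listed against it.
    SpecID = rep(specs$ID, lengths(test_ids)),
    TestID = trimws(unlist(test_ids)),
    stringsAsFactors = FALSE
  )
}

spec_tests <- expand_spec_tests(specs)
```

The expanded table can then be joined to test results by TestID to produce the one-row-per-test view.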

Unit Tests

A summary of the unit test coverage is included in the Qualification Report to show how well the package functions are unit tested. This is created by using covr::package_coverage() and then listed out by function. Unit testing is performed in addition to qualification testing to help ensure that individual pieces of code within the R package function correctly and produce the expected results. By testing individual units of code in isolation, developers can identify and fix issues early in the development process before more significant and scaled problems arise.
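
Listing coverage by function amounts to grouping line-level coverage records and computing the share of covered lines. The sketch below works on a hypothetical data frame so that it runs without covr; with covr installed, a similar data frame (with functions and value columns) might be obtained from covr::tally_coverage(covr::package_coverage()), though the exact column layout should be checked against the covr documentation. The helper coverage_by_function() is hypothetical.

```r
# Hypothetical line-level coverage records, one row per line of code.
line_coverage <- data.frame(
  functions = c("Flag", "Flag", "Flag", "Summarize", "Summarize"),
  value = c(3, 0, 1, 2, 2),  # number of times each line was executed
  stringsAsFactors = FALSE
)

# Hypothetical helper: percentage of lines executed at least once,
# reported per function.
coverage_by_function <- function(line_coverage) {
  covered <- line_coverage$value > 0
  pct <- tapply(covered, line_coverage$functions, function(x) 100 * mean(x))
  data.frame(
    Function = names(pct),
    Coverage = round(as.vector(pct), 1),
    stringsAsFactors = FALSE
  )
}

coverage <- coverage_by_function(line_coverage)
```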

Qualification Testing Environment

The sessionInfo() of the qualification environment is included to show what R version, platform, and packages were used when running the Qualification Report. This is called after all necessary packages have been loaded and all setup is done. The environment should not change after this part of the report is created. In addition, a package list is provided, which includes the package version and package score from riskmetric, which quantifies the robustness of an R package. The pkg_score column captures the risk involved with using a package. The risk level ranges from 0 (low risk) to 1 (high risk).
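
Capturing the environment can be sketched with base R alone. This example records the session details and builds a package list with versions; it does not compute riskmetric scores, and the variable names env_info and env_packages are assumptions for illustration.

```r
# Capture the session details once all packages are loaded and setup is done.
env_info <- utils::sessionInfo()

# List every loaded namespace with its installed version.
pkgs <- sort(loadedNamespaces())
env_packages <- data.frame(
  package = pkgs,
  version = vapply(
    pkgs,
    function(p) as.character(utils::packageVersion(p)),
    character(1)
  ),
  row.names = NULL,
  stringsAsFactors = FALSE
)
```

Taking this snapshot as the last setup step matches the requirement that the environment should not change after this part of the report is created.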

Pull Requests

The final section of the Qualification Report is an overview of all Pull Requests since the last release. This includes the title, compare and base branches, a link to the GitHub page, requester, reviewers, date requested, and the status of the Pull Request. While this is meant to be a comprehensive overview of the Pull Requests, the release documentation should also include links to all Pull Requests included in the release.