This function generates raw data for study snapshots from the provided participant count,
site count, study ID, and snapshot count. If any of these parameters is NULL, it loads
a template CSV file and iterates over the study configurations defined there. The generated
raw data can be saved as an RDS file (see save).
Usage
raw_data_generator(
  ParticipantCount = NULL,
  SiteCount = NULL,
  StudyID = NULL,
  SnapshotCount = NULL,
  SnapshotWidth = NULL,
  template_path = system.file("template.csv", package = "gsm.datasim"),
  workflow_path = "workflow/1_mappings",
  generate_reports = FALSE,
  mappings = NULL,
  package = "gsm.mapping",
  strStartDate = "2012-01-01",
  save = FALSE,
  generation_mode = c("core", "legacy"),
  lWorkflows = NULL
)
Arguments
- ParticipantCount
An integer specifying the number of participants for the study. If NULL, the function will use values from the template file.
- SiteCount
An integer specifying the number of sites for the study. If NULL, the function will use values from the template file.
- StudyID
A string specifying the study identifier. If NULL, the function will use values from the template file.
- SnapshotCount
An integer specifying the number of snapshots for the study. If NULL, the function will use values from the template file.
- SnapshotWidth
A string specifying the frequency of snapshots. Defaults to "months". Accepts "days", "weeks", "months", and "years", optionally preceded by a number, such as "3 months".
- template_path
A string specifying the path to the template CSV file. Default is system.file("template.csv", package = "gsm.datasim").
- workflow_path
A string specifying the path to the workflow mappings. Default is "workflow/1_mappings".
- generate_reports
A boolean specifying whether to produce reports upon execution. Default is FALSE.
- mappings
A string specifying the names of the workflows to run.
- package
A string specifying the package in which the workflows used in MakeWorkflowList() are located. Default is "gsm.mapping".
- strStartDate
A string denoting the date of the first snapshot of simulated data. Default is "2012-01-01".
- save
A boolean specifying whether to save the generated data as an RDS file. Default is FALSE.
- generation_mode
Generation backend to use: "core" (default) or "legacy".
- lWorkflows
Optional named list of workflow objects (from gsm.core::MakeWorkflowList()). When supplied, the function routes to generate_data_from_workflows() to produce a single-snapshot dataset driven entirely by the workflow spec. SnapshotCount and SnapshotWidth are ignored in this mode.
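To make the interplay of strStartDate, SnapshotCount, and SnapshotWidth more concrete, the sketch below builds a sequence of snapshot dates with base R. It assumes that snapshots are evenly spaced and that the first one falls on the start date, which this page does not confirm. The helper name snapshot_dates is hypothetical and is not part of gsm.datasim.

```r
# Hypothetical helper (not part of gsm.datasim): evenly spaced snapshot dates.
# Assumption: the first snapshot falls exactly on the start date.
snapshot_dates <- function(start_date, snapshot_count, snapshot_width = "months") {
  # seq() on Date objects accepts a unit ("months") or a number and a unit ("3 months")
  seq(as.Date(start_date), by = snapshot_width, length.out = snapshot_count)
}

dates <- snapshot_dates("2012-01-01", snapshot_count = 5, snapshot_width = "3 months")
print(dates)
```

With a width of "3 months", consecutive dates in this sketch are one quarter apart.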
Value
A list of raw data generated for each study snapshot. The list can also be saved as an RDS file at "data-raw/raw_data.RDS" (see save).
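For readers unfamiliar with RDS files, the following sketch shows a list being written and read back with base R. The list contents are placeholders, since the element names and structure of the real return value are not documented here, and a temporary file stands in for "data-raw/raw_data.RDS".

```r
# Placeholder for the returned list; real element names and contents are assumptions.
raw_data <- list(
  snapshot_1 = data.frame(subjid = c("S1", "S2")),
  snapshot_2 = data.frame(subjid = c("S1", "S2", "S3"))
)

# Write to a temporary file rather than data-raw/raw_data.RDS
rds_path <- tempfile(fileext = ".rds")
saveRDS(raw_data, rds_path)

# Read the list back and check that the snapshot names survived the round trip
restored <- readRDS(rds_path)
print(names(restored))
```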
Details
The function performs the following steps:
- If lWorkflows is provided, delegates to generate_data_from_workflows() to produce a spec-driven single snapshot.
- If ParticipantCount, SiteCount, StudyID, or SnapshotCount is NULL, the function reads the template.csv file to get the necessary parameters for multiple studies.
- It generates raw data for study snapshots based on either provided parameters or the template file.
- The generated data is saved as an RDS file and returned as a list.
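The branching in the steps above can be sketched as a small decision function. This is an illustration of the documented order of checks only, not the package's implementation; choose_generation_path and its return labels are hypothetical names introduced here.

```r
# Hypothetical sketch of the documented branching; not the gsm.datasim source.
choose_generation_path <- function(ParticipantCount = NULL, SiteCount = NULL,
                                   StudyID = NULL, SnapshotCount = NULL,
                                   lWorkflows = NULL) {
  # A supplied workflow list takes precedence over everything else
  if (!is.null(lWorkflows)) {
    return("workflows")
  }
  # Any missing core parameter means the template CSV supplies the configurations
  core_args <- list(ParticipantCount, SiteCount, StudyID, SnapshotCount)
  if (any(vapply(core_args, is.null, logical(1)))) {
    return("template")
  }
  # Otherwise the explicitly provided parameters are used
  "parameters"
}

path_default <- choose_generation_path()
path_explicit <- choose_generation_path(100, 10, "Study01", 5)
path_workflows <- choose_generation_path(lWorkflows = list(AE = list()))
```

In this sketch a partially specified call, such as one giving only ParticipantCount, falls through to the template branch, matching the "any of these parameters is NULL" rule.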
Examples
if (FALSE) { # \dontrun{
  # Generate raw data using specified parameters
  data <- raw_data_generator(
    ParticipantCount = 100,
    SiteCount = 10,
    StudyID = "Study01",
    SnapshotCount = 5,
    SnapshotWidth = "months"
  )
  # Generate raw data using a template file
  data <- raw_data_generator()
  # Generate raw data from workflows
  wf <- gsm.core::MakeWorkflowList(
    strPath = "workflow/1_mappings",
    strPackage = "gsm.mapping"
  )
  data <- raw_data_generator(lWorkflows = wf, ParticipantCount = 200, SiteCount = 20)
} # }