Add columns flagging sites that represent possible statistical outliers when the Identity statistical method is used.
Arguments
- dfAnalyzed
data.frame
where flags should be added.
- strColumn
character
Name of the column to use for thresholding. Default: "Score"
- vThreshold
numeric
Vector of 2 numeric values representing lower and upper threshold values. All values in strColumn are compared to vThreshold using strict comparisons. Values less than the lower threshold or greater than the upper threshold are flagged. Values equal to the threshold values are set to 0 (i.e., not flagged). If NA is provided for either threshold value, it is ignored and no values are flagged based on that threshold. NA and NaN values in strColumn are given NA flag values.
- strValueColumn
character
Name of the column to use for the sign of Flag. If the value for that row is higher than the median of strValueColumn, then Flag is set to 1. Similarly, if the value for that row is lower than the median of strValueColumn, then Flag is set to -1.
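The thresholding and sign rules described in the arguments above can be sketched in base R. This is a minimal illustration of the documented behavior, not the package's implementation: `flag_sketch()`, `dfExample`, and the example values are hypothetical, and the sketch assumes that the sign step only touches rows already flagged by the thresholds.

```r
# Hypothetical helper, not part of gsm: illustrates the documented rules only.
flag_sketch <- function(df, strColumn = "Score", vThreshold, strValueColumn = NULL) {
  stopifnot(is.data.frame(df), length(vThreshold) == 2)
  x <- df[[strColumn]]
  flag <- rep(0, length(x))

  # Strict comparisons: values equal to a threshold stay 0.
  # A threshold given as NA is skipped entirely.
  if (!is.na(vThreshold[1])) flag[!is.na(x) & x < vThreshold[1]] <- -1
  if (!is.na(vThreshold[2])) flag[!is.na(x) & x > vThreshold[2]] <- 1

  # NA and NaN scores get an NA flag (is.na() is TRUE for NaN as well).
  flag[is.na(x)] <- NA

  # Optional sign step (assumption: applied only to rows already flagged).
  if (!is.null(strValueColumn)) {
    v <- df[[strValueColumn]]
    med <- stats::median(v, na.rm = TRUE)
    flagged <- !is.na(flag) & flag != 0 & !is.na(v)
    flag[flagged & v > med] <- 1
    flag[flagged & v < med] <- -1
  }

  df$Flag <- flag
  df
}

# Made-up example data for illustration.
dfExample <- data.frame(
  GroupID = c("A", "B", "C", "D", "E"),
  Score   = c(0.0005, 0.001, 0.005, 0.02, NA),
  Metric  = c(10, 2, 3, 1, 5)
)

# Thresholds only: B sits exactly on the lower threshold, so it is not flagged.
flag_sketch(dfExample, vThreshold = c(0.001, 0.01))$Flag
#> [1] -1  0  0  1 NA

# With a value column, the sign of flagged rows follows the median of Metric (3).
flag_sketch(dfExample, vThreshold = c(0.001, 0.01), strValueColumn = "Metric")$Flag
#> [1]  1  0  0 -1 NA
```

In the second call, sites A and D keep their flagged status but swap sign, because the sign is taken from `Metric` relative to its median rather than from which threshold was crossed.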
Details
This function provides a generalized framework for flagging sites as part of the GSM data model (see vignette("DataModel")).
Data Specification
Flag is designed to support the input data (dfAnalyzed) from the Analyze_Identity() function. At a minimum, the input data must have a strGroupCol parameter and a numeric strColumn parameter defined. strColumn will be compared to the specified thresholds in vThreshold to define a new Flag column, which identifies possible statistical outliers. If a user would like to see the directionality of those identified points, they can define the strValueColumn parameter, which will assign a positive or negative sign to points that are already flagged.
The following columns are considered required:
GroupID - Group ID; default is SiteID
GroupLevel - Group Type
strColumn - A column to use for thresholding
The following column is considered optional:
strValueColumn - A column to be used for the sign/directionality of the flagging
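As a rough illustration of the column requirements listed above, the following base R sketch checks a data frame before flagging. The helper `check_flag_input()` and the example data are hypothetical and are not part of the package; the column names follow the lists above, with "Score" assumed as the thresholding column.

```r
# Hypothetical pre-flight check, not part of gsm.
check_flag_input <- function(df, strColumn = "Score", strValueColumn = NULL) {
  stopifnot(is.data.frame(df))

  # Required columns, plus the optional value column when one is requested.
  expected <- c("GroupID", "GroupLevel", strColumn, strValueColumn)
  missing_cols <- setdiff(expected, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }

  # The thresholding column must be numeric.
  if (!is.numeric(df[[strColumn]])) {
    stop("`", strColumn, "` must be numeric.")
  }

  invisible(TRUE)
}

# Made-up input with the three required columns.
dfInput <- data.frame(
  GroupID    = c("S01", "S02", "S03"),
  GroupLevel = "Site",
  Score      = c(0.0004, 0.005, 0.03)
)

check_flag_input(dfInput)
```

Requesting a value column that is absent, for example `check_flag_input(dfInput, strValueColumn = "Metric")`, stops with an error naming the missing column.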
Examples
dfTransformed <- Transform_Count(analyticsInput, strCountCol = "Numerator")
dfAnalyzed <- Analyze_Identity(dfTransformed)
#> `Score` column created from `Metric`.
dfFlagged <- Flag(dfAnalyzed, vThreshold = c(0.001, 0.01))