
Plot Code Saturation by Quality Indicators
plot_saturation.Rd
Creates a horizontal stacked or dodged bar plot visualizing counts or proportions of quality indicator annotations per code. Only codes that have all specified quality indicators present at least once (count > 0) are shown.
Usage
plot_saturation(
  df_all_summary,
  df_qual_summary,
  qual_indicators = NULL,
  min_counts = NULL,
  stacked = TRUE,
  as_proportion = FALSE
)
Arguments
- df_all_summary
A data frame (tibble) summarizing codes. Must contain at least the Code and total_preferred_coder columns.
- df_qual_summary
A data frame (tibble) containing quality indicator counts per code. Must have a Code column and columns named exactly as in qual_indicators.
- qual_indicators
A character vector of quality indicator names to plot (e.g., c("Priority excerpt", "Heterogeneity")). These determine which columns to use and filter on.
- min_counts
Optional named numeric vector giving the minimum count each quality indicator must reach for a code to be included (e.g., c("Priority excerpt" = 20, "Heterogeneity" = 30)). Codes whose count falls below the threshold for the respective quality indicator are excluded.
- stacked
Logical; if TRUE (default), bars for quality indicators are stacked; if FALSE, bars are dodged (side-by-side).
- as_proportion
Logical; if TRUE, the y-axis represents proportions of counts per code rather than raw counts. Defaults to FALSE.
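To make the min_counts and as_proportion arguments concrete, here is a minimal base R sketch on invented data. It is not the package's implementation: the toy counts are made up, and the proportion denominator (a code's summed indicator counts) is an assumption about how "proportions of counts per code" is computed.

```r
# Toy stand-in for df_qual_summary: one row per code, one column per indicator.
df_qual_summary <- data.frame(
  Code = c("Access", "Cost", "Trust"),
  `Priority excerpt` = c(25, 10, 40),
  Heterogeneity = c(35, 50, 5),
  check.names = FALSE,       # keep the space in "Priority excerpt"
  stringsAsFactors = FALSE
)

min_counts <- c("Priority excerpt" = 20, "Heterogeneity" = 30)

# A code is kept only if every indicator named in min_counts meets its threshold.
meets_min <- rep(TRUE, nrow(df_qual_summary))
for (ind in names(min_counts)) {
  meets_min <- meets_min & df_qual_summary[[ind]] >= min_counts[[ind]]
}
kept <- df_qual_summary[meets_min, , drop = FALSE]

# Proportions, read here as each indicator's share of a code's summed
# indicator counts (assumed denominator, see the note above).
counts <- as.matrix(df_qual_summary[, names(min_counts)])
props <- counts / rowSums(counts)
```

In this toy data only "Access" clears both thresholds, and each row of props sums to 1.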
Value
A ggplot object displaying the counts or proportions of quality indicator annotations by code.
Details
The function displays only codes whose counts are greater than zero for all specified quality indicators.
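The greater-than-zero rule can be sketched in base R as follows. The data are invented and the code only illustrates the rule as described; the function's internal code may differ.

```r
qual_indicators <- c("Priority excerpt", "Heterogeneity")

# Toy indicator counts; "Cost" and "Trust" each have one indicator at zero.
df_qual_summary <- data.frame(
  Code = c("Access", "Cost", "Trust"),
  `Priority excerpt` = c(4, 0, 7),
  Heterogeneity = c(2, 5, 0),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# TRUE for rows where every selected indicator column is positive.
positive <- df_qual_summary[, qual_indicators, drop = FALSE] > 0
all_present <- rowSums(positive) == length(qual_indicators)

shown <- df_qual_summary[all_present, , drop = FALSE]
```

Here only "Access" would remain, because the other two codes each lack one of the requested indicators.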
The plot orders codes by descending total counts from total_preferred_coder.
Colors are generated with a discrete gradient palette for visual clarity.
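As a rough illustration of the ordering and colouring just described, the base R sketch below sorts toy codes by total_preferred_coder and builds one colour per indicator. The endpoint colours and the use of grDevices::colorRampPalette() are assumptions made for this sketch; the palette the function actually uses is not specified here.

```r
# Toy stand-in for df_all_summary.
df_all_summary <- data.frame(
  Code = c("Access", "Cost", "Trust"),
  total_preferred_coder = c(12, 30, 21),
  stringsAsFactors = FALSE
)
qual_indicators <- c("Priority excerpt", "Heterogeneity")

# Codes sorted by descending total; these could serve as factor levels
# that fix the bar order in a plot.
ord <- order(df_all_summary$total_preferred_coder, decreasing = TRUE)
code_levels <- df_all_summary$Code[ord]

# One colour per indicator, interpolated between two arbitrary endpoints.
make_palette <- grDevices::colorRampPalette(c("#deebf7", "#08519c"))
fill_colours <- stats::setNames(
  make_palette(length(qual_indicators)),
  qual_indicators
)
```

A named colour vector like fill_colours is the shape that ggplot2's scale_fill_manual() accepts for its values argument, which keeps each indicator's colour stable across plots.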
Input data frames should be outputs from the summarize_codes() and quality_indicators() functions or have equivalent structure.
Examples
if (FALSE) { # \dontrun{
# Summarize codes and quality indicators from the same excerpts
summary_data <- summarize_codes(excerpts, preferred_coders,
                                output_type = "tibble")
quality_data <- quality_indicators(excerpts, preferred_coders,
                                   qual_indicators = c("Priority excerpt", "Heterogeneity"))

# Stacked raw counts, keeping codes with at least 3 of each indicator
plot_saturation(
  summary_data,
  quality_data,
  qual_indicators = c("Priority excerpt", "Heterogeneity"),
  min_counts = c("Priority excerpt" = 3, "Heterogeneity" = 3),
  stacked = TRUE,
  as_proportion = FALSE
)
} # }