
Clean and Prepare Excerpts Data from Excel File
clean_data.Rd
This function reads an Excel file containing excerpt data, then cleans and processes it by:
Reading all columns as text to avoid type guessing issues,
Dropping columns whose names end with "Range" or "Weight",
Converting code columns (those starting with "Code: ") from text to logical, interpreting "true" (case-insensitive) as TRUE and any other value as FALSE,
Renaming code columns by removing the prefix "Code: " and suffix " Applied",
Filtering the data to keep only one preferred coder per Media Title, based on the provided order.
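The column handling described above can be sketched in base R on a small in-memory data frame. This is an illustration under stated assumptions, not the function's actual implementation: the toy data and the code name "Trust" are hypothetical, and treating missing values as FALSE is an assumption.

```r
# Toy stand-in for the sheet after every column has been read as text
# (hypothetical data; the real file's columns may differ).
raw <- data.frame(
  `Media Title` = c("Interview A", "Interview A", "Interview B"),
  `Excerpt Range` = c("1-10", "1-10", "5-20"),
  `Code: Trust Applied` = c("True", "false", "TRUE"),
  `Code: Trust Weight` = c("1", "1", "1"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Drop columns whose names end with "Range" or "Weight".
keep <- !grepl("(Range|Weight)$", names(raw))
cleaned <- raw[, keep, drop = FALSE]

# Convert "Code: " columns to logical: "true" in any case becomes TRUE,
# everything else FALSE (missing values mapped to FALSE by assumption).
code_cols <- startsWith(names(cleaned), "Code: ")
cleaned[code_cols] <- lapply(cleaned[code_cols], function(x) {
  !is.na(x) & tolower(x) == "true"
})

# Strip the "Code: " prefix and the " Applied" suffix from code column names.
names(cleaned)[code_cols] <- sub(
  " Applied$", "",
  sub("^Code: ", "", names(cleaned)[code_cols])
)

print(cleaned)
```

After these steps the toy data keeps only the Media Title column and a logical column named Trust.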
Value
A cleaned tibble/data frame containing the filtered excerpts, with logical code columns and only the preferred coder's excerpts for each media title.
Details
The function expects columns starting with "Code: " to contain textual "true"/"false" values,
which are converted to logical TRUE/FALSE. Columns ending with "Range" or "Weight" are removed.
Excerpts are filtered so that, for each Media Title, only the coder ranked highest in the preferred_coders vector is retained.
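A minimal base R sketch of the preferred-coder filter follows. The name of the column that identifies the coder is not given here, so "Coder" is a hypothetical stand-in, as is the toy data. Dropping excerpts from coders who are not listed in preferred_coders is also an assumption.

```r
# Hypothetical excerpts: "Coder" is an assumed column name for illustration.
excerpts <- data.frame(
  `Media Title` = c("Interview A", "Interview A", "Interview A", "Interview B"),
  Coder = c("Coder2", "Coder1", "Coder1", "Coder3"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
preferred_coders <- c("Coder1", "Coder2", "Coder3")

# Rank each row's coder by position in preferred_coders (1 = most preferred).
coder_rank <- match(excerpts$Coder, preferred_coders)

# Best (lowest) rank available within each media title.
best_rank <- ave(coder_rank, excerpts[["Media Title"]], FUN = function(r) {
  if (all(is.na(r))) NA_integer_ else min(r, na.rm = TRUE)
})

# Keep every excerpt written by the top-ranked coder of its media title.
filtered <- excerpts[!is.na(coder_rank) & coder_rank == best_rank, , drop = FALSE]

print(filtered)
```

In the toy data, "Interview A" was coded by both Coder1 and Coder2, so only the Coder1 excerpts survive. "Interview B" was coded only by Coder3, so its excerpt is kept.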
Examples
if (FALSE) { # \dontrun{
preferred <- c("Coder1", "Coder2", "Coder3")
cleaned_data <- clean_data("path/to/excerpts.xlsx", preferred)
} # }