Extract a REDCap database into a tidy supertibble — read_redcap_tidy

Query the REDCap API to retrieve data and metadata about a project, and transform the output into a "supertibble" that contains data and metadata organized into tibbles, broken down by instrument.

read_redcap_tidy(
  redcap_uri,
  token,
  raw_or_label = "label",
  forms = NULL,
  export_survey_fields = TRUE,
  suppress_messages = TRUE
)

Arguments

redcap_uri: The URI/URL of the REDCap server (e.g., "https://server.org/apps/redcap/api/"). Required.
token: The user-specific string that serves as the password for a project. Required.
raw_or_label: A string (either 'raw' or 'label') that specifies whether to export the raw coded values or the labels for the options of categorical fields. Default is 'label'.
forms: A character vector of form names that specifies the forms to export. Default is NULL, which returns all forms in the project.
export_survey_fields: A logical that specifies whether to export the survey identifier field (e.g., 'redcap_survey_identifier') or survey timestamp fields (e.g., '[instrument_name]_timestamp'). The timestamp data reflect the survey's completion time (according to the time and timezone of the REDCap server). Default is TRUE.
suppress_messages: A logical that controls whether to suppress messages from REDCapR API calls. Default is TRUE.

Value

A tibble in which each row represents a REDCap instrument. It contains the following columns:

redcap_form_name, the name of the instrument
redcap_form_label, the label for the instrument
redcap_data, a tibble with the data for the instrument
redcap_metadata, a tibble of data dictionary entries for each field in the instrument
redcap_events, a tibble with information about the arms and longitudinal events represented in the instrument. This column is present only if the project has longitudinal events enabled
structure, the instrument structure, either "repeating" or "nonrepeating"
data_rows, the number of rows in the instrument's data tibble
data_cols, the number of columns in the instrument's data tibble
data_size, the size in memory of the instrument's data tibble, computed by lobstr::obj_size()
data_na_pct, the percentage of cells in the instrument's data columns that are NA, excluding identifier and form completion columns
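
To illustrate how the columns above fit together, here is a minimal sketch that builds a stand-in for the supertibble in base R and pulls one instrument's data out of the redcap_data list-column. The form names, field names (record_id, age, visit, weight), and values are hypothetical, and only a few of the documented columns are mocked. A real supertibble returned by read_redcap_tidy() is a tibble, so the same indexing should apply to it.

```r
# Hypothetical instrument data; none of this comes from a real REDCap project.
demographics <- data.frame(record_id = 1:3, age = c(34, NA, 51))
visits <- data.frame(
  record_id = c(1, 1, 2),
  visit = c(1, 2, 1),
  weight = c(70.5, 71.0, NA)
)

# Mock supertibble: one row per instrument, with a subset of the documented columns.
supertbl <- data.frame(
  redcap_form_name = c("demographics", "visits"),
  structure = c("nonrepeating", "repeating"),
  stringsAsFactors = FALSE
)

# redcap_data is a list-column holding one data table per instrument.
supertbl$redcap_data <- list(demographics, visits)

# data_rows and data_cols summarise each instrument's data table.
supertbl$data_rows <- vapply(supertbl$redcap_data, nrow, integer(1))
supertbl$data_cols <- vapply(supertbl$redcap_data, ncol, integer(1))

# Extract the data for a single instrument by its form name.
visits_data <- supertbl$redcap_data[[which(supertbl$redcap_form_name == "visits")]]
```

Looking a form up by redcap_form_name rather than by row position keeps the extraction stable if the forms argument changes which instruments are returned.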

Details

This function uses the REDCapR package to query the REDCap API. The REDCap API returns a block matrix that mashes data from all data collection instruments together. The read_redcap_tidy() function deconstructs the block matrix and splices the data into individual tibbles, where one tibble represents the data from one instrument.
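
The idea of splitting a block matrix by instrument can be sketched in base R as follows. This is a simplified illustration, not the package's actual implementation: the block table, the dictionary data frame, and the split_by_form() helper are all hypothetical, and the real function also has to account for repeating instruments and longitudinal events.

```r
# Hypothetical block matrix: one wide table mixing fields from two instruments.
block <- data.frame(
  record_id = c(1, 1, 2),
  age = c(34, NA, 51),
  weight = c(NA, 70.5, NA)
)

# Hypothetical data dictionary mapping each field to its instrument.
dictionary <- data.frame(
  field_name = c("age", "weight"),
  form_name = c("demographics", "visits"),
  stringsAsFactors = FALSE
)

# Split the block into one table per instrument (illustrative helper).
split_by_form <- function(block, dictionary, id_col = "record_id") {
  forms <- unique(dictionary$form_name)
  out <- lapply(forms, function(form) {
    fields <- dictionary$field_name[dictionary$form_name == form]
    # Keep only rows where at least one of this form's fields is populated.
    has_data <- rowSums(!is.na(block[, fields, drop = FALSE])) > 0
    res <- block[has_data, c(id_col, fields), drop = FALSE]
    rownames(res) <- NULL
    res
  })
  names(out) <- forms
  out
}

tables <- split_by_form(block, dictionary)
```

Here tables$demographics keeps the two rows that carry an age value, and tables$visits keeps the single row that carries a weight value, so each table contains only the cells that belong to its instrument.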

Examples

if (FALSE) {
  redcap_uri <- Sys.getenv("REDCAP_URI")
  token <- Sys.getenv("REDCAP_TOKEN")

  read_redcap_tidy(
    redcap_uri,
    token,
    raw_or_label = "label"
  )
}