REDCapTidieR 1.0.0 & JOSS Publication 🎉

Open Source
REDCap
REDCapTidieR
R
REDCapTidieR v1.0.0 Release and JOSS Publication Announcement
Author

Stephan Kadauke, Ezra Porter, Richard Hanna

Published

November 15, 2023

We’re a bit overdue but positively delighted to announce that REDCapTidieR v1.0.0 has been released on CRAN! We have also published our paper, “REDCapTidieR: Extracting complex REDCap databases into tidy tables”, in The Journal of Open Source Software (JOSS)!

You can install the package from CRAN with:

install.packages("REDCapTidieR")

Development Journey

REDCapTidieR arose from a seemingly simple requirement: to compile data from multiple REDCap projects of CAR-T cell clinical trials into a dynamic analysis and visualization tool. We quickly realized, as many before us had, that once you start using longitudinal databases with repeating instruments and events, working with the exported data becomes extremely cumbersome. The block matrix format, which squeezes the data from all instruments into one ugly table with varying granularity and lots of empty fields, is notoriously difficult to work with. No existing solution for handling complex REDCap databases in R suited our needs.

To write the package, we had to make some core decisions up front. Our team embraces tidy data principles, so we wanted to ensure that our package gels well with tidy rectangular data structures. A major issue with data from complex databases is that the granularity may not be uniform: some data is recorded per record, some per instance, and some per event. This fundamentally violates the tidy data principle that each table row represents an observation of the same thing. However, within each instrument, the granularity is uniform. We took advantage of this fact and broke down the block matrix into tables, one for each instrument. Along the way, we added some automated data transformations that make it easier to work with the data in the R programming environment.
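To make the idea concrete, here is a minimal base R sketch of splitting a block matrix by instrument. The data frame and its values are invented for illustration, the column names only mimic a typical REDCap export, and this is not how REDCapTidieR is implemented internally.

```r
# Toy "block matrix": one row per record or per repeat instance.
# All values are made up; real exports contain many more columns.
block <- data.frame(
  record_id                = c(1, 1, 1, 2, 2),
  redcap_repeat_instrument = c(NA, "labs", "labs", NA, "labs"),
  redcap_repeat_instance   = c(NA, 1, 2, NA, 1),
  age                      = c(34, NA, NA, 51, NA),         # per-record field
  hemoglobin               = c(NA, 13.1, 12.8, NA, 14.0)    # per-instance field
)

# Non-repeating instrument: keep rows without a repeat instrument,
# which gives one row per record.
demographics <- block[is.na(block$redcap_repeat_instrument),
                      c("record_id", "age")]

# Repeating instrument: one row per record and instance.
# which() drops the NA comparisons produced by the non-repeating rows.
labs <- block[which(block$redcap_repeat_instrument == "labs"),
              c("record_id", "redcap_repeat_instance", "hemoglobin")]
```

Each resulting table has uniform granularity and no structurally empty fields, which is the property the per-instrument split is meant to restore.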

We wanted not only to break out instruments into tidy tables, but also to make them accessible through an easy-to-use superstructure. So we created the supertibble, which provides a handy overview of the instruments of the REDCap project and allows drilling down into the individual data tables.
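In practice, this drill-down might look like the sketch below. It uses `read_redcap()` and `extract_tibble()` from the package; the helper name `read_instrument()` and its arguments are hypothetical and exist only for this example.

```r
# Hypothetical helper: build the supertibble for a project, then pull out
# the data table of a single instrument.
read_instrument <- function(redcap_uri, token, instrument) {
  # One row per instrument, with the data held in a list column
  supertbl <- REDCapTidieR::read_redcap(redcap_uri, token)

  # Drill down into one instrument by its name
  REDCapTidieR::extract_tibble(supertbl, instrument)
}
```

A call such as `read_instrument("https://redcap.example.org/api/", Sys.getenv("REDCAP_TOKEN"), "demographics")` would return the table for an instrument named `demographics`, assuming the project has one. The URI, the environment variable, and the instrument name are placeholders.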

[Figure panels: Labelled Supertibble; Supertibble]

Once we had ironed out the data structure, we wrote a set of utility functions to allow users to efficiently work with their exported databases, including labelling their data with the labelled package, exporting their data to Excel with openxlsx2, and getting summary statistics with skimr.
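As a rough sketch of how these utilities fit together, the hypothetical helper below takes a supertibble that was already created with `read_redcap()`. The helper name and the default file name are made up for this example, and the labelled, openxlsx2, and skimr packages are assumed to be installed.

```r
# Hypothetical helper; `supertbl` is assumed to come from read_redcap().
summarise_and_export <- function(supertbl, file = "redcap_export.xlsx") {
  # Add skimr summary statistics to the metadata of each instrument
  with_stats <- REDCapTidieR::add_skimr_metadata(supertbl)

  # Attach variable labels using the labelled package
  labelled <- REDCapTidieR::make_labelled(supertbl)

  # Write the labelled supertibble to an Excel workbook via openxlsx2
  REDCapTidieR::write_redcap_xlsx(labelled, file = file)

  # Return both versions invisibly for further use
  invisible(list(with_stats = with_stats, labelled = labelled))
}
```

The three steps are independent of one another here, so any of them can be used on its own.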

JOSS Publication

REDCapTidieR has become a useful tool for many analysts working in REDCap and R. To showcase our commitment to the package and its development, we published “REDCapTidieR: Extracting complex REDCap databases into tidy tables” in The Journal of Open Source Software. We believe that REDCapTidieR can save thousands of analyst hours otherwise spent on tedious, repetitive analytic work, and we hope that this brief technical paper will extend the reach of our package.

Acknowledgements

REDCapTidieR is the first open source package our team has made available to the community. We are in awe that it has surpassed five thousand downloads to date and that the number grows every day. Its success wouldn’t be possible without the support and collaboration of those who helped with its development, opened issues, and contributed to discussions: @camcaan, @JanMarvin, @matthieu-faron, @olivroy, @pwildenhain, @tschuler, @wibeasley.

Next Steps

We aren’t done developing REDCapTidieR and have some ideas in store for upcoming releases, including:

  • Codebook support
  • Allowance for instruments that are both repeating and non-repeating
  • Additional functions for extracting REDCap project information

Have a feature you want to see? Please open an issue and let us know how we can continue making this package even better!