
The ggswim package eases the development of swimmer plots in R through integration with the ggplot2 framework. In this vignette, we’ll walk through how users can create visually striking swimmer plots.

At the heart of ggswim are two core functions: ggswim() and add_marker(). The former, an extension of geom_segment(), allows users to construct the horizontal bars, what we’ll sometimes refer to as “lanes.” Meanwhile, add_marker() wraps both geom_point() and geom_label() to effortlessly embed key events, or “markers,” onto the lanes. These can take the form of unicode shapes, symbols, or even emojis.

Drawing from the well-established principles of ggplot2, ggswim allows users to apply familiar layer-building techniques, including the application of styles and themes.

Building swimmer plots with aesthetic mapping

To start, let’s build a swimmer plot using aes() and aesthetic mapped data (for fixed mapping, see Fixed Marker Mapping). As with the README, we will be using ggswim’s internal datasets: patient_data, infusion_events, and end_study_events.

Let’s start with observing patient_data’s structure:

patient_data contains a long pivoted dataset where patient IDs (pt_id) can be repeated. These rows are differentiated by disease_assessment values combined with corresponding start and end times, given in months. Together, these rows describe a patient's survival timeline.

  • disease_assessment is broken down by the following nomenclature along with some indicator of B-cell status (if applicable):
    • CR = “Complete Response”
    • CRi = “Complete Response with Incomplete Blood Count Recovery”
    • RD = “Relapsed Disease”
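
To check this structure directly, a quick inspection along the following lines may help. It is only a sketch, assuming ggswim is installed and that patient_data carries the columns used later in this vignette (pt_id, disease_assessment, start_time, end_time); str() and table() are base R functions.

```r
library(ggswim)

# Compact overview of the column names and types in the bundled dataset
str(patient_data)

# Number of rows per patient; counts above 1 mean that pt_id repeats
table(patient_data$pt_id)
```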

Now, let’s make the plot using ggswim():

p <- ggswim(
  patient_data,
  mapping = aes(
    x = start_time,
    xend = end_time,
    y = pt_id,
    color = disease_assessment
  ),
  linewidth = 5
)

p

Here we have a simple plot of horizontal lanes, one row per patient, colored by disease assessment status. ggswim() does the work of setting up a geom_segment() and readying our plot layers for additional ggswim-specific features such as the “markers” mentioned earlier. Next, let’s add a marker layer from the infusion_events dataset:

infusion_events |>
  rmarkdown::paged_table()

This dataset is much simpler, with only 3 columns. time_from_initial_infusion gives the time of each infusion, where 0 is the initial infusion and any reinfusions (if a patient had them) fall at some point beyond 0. Each event is also categorized under infusion_type.

To keep layers that share an aesthetic (in this case color) apart, we need a way to split the legend into lanes and markers. By default, ggplot2 will group these together, but new_scale_color() from the ggnewscale package lets us keep them in their own scales. new_scale_color() can be used to make any number of separations you need; just be sure to call it before each marker layer, and to finish any changes to the current scale before moving on to the next one.

Because new_scale_color() marks the start of a new scale, any changes to the current scale must be defined before calling it. A common workflow is to use scale_color_manual() to update the colors and names shown in both the plot and the legend, so we’ll apply it here, just after our swim lane layer and before adding the markers.

p <- p +
  scale_color_manual(
    name = "Overall Disease Assessment",
    values = c("#6394F3", "#F3C363", "#EB792F", "#d73a76", "#85a31e")
  ) +
  new_scale_color() +
  add_marker(
    data = infusion_events,
    aes(
      x = time_from_initial_infusion,
      y = pt_id,
      color = infusion_type,
      shape = infusion_type
    ),
    size = 5
  )

p

Now we can see shapes have been added to establish our markers for infusions.
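
The ordering rule for new_scale_color() can also be seen in isolation, without ggswim. The sketch below uses only ggplot2 and ggnewscale, with plain geom_segment() and geom_point() layers standing in for lanes and markers. The toy data frames (lanes, events) and their column names are invented for illustration and are not part of ggswim.

```r
library(ggplot2)
library(ggnewscale)

# Hypothetical toy data: two lanes and one event per lane
lanes <- data.frame(
  id = c("a", "b"),
  start = c(0, 0),
  end = c(5, 8),
  status = c("CR", "RD")
)
events <- data.frame(
  id = c("a", "b"),
  time = c(2, 6),
  type = c("Infusion", "Reinfusion")
)

sketch <- ggplot() +
  # First color scale: the lanes
  geom_segment(
    data = lanes,
    aes(x = start, xend = end, y = id, yend = id, color = status),
    linewidth = 5
  ) +
  # Style the lane scale BEFORE starting a new one
  scale_color_manual(
    name = "Lanes",
    values = c(CR = "#6394F3", RD = "#EB792F")
  ) +
  # Everything after this call maps color onto a separate scale
  new_scale_color() +
  geom_point(
    data = events,
    aes(x = time, y = id, color = type),
    size = 4
  ) +
  scale_color_manual(
    name = "Markers",
    values = c(Infusion = "#25DA6D", Reinfusion = "firebrick")
  )

sketch
```

With this ordering the legend should show two separate color keys, one titled "Lanes" and one titled "Markers", rather than a single merged key.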

Our last dataset involves end of study events, i.e. events that indicate a patient has left the study for various reasons.

end_study_events |>
  rmarkdown::paged_table()

You’ll notice that this dataset uses emojis under end_study_label. In addition to shapes and symbols, ggswim supports emojis when rendering swimmer plots. add_marker() lets users specify emojis through two aes() arguments: label_vals and label_names. These mapping parameters are unique to ggswim and let it wrap both geom_point() and geom_label() at the same time. Let’s add a layer using emojis via the end_study_events dataset:

p <- p +
  add_marker(
    data = end_study_events,
    aes(
      x = time_from_initial_infusion,
      y = pt_id,
      label_vals = end_study_label,
      label_names = end_study_name
    ),
    label.size = NA, fill = NA, size = 5
  )

p

We’ve successfully made a swimmer plot with lanes and two different kinds of marker layers. Recall that ggswim works within the ggplot2 framework, so customization can be done with the same ggplot2 theme and styling functions users may already be familiar with. It is typical in this workflow to call scale_color_manual() to fix up the names and colors of the plot, but be sure to do so in the proper order relative to new_scale_color(), as mentioned earlier!

Let’s fix up this plot so it is easier on the eyes:

library(ggplot2)

p <- p +
  theme_minimal() +
  scale_color_manual(
    name = "Markers",
    values = c(NA, NA, "#25DA6D", NA, "firebrick")
  ) +
  labs(title = "My Swimmer Plot") +
  xlab("Time (Months)") + ylab("Patient ID")

p

We can also apply the theme_ggswim() function:
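
A minimal sketch of that call, assuming theme_ggswim() is added like any other ggplot2 theme layer. To stay self-contained it rebuilds only the lane plot from the start of this vignette; the object names lanes_only and themed are placeholders.

```r
library(ggplot2)
library(ggswim)

# Rebuild the basic lane plot from the start of the vignette
lanes_only <- ggswim(
  patient_data,
  mapping = aes(
    x = start_time,
    xend = end_time,
    y = pt_id,
    color = disease_assessment
  ),
  linewidth = 5
)

# Add the ggswim theme as a regular layer
themed <- lanes_only + theme_ggswim()

themed
```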

Additional notes

Some additional considerations to keep in mind when working with ggswim:

  • Handling Missing Data: ggswim does not support missing data for mapping aesthetics. If any is detected, a warning is issued; the missing data may appear as NA values in the plot but will be excluded from the legend.

  • Rendering Emojis and Custom Shapes: To ensure emojis and other custom shapes display correctly, users may need to switch their graphics rendering device to AGG.
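
One way to render through AGG from a script is to open an AGG device explicitly. The sketch below assumes the ragg package is installed, which is an assumption rather than something this vignette requires. It uses a toy ggplot2 plot with a single emoji label instead of a ggswim plot; the names toy and out are placeholders.

```r
library(ggplot2)

# Toy plot with one emoji label (U+2705, a check mark)
toy <- ggplot(
  data.frame(x = 1, y = 1, lab = "\u2705"),
  aes(x = x, y = y, label = lab)
) +
  geom_text(size = 10)

# Write the plot to a temporary PNG through the AGG device from ragg
out <- tempfile(fileext = ".png")
ragg::agg_png(out, width = 4, height = 3, units = "in", res = 150)
print(toy)
dev.off()

file.exists(out)
```

In RStudio, the plot pane backend can usually be changed under Tools > Global Options > General > Graphics. Whether a given emoji appears in color still depends on the fonts available on the system.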