The ggswim package eases the development of swimmer plots in R through integration with the ggplot2 framework. In this vignette, we’ll walk through how users can create visually striking swimmer plots.
At the heart of ggswim are two core functions: ggswim() and add_marker(). The former, an extension of geom_segment(), allows users to construct the horizontal bars, which we’ll sometimes refer to as “lanes.” Meanwhile, add_marker() wraps both geom_point() and geom_label() to embed key events, or “markers,” onto the lanes. These can take the form of unicode shapes, symbols, or even emojis.
Drawing from the well-established principles of ggplot2, ggswim allows users to apply familiar layer-building techniques, including the application of styles and themes.
Building swimmer plots with aesthetic mapping
To start, let’s build a swimmer plot using aes() and aesthetic mapped data (for fixed mapping, see Fixed Marker Mapping). As with the README, we will be using ggswim’s internal datasets: patient_data, infusion_events, and end_study_events.
Let’s start by looking at the structure of patient_data:

patient_data contains a long-format dataset in which patient IDs (pt_id) can be repeated. Repeated rows are differentiated by their disease_assessment values, each paired with a corresponding start and end time in months. Together, these rows describe a patient’s survival timeline.
disease_assessment uses the following nomenclature, along with some indicator of B-cell status (if applicable):
- CR = “Complete Response”
- CRi = “Complete Response with Incomplete Blood Count Recovery”
- RD = “Relapsed Disease”
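To make that layout concrete, here is a small hypothetical data frame with the same long shape. The column names follow the ones used in this vignette (pt_id, disease_assessment, start_time, end_time), but the values are invented for illustration and the B-cell indicator is left out for brevity.

```r
# Hypothetical toy data: one row per patient per assessment period.
# Values are made up; only the column names mirror patient_data.
toy_patients <- data.frame(
  pt_id = c("01", "01", "02"),
  disease_assessment = c("CR", "RD", "CRi"),
  start_time = c(0, 3, 0), # months
  end_time = c(3, 5, 4)    # months
)

# Patient "01" appears twice, once for each assessment period
table(toy_patients$pt_id)
```

Each row becomes one segment of a lane, so a patient whose status changes over time is drawn as several consecutive segments on the same line.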
Now, let’s make the plot using ggswim():
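library(ggplot2) # provides aes()
library(ggswim)  # provides ggswim() and the example datasets

# One lane per patient, colored by disease assessment
p <- ggswim(
  patient_data,
  mapping = aes(
    x = start_time,
    xend = end_time,
    y = pt_id,
    color = disease_assessment
  ),
  linewidth = 5
)
p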
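Since ggswim() is described as an extension of geom_segment(), it can help to see roughly what the lane layer looks like in plain ggplot2. The sketch below is an approximation under that assumption, not ggswim’s actual implementation, and it uses a hypothetical toy data frame rather than patient_data.

```r
library(ggplot2)

# Hypothetical toy data; column names follow patient_data, values are invented
toy_patients <- data.frame(
  pt_id = c("01", "01", "02"),
  disease_assessment = c("CR", "RD", "CRi"),
  start_time = c(0, 3, 0),
  end_time = c(3, 5, 4)
)

# One thick horizontal segment per row, colored by assessment
lanes <- ggplot(toy_patients) +
  geom_segment(
    aes(
      x = start_time, xend = end_time,
      y = pt_id, yend = pt_id, # same y at both ends keeps the segment horizontal
      color = disease_assessment
    ),
    linewidth = 5 # needs ggplot2 >= 3.4.0; older versions use `size`
  )
lanes
```

With ggswim() there is no yend in the mapping, which suggests the function fills it in so that lanes stay horizontal.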
Here we have a simple plot of horizontal lanes, one per patient, colored by disease assessment status. ggswim() does the work of setting up a geom_segment() and readying our plot layers for additional ggswim-specific features such as the “markers” mentioned earlier. Let’s add a marker layer next from the infusion_events dataset:
infusion_events |>
  rmarkdown::paged_table()
This dataset is much simpler, having only 3 columns. time_from_initial_infusion gives the time since the initial infusion: 0 marks the initial infusion itself, and values beyond 0 mark reinfusions (if a patient had any). Each event is also categorized under infusion_type.
To separate layers that use the same aesthetic, in this case color, we need a way to split the legend into lanes and markers. By default, ggplot2 will group these together, but by using new_scale_color() from the ggnewscale package we can keep them in their own scales. new_scale_color() can be used to make as many separations as you need; just be sure to call it ahead of the marker call that starts each new scale, and include any additional scale changes before moving on to the next one.
Because new_scale_color() establishes the start of a new scale, it is important to define any changes to the current scale before calling it. It is a common workflow to use scale_color_manual() to change and update the colors and names of the visible elements in both the plot and the legend, so we’ll apply it here, just after our swim lane layer but before adding the markers.
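library(ggnewscale) # new_scale_color() comes from ggnewscale

p <- p +
  # Style the lane colors first, while the lane scale is still current
  scale_color_manual(
    name = "Overall Disease Assessment",
    values = c("#6394F3", "#F3C363", "#EB792F", "#d73a76", "#85a31e")
  ) +
  # Start a fresh color scale for the marker layer
  new_scale_color() +
  add_marker(
    data = infusion_events,
    aes(
      x = time_from_initial_infusion,
      y = pt_id,
      color = infusion_type,
      shape = infusion_type
    ),
    size = 5
  )
p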
Now we can see shapes have been added to establish our markers for infusions.
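The ordering rule described above (style the current scale, then call new_scale_color(), then add the next layer) can also be seen in isolation with plain ggplot2 layers. The following is a minimal sketch that assumes the ggnewscale package is installed. The data frames and the names toy_lanes, toy_events, and q are hypothetical and not part of ggswim.

```r
library(ggplot2)
library(ggnewscale)

# Hypothetical toy data for lanes and events
toy_lanes <- data.frame(
  pt_id = c("01", "02"),
  start_time = c(0, 0),
  end_time = c(6, 9),
  status = c("CR", "RD")
)
toy_events <- data.frame(
  pt_id = c("01", "02", "02"),
  time = c(0, 0, 4),
  type = c("Initial", "Initial", "Reinfusion")
)

q <- ggplot() +
  geom_segment(
    data = toy_lanes,
    aes(x = start_time, xend = end_time, y = pt_id, yend = pt_id, color = status),
    linewidth = 5
  ) +
  # This scale applies to the lane layer above
  scale_color_manual(name = "Status", values = c(CR = "#6394F3", RD = "#EB792F")) +
  # Everything after this call gets its own, separate color scale
  new_scale_color() +
  geom_point(
    data = toy_events,
    aes(x = time, y = pt_id, color = type),
    size = 4
  ) +
  # This scale applies only to the point layer
  scale_color_manual(name = "Infusion", values = c(Initial = "black", Reinfusion = "#85a31e"))
q
```

The resulting legend has two separate color keys, one titled "Status" and one titled "Infusion", instead of a single merged key.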
Our last dataset involves end of study events, i.e. events that indicate a patient has left the study for various reasons.
end_study_events |>
  rmarkdown::paged_table()
You’ll notice that this dataset includes emojis under end_study_label. In addition to shapes and symbols, ggswim supports the use of emojis when rendering swimmer plots. add_marker() allows users to specify shapes or emojis through two dedicated aes() arguments: label_vals and label_names. These mapping parameters are unique to ggswim and let it wrap both geom_point() and geom_label() at the same time. Let’s add a layer using emojis via the end_study_events dataset:
p <- p +
  add_marker(
    data = end_study_events,
    aes(
      x = time_from_initial_infusion,
      y = pt_id,
      label_vals = end_study_label,
      label_names = end_study_name
    ),
    label.size = NA, fill = NA, size = 5
  )
p
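To make the two label aesthetics more concrete, a dataset for this kind of layer might look like the hypothetical data frame below. The column names match those mapped above, while the rows and emoji choices are invented. Judging by the column names, label_vals points at the glyph to draw and label_names at the text describing it; that reading is an assumption here.

```r
# Hypothetical marker data; values are invented for illustration
toy_end_events <- data.frame(
  pt_id = c("01", "02"),
  time_from_initial_infusion = c(6, 9),
  end_study_label = c("\u2705", "\u274C"), # check mark and cross mark emojis
  end_study_name = c("Completed Study Follow-Up", "Other End Study Reason")
)
toy_end_events
```

Writing the emojis as unicode escapes keeps the script ASCII-only, which can avoid encoding problems when the file is shared across systems.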
We’ve successfully made a swimmer plot with lanes and two different kinds of marker layers. Recall that ggswim works within the ggplot2 framework, so customization can be done using the same ggplot2 theme and styling functions users may already be familiar with. It is typical in this workflow to call scale_color_manual() to fix up the names and colors of the plot, but be sure to do so in the proper order relative to new_scale_color(), as mentioned earlier!
Let’s fix up this plot so it is easier on the eyes:
library(ggplot2)
p <- p +
  theme_minimal() +
  scale_colour_manual(
    name = "Markers",
    values = c(NA, NA, "#25DA6D", NA, "firebrick")
  ) +
  labs(title = "My Swimmer Plot") +
  xlab("Time (Months)") + ylab("Patient ID")
p
We can also apply the theme_ggswim() function:
p +
  theme_ggswim()
Additional notes
Some additional considerations to keep in mind when working with ggswim:
- Handling Missing Data: ggswim does not support missing data for mapping aesthetics. If any are detected, developers will receive a warning, and the missing data may appear as NA values in the display, but will be excluded from the legend.
- Rendering Emojis and Custom Shapes: To ensure emojis and other custom shapes display correctly, users may need to switch their graphics rendering device to AGG.
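As one way to act on the AGG note, the sketch below writes a plot to a PNG through the AGG device from the ragg package. It assumes ragg is installed. The plot and the file name are hypothetical stand-ins for a real swimmer plot, and whether the emoji actually renders still depends on the fonts available on the system.

```r
library(ggplot2)

# Hypothetical one-point plot with an emoji label, just to exercise the device
emoji_plot <- ggplot(data.frame(x = 1, y = 1, label = "\u2705")) +
  geom_text(aes(x = x, y = y, label = label), size = 10)

# Open an AGG-backed PNG device, draw the plot, then close the device
out_file <- file.path(tempdir(), "swimmer-agg.png")
ragg::agg_png(out_file, width = 6, height = 4, units = "in", res = 150)
print(emoji_plot)
dev.off()
```

In RStudio, the same backend can be selected for the Plots pane under Tools > Global Options > General > Graphics, and in knitr documents by setting the chunk option dev = "ragg_png".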