Visualizing various types of data

Lecture 3

2024-05-20

Warm up

Questions?

From last time

Violin plots

ggplot(
  penguins,
  aes(
    x = species,
    y = body_mass_g
    )
  ) +
  geom_violin()

Multiple geoms

ggplot(
  penguins,
  aes(
    x = species,
    y = body_mass_g
    )
  ) +
  geom_violin() +
  geom_point()

Multiple geoms

ggplot(
  penguins,
  aes(
    x = species,
    y = body_mass_g
    )
  ) +
  geom_violin() +
  geom_jitter()

Multiple geoms + aesthetics

ggplot(
  penguins,
  aes(
    x = species,
    y = body_mass_g,
    color = species
    )
  ) +
  geom_violin() +
  geom_jitter()

Multiple geoms + aesthetics

ggplot(
  penguins,
  aes(
    x = species,
    y = body_mass_g,
    color = species
    )
  ) +
  geom_violin() +
  geom_jitter() +
  theme(
    legend.position = "none"
  )

Multiple geoms + aesthetics

ggplot(
  penguins,
  aes(
    x = species,
    y = body_mass_g,
    color = species
    )
  ) +
  geom_violin() +
  geom_jitter() +
  theme(
    legend.position = "none"
  ) +
  scale_color_colorblind() # from the ggthemes package

Questions from previous material

  • Is there any code in the videos that is not in the readings? Yes and no. The videos introduce no substantial functionality beyond what the readings cover, but the examples in the videos differ from the ones in the readings.

  • What are all of the geoms we need to know? You don’t need to “memorize” or even “know” all of the geoms available in the ggplot2 package; you can find a list of them on the ggplot2 cheat sheet or on the reference page.

  • Could you please clarify what situations it would be appropriate to use each geom function? Today’s topic! And think about it as “what plot should I make for which type of variable”.
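
As a rough starting point for "what plot should I make for which type of variable", the sketch below pairs a few common geoms with variable types. It assumes the penguins data from the earlier slides is loaded from the palmerpenguins package; the pairings are typical choices rather than fixed rules, and the binwidth is an arbitrary illustrative value.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the penguins data used in the slides

# One numerical variable: histogram (binwidth chosen arbitrarily for illustration)
p_num <- ggplot(penguins, aes(x = body_mass_g)) +
  geom_histogram(binwidth = 250)

# One categorical variable: bar plot of counts
p_cat <- ggplot(penguins, aes(x = species)) +
  geom_bar()

# One numerical and one categorical variable: boxplot
# (the violin plot from earlier is an alternative)
p_num_cat <- ggplot(penguins, aes(x = species, y = body_mass_g)) +
  geom_boxplot()

# Two numerical variables: scatterplot
p_num_num <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point()
```

Printing any of these objects (for example, typing p_num at the console) draws the plot.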

Let’s return to AE-02

ae-02-bechdel-dataviz

Go to the project navigator in RStudio (top right corner of your RStudio window) and open the project called ae. If there are any uncommitted files, commit them so you can start with a clean slate.

Recap of AE

  • Construct plots with ggplot().
  • Layers of ggplots are separated by +s.
  • The formula is (almost) always as follows:
ggplot(DATA, aes(x = X-VAR, y = Y-VAR, ...)) +
  geom_XXX()
  • Aesthetic attributes of geometries (color, size, transparency, etc.) can be mapped to variables in the data or set by the user, e.g. color = binary vs. color = "pink".
  • Use facet_wrap() when faceting (creating small multiples) by one variable and facet_grid() when faceting by two variables.
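
The last two points of the recap can be sketched with the penguins data from the earlier slides instead of the AE data. This assumes the palmerpenguins package is installed; the choice of variables is only illustrative.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the penguins data used in the slides

# Mapped: color depends on a variable in the data, so it goes inside aes()
p_mapped <- ggplot(
  penguins,
  aes(x = flipper_length_mm, y = body_mass_g, color = species)
) +
  geom_point()

# Set: one fixed color for every point, so it goes outside aes(), in the geom
p_set <- ggplot(
  penguins,
  aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(color = "pink")

# Faceting by one variable: one panel per species
p_wrap <- p_set + facet_wrap(~ species)

# Faceting by two variables: rows by sex, columns by species
p_grid <- p_set + facet_grid(sex ~ species)
```

Note that writing color = "pink" inside aes() would not set the color; ggplot2 would treat "pink" as a constant variable to map and pick a color from its default scale.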