14.4 R for Data Science 2

14.4.1 Introduction

R for Data Science (2e) describes the importance of data visualization by saying that “a good visualization will show you things you did not expect or raise new questions about the data”. Here you will learn more about the ggplot2 system for making graphs, an elegant and versatile complement to what is available through software like Google Sheets. A growing number of software packages build on ggplot2, such as the Bioconductor phyloseq package used for 16S rDNA analysis.

14.4.2 Activity

Estimated time: 25 min

14.4.2.1 Instructions

  1. Start the “r4ds2” LearnR tutorial.
     • Focus on the first half of the tutorial, up to and including “4. Structure of a ggplot() command”
  2. Bug fixes
     1. For the Chapter 3, Quiz 2 on geom_bar() aesthetics, you must check “size”

     2. For the Chapter 4 exercises, you will need to add this code to the top of each code block:

measles <- filter(us_contagious_diseases, disease == "Measles")
measles_MD <- filter(measles, state == "Maryland")
measles_VA <- filter(measles, state == "Virginia")
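As a rough illustration of how these three lines sit at the top of a code block, here is a minimal sketch. It assumes that us_contagious_diseases comes from the dslabs package, that filter() is the dplyr function, and that the data include year and count columns. The tutorial environment may already load these packages for you, so the library() calls are only needed when running the code elsewhere. The plot itself is a made-up example and is not one of the tutorial exercises.

```r
# Assumption: these packages are installed; the tutorial may load them already.
library(dslabs)   # assumed source of us_contagious_diseases
library(dplyr)    # provides filter()
library(ggplot2)  # provides ggplot(), aes(), geom_line()

# The three setup lines from the instructions
measles <- filter(us_contagious_diseases, disease == "Measles")
measles_MD <- filter(measles, state == "Maryland")
measles_VA <- filter(measles, state == "Virginia")

# Illustrative only (not a tutorial exercise):
# reported measles cases in Maryland by year
p <- ggplot(data = measles_MD, mapping = aes(x = year, y = count)) +
  geom_line()
p
```

Each layer is joined to the previous one with `+` at the end of the line, and the aesthetics are mapped to columns that exist in the filtered data frame.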

14.4.2.2 Questions

There are seven examples of broken ggplot code at the end of “4. Structure of a ggplot() command”. Fix at least three of them. For one of your fixes, explain what was wrong and how you figured it out. Remember to add the three lines above to the top of each code block.

1. What was the error?


2. How did you figure it out?


14.4.3 Grading Criteria

  • Download as Microsoft Word (.docx) and upload to Canvas

14.4.4 Footnotes

Resources

Contributions and Affiliations

  • Katherine Cox, Johns Hopkins University
  • Frederick Tan, Johns Hopkins University

Last Revised: February 2025