14.2 R for Data Science 1

14.2.1 Introduction

R for Data Science (2e) covers how to do data science with R using the tidyverse collection of packages. There are many possible reasons for the continued popularity of the tidyverse, but one standout principle the package authors embrace is that programs should be “easy to use by humans. Computer efficiency is a secondary concern” (see this and more at the tidy tools manifesto). Here you will learn more about what kinds of data can be organized into data.frames and possible ways to summarize, explore, and visualize datasets.

14.2.2 Activity

Estimated time: 50 min

14.2.2.1 Instructions

  1. Complete the “r4ds” LearnR tutorial.
  • The “OCS – Global Diets” section is optional, and you do not need to complete all of the Exercises in 3.3.1.
  2. Answer the questions below.

14.2.2.2 Questions

1. data.frames – Data frames organize data into a 2-dimensional table of rows and columns like a spreadsheet with some special restrictions. For the mpg data.frame, what do rows represent? What do columns represent?
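
If it helps to see the row and column idea on something smaller than mpg, here is a minimal sketch in base R. The `pets` data.frame and its values are hypothetical, invented only for illustration. It also shows one of the restrictions in practice: each column holds values of a single type, and all columns have the same number of entries.

```r
# Hypothetical toy data, invented purely for illustration
pets <- data.frame(
  name   = c("Ada", "Bo", "Cy"),   # character column
  weight = c(4.2, 11.5, 0.3),      # numeric column
  indoor = c(TRUE, FALSE, TRUE)    # logical column
)

dim(pets)      # number of rows, then number of columns
str(pets)      # one line per column, showing its type
pets[2, ]      # a single row, still a data.frame
pets$weight    # a single column, returned as a vector
```

The same four calls should also work on mpg once the package that provides it is loaded, which may be a useful way to start on the question above.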


2. Functions – Most R commands are functions (e.g. mean()), which take some input (a number, a vector, a data.frame…) and produce some output (a summary, a plot, a new vector…). Describe one function you learned about that you anticipate being useful in the future.
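
As a small illustration of the input and output idea, the sketch below passes a vector to mean() and stores what comes back. The `heights` values are hypothetical. The `na.rm` argument is a standard argument of mean() and shows how an extra input can change a function's output.

```r
# Hypothetical measurements; the last value is missing
heights <- c(1.62, 1.75, 1.80, NA)

# Input: a numeric vector. Output: a single number.
avg_with_na <- mean(heights)                 # NA, because one value is missing
avg_no_na   <- mean(heights, na.rm = TRUE)   # drops the missing value first

# Input: any vector. Output: how many elements it has.
n_values <- length(heights)
```

Typing a question mark before a function name at the console, such as ?mean, opens its help page, which lists the inputs it accepts and describes the value it returns.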


3. Common Problems – What are some common problems you ran into? Refer to Part 2. Troubleshooting if you’d like some inspiration for how to put your experience into words.


4. [optional] OCS - Global Diets Plotting – Which foods show a difference based on sex?


14.2.3 Grading Criteria

  • Download as Microsoft Word (.docx) and upload on Canvas

14.2.4 Footnotes

Resources

Contributions and Affiliations

  • Katherine Cox, Johns Hopkins University
  • Frederick Tan, Johns Hopkins University

Last Revised: February 2025