
14.2 R for Data Science 1
14.2.1 Introduction
R for Data Science (2e) covers how to do data science with R using the tidyverse collection of packages. There are many possible reasons for the tidyverse's continued popularity, but one standout principle its authors embrace is that programs should be “easy to use by humans. Computer efficiency is a secondary concern” (see this and more in the tidy tools manifesto). Here you will learn more about what kinds of data can be organized into data.frames and about possible ways to summarize, explore, and visualize datasets.
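As a small preview of what summarizing, exploring, and visualizing a data.frame can look like, here is a minimal sketch. It assumes the built-in `iris` dataset as a stand-in (it is not the tutorial's data) and uses only base R so that it runs without extra packages; the tutorial itself works with tidyverse functions instead.

```r
# Base R only; iris ships with R and is used here purely as a stand-in dataset.
data(iris)

# Explore: a data.frame is a table of rows and columns.
dims <- dim(iris)   # number of rows, number of columns
print(dims)
head(iris, 3)       # first three rows
str(iris)           # the type of each column

# Summarize: mean sepal length within each species.
sepal_by_species <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
print(sepal_by_species)

# Visualize: scatterplot of petal length against petal width.
plot(iris$Petal.Length, iris$Petal.Width,
     xlab = "Petal length (cm)", ylab = "Petal width (cm)")
```

The same three steps (look at the structure, compute a grouped summary, draw a plot) carry over to other data.frames, whichever functions are used to perform them.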
14.2.2 Activity
Estimated time: 50 min
14.2.2.1 Instructions
- Complete the “r4ds” LearnR tutorial.
- The “OCS – Global Diets” section is optional, and you do not need to complete all of the Exercises in 3.3.1.
- Answer the questions below.
14.2.2.2 Questions
1. data.frames – Data frames organize data into a 2-dimensional table of rows and columns, like a spreadsheet with some special restrictions. For the mpg data.frame, what do rows represent? What do columns represent?
2. Functions – Most R commands are functions (e.g. mean()), which take some input (a number, a vector, a data.frame…) and produce some output (a summary, a plot, a new vector…). Describe one function you learned about that you anticipate being useful in the future.
3. Common Problems – What are some common problems you ran into? Refer to Part 2. Troubleshooting if you’d like some inspiration for how to put your experience into words.
4. [optional] OCS – Global Diets Plotting – Which foods show a difference based on sex?
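To make the input-to-output pattern from question 2 concrete, here is a minimal sketch built around `mean()`. The variable names and values are invented for illustration only. It also shows one behavior that can be surprising at first: a missing value in the input makes the result missing unless `na.rm = TRUE` is set.

```r
# Hypothetical measurements, invented for illustration.
heights_cm <- c(150, 160, 170, 180)

# Input: a numeric vector. Output: a single number.
avg <- mean(heights_cm)
print(avg)

# A missing value (NA) in the input makes the result NA ...
with_gap <- c(heights_cm, NA)
avg_gap <- mean(with_gap)
print(avg_gap)

# ... unless missing values are explicitly removed first.
avg_clean <- mean(with_gap, na.rm = TRUE)
print(avg_clean)
```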
14.2.4 Footnotes
Resources
- Google Doc
- R cheat sheet
Contributions and Affiliations
- Katherine Cox, Johns Hopkins University
- Frederick Tan, Johns Hopkins University
Last Revised: February 2025