Mastering Health Data Science Using R

Author

Alice Paul

Published

2024

Preface

This book serves as an interactive introduction to R for public health and health data science students. Topics include data structures in R, exploratory analysis, distributions, hypothesis testing, regression analysis, and larger-scale programming with functions and control flow. The presentation assumes familiarity with the underlying methodology and focuses instead on how to use R to implement your analysis.

This book is written using Quarto Book. You can download the Quarto files used to generate this book or a corresponding Jupyter Notebook from the GitHub repository. The GitHub repository also contains a few cheat sheets.

This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

Acknowledgments

This book was written with the support of a Data Science Institute Seed Grant. Thanks to students Thomas Arnold, Hannah Eglinton, Jialin Liu, Joanna Walsh, and Xinbei Yu for their help and feedback. Please contact Dr. Paul (alice_paul@brown.edu) with questions, suggested edits, or feedback.

Corrections

Corrections will be made in the online version and listed here for those with the print copy. Thanks to student Gavin Schilling for finding the first correction!

  • Chapter 3.3: the assignment of the BACK column now uses pmax().