The R procedures and datasets provided here correspond to many of the examples discussed in R.K. Pearson, Exploring Data in Engineering, the Sciences, and Medicine. The R procedures are provided as text files (.txt) that may be copied and pasted into an interactive R session, and the datasets are provided as comma-separated value (.csv) files. These files can be read into R with the read.csv function, or they may be examined by opening them in Microsoft Excel. The R procedures are built on functions available in base R and the add-on packages designated as recommended; they do not require any other add-on packages. They were implemented in R version 2.11.1, installed as binary files in a Microsoft Windows environment.
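As a minimal sketch of the read.csv step mentioned above, the following writes a small stand-in file and reads it back. The file contents, the column names (year, count), and the variable names csv_path and example_data are placeholders invented for illustration; they are not taken from any of the datasets provided here.

```r
# Hypothetical stand-in for a downloaded .csv file. The column names and
# values are placeholders, not the contents of any dataset on this page.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("year,count", "1901,3", "1902,5", "1903,2"), csv_path)

# read.csv() treats the first line as column names by default (header = TRUE)
# and returns a data frame.
example_data <- read.csv(csv_path)

str(example_data)      # column types as R inferred them
head(example_data)     # first few rows
summary(example_data)  # per-column summary statistics
```

For one of the files provided here, the same call should apply with the file's own path in place of csv_path, for example read.csv("horsekick.csv"), assuming the file has been saved in the current working directory.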
Versions of a number of these datasets are also available as built-in datasets in various R packages (e.g., the von Bortkewitsch horsekick deaths data is available in the R add-on package vcd as the dataset VonBort). In addition, three of the datasets provided here (federalist.csv, horsekick.csv, and bitterpit.csv) were constructed from datasets described in the book Data by D.F. Andrews and A.M. Herzberg (Springer-Verlag, New York, 1985) and available from the following website:
http://lib.stat.cmu.edu/datasets/Andrews/
Similarly, the datasets mushroom.csv and pima.csv were constructed from datasets available from the UCI Machine Learning Repository (Frank, A. & Asuncion, A. (2010). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science).
Chapter 1 – The Art of Analyzing Data
Chapter 2 – Data: Types, Uncertainty, and Quality
Chapter 3 – Characterizing Categorical Variables
Chapter 4 – Uncertainty in Real Variables
Chapter 5 – Fitting Straight Lines
Chapter 6 – A Brief Introduction to Estimation Theory
Chapter 7 – Outliers: Distributional Monsters (?) That Lurk in Data
Chapter 8 – Characterizing a Dataset
Chapter 9 – Confidence Intervals and Hypothesis Testing
Chapter 10 – Relations among Variables
Chapter 11 – Regression Models I: Real Data
Chapter 12 – Reexpression: Data Transformations
Chapter 13 – Regression Models II: Mixed Data Types
Chapter 14 – Characterizing Analysis Results
Chapter 15 – Regression Models III: Diagnostics and Refinements
Chapter 16 – Dealing with Missing Data