Introduction to Statistical Thought
Michael Lavine
November 11, 2007
Copyright © 2005 by Michael Lavine
CONTENTS
List of Figures vi
List of Tables x
Preface xi
1 Probability 1
1.1 Basic Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Probability Densities . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Parametric Families of Distributions . . . . . . . . . . . . . . . . . . . 14
1.3.1 The Binomial Distribution . . . . . . . . . . . . . . . . . . . . 14
1.3.2 The Poisson Distribution . . . . . . . . . . . . . . . . . . . . . 17
1.3.3 The Exponential Distribution . . . . . . . . . . . . . . . . . . 20
1.3.4 The Normal Distribution . . . . . . . . . . . . . . . . . . . . . 22
1.4 Centers, Spreads, Means, and Moments . . . . . . . . . . . . . . . . 29
1.5 Joint, Marginal and Conditional Probability . . . . . . . . . . . . . . 40
1.6 Association, Dependence, Independence . . . . . . . . . . . . . . . . 51
1.7 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
1.7.1 Calculating Probabilities . . . . . . . . . . . . . . . . . . . . . 57
1.7.2 Evaluating Statistical Procedures . . . . . . . . . . . . . . . . 61
1.8 R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
1.9 Some Results for Large Samples . . . . . . . . . . . . . . . . . . . . . 77
1.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
2 Modes of Inference 93
2.1 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
2.2 Data Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
2.2.1 Summary Statistics . . . . . . . . . . . . . . . . . . . . . . . . 95
2.2.2 Displaying Distributions . . . . . . . . . . . . . . . . . . . . . 100
2.2.3 Exploring Relationships . . . . . . . . . . . . . . . . . . . . . 113
2.3 Likelihood . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
2.3.1 The Likelihood Function . . . . . . . . . . . . . . . . . . . . . 132
2.3.2 Likelihoods from the Central Limit Theorem . . . . . . . . . . 139
2.3.3 Likelihoods for several parameters . . . . . . . . . . . . . . . 144
2.4 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
2.4.1 The Maximum Likelihood Estimate . . . . . . . . . . . . . . . 154
2.4.2 Accuracy of Estimation . . . . . . . . . . . . . . . . . . . . . . 155
2.4.3 The sampling distribution of an estimator . . . . . . . . . . . 158
2.5 Bayesian Inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
2.6 Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
2.7 Hypothesis Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
2.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
3 Regression 202
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
3.2 Normal Linear Models . . . . . . . . . . . . . . . . . . . . . . . . . . 210
3.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
3.2.2 Inference for Linear Models . . . . . . . . . . . . . . . . . . . 221
3.3 Generalized Linear Models . . . . . . . . . . . . . . . . . . . . . . . . 236
3.3.1 Logistic Regression . . . . . . . . . . . . . . . . . . . . . . . . 236
3.3.2 Poisson Regression . . . . . . . . . . . . . . . . . . . . . . . . 245
3.4 Predictions from Regression . . . . . . . . . . . . . . . . . . . . . . . 250
3.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
4 More Probability 263
4.1 More Probability Density . . . . . . . . . . . . . . . . . . . . . . . . . 263
4.2 Random Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
4.2.1 Densities of Random Vectors . . . . . . . . . . . . . . . . . . . 265
4.2.2 Moments of Random Vectors . . . . . . . . . . . . . . . . . . 266
4.2.3 Functions of Random Vectors . . . . . . . . . . . . . . . . . . 266
4.3 Representing Distributions . . . . . . . . . . . . . . . . . . . . . . . . 271
4.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
5 Special Distributions 279
5.1 Binomial and Negative Binomial . . . . . . . . . . . . . . . . . . . . 279
5.2 Multinomial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
5.3 Poisson . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292
5.4 Uniform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
5.5 Gamma, Exponential, Chi Square . . . . . . . . . . . . . . . . . . . . 305
5.6 Beta . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
5.7 Normal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
5.7.1 The Univariate Normal Distribution . . . . . . . . . . . . . . . 315
5.7.2 The Multivariate Normal Distribution . . . . . . . . . . . . . . 320
5.8 t and F . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
5.8.1 The t distribution . . . . . . . . . . . . . . . . . . . . . . . . . 328
5.8.2 The F distribution . . . . . . . . . . . . . . . . . . . . . . . . 334
5.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
6 More Models 342
6.1 Hierarchical Models . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
6.2 Time Series and Markov Chains . . . . . . . . . . . . . . . . . . . . . 343
6.3 Contingency Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
6.4 Survival analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
6.5 The Poisson process . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
6.6 Change point models . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
6.7 Spatial models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
6.8 Point Process Models . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
6.9 Evaluating and enhancing models . . . . . . . . . . . . . . . . . . . . 358
6.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
7 Mathematical Statistics 360
7.1 Properties of Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . 360
7.1.1 Sufficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
7.1.2 Consistency, Bias, and Mean-squared Error . . . . . . . . . . . 363
7.1.3 Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
7.1.4 Asymptotic Normality . . . . . . . . . . . . . . . . . . . . . . 365
7.1.5 Robustness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
7.2 Transformations of Parameters . . . . . . . . . . . . . . . . . . . . . 365
7.3 Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
7.4 More Hypothesis Testing . . . . . . . . . . . . . . . . . . . . . . . . . 365
7.4.1 p values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
7.4.2 The Likelihood Ratio Test . . . . . . . . . . . . . . . . . . . . 366
7.4.3 The Chi Square Test . . . . . . . . . . . . . . . . . . . . . . . 366
7.4.4 Power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
7.5 Exponential families . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
7.6 Location and Scale Families . . . . . . . . . . . . . . . . . . . . . . . 366
7.7 Functionals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
7.8 Invariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
7.9 Asymptotics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
7.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
Bibliography 375
LIST OF FIGURES
1.1 pdf for time on hold at Help Line . . . . . . . . . . . . . . . . . . . . 7
1.2 $p_Y$ for the outcome of a spinner . . . . . . . . . . . . . . . . . . . . . 9
1.3 (a): Ocean temperatures; (b): Important discoveries . . . . . . . . . 11
1.4 Change of variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.5 Binomial probabilities . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.6 P[X = 3|λ] as a function of λ . . . . . . . . . . . . . . . . . . . . . . 19
1.7 Exponential densities . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.8 Normal densities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.9 Ocean temperatures at 45°N, 30°W, 1000m depth . . . . . . . . . . . 25
1.10 Normal samples and Normal densities . . . . . . . . . . . . . . . . . 27
1.11 hydrographic stations off the coast of Europe and Africa . . . . . . . 31
1.12 Water temperatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
1.13 Two pdf’s with ±1 and ±2 SD’s. . . . . . . . . . . . . . . . . . . . . . 37
1.14 Water temperatures with standard deviations . . . . . . . . . . . . . 41
1.15 Permissible values of N and X . . . . . . . . . . . . . . . . . . . . . . 44
1.16 Features of the joint distribution of (X,Y) . . . . . . . . . . . . . . . 48
1.17 Lengths and widths of sepals and petals of 150 iris plants . . . . . . . 52
1.18 correlations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
1.19 1000 simulations of $\hat{\theta}$ for n.sim = 50, 200, 1000 . . . . . . . . . . . 60
1.20 1000 simulations of $\hat{\theta}$ under three procedures . . . . . . . . . . . . . 64
1.21 Monthly concentrations of CO$_2$ at Mauna Loa . . . . . . . . . . . . . 66
1.22 1000 simulations of a FACE experiment . . . . . . . . . . . . . . . . . 69
1.23 Histograms of craps simulations . . . . . . . . . . . . . . . . . . . . . 82
2.1 quantiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
2.2 Histograms of tooth growth . . . . . . . . . . . . . . . . . . . . . . . 101
2.3 Histograms of tooth growth . . . . . . . . . . . . . . . . . . . . . . . 102
2.4 Histograms of tooth growth . . . . . . . . . . . . . . . . . . . . . . . 103
2.5 calorie contents of beef hot dogs . . . . . . . . . . . . . . . . . . . . 107
2.6 Strip chart of tooth growth . . . . . . . . . . . . . . . . . . . . . . . . 110
2.7 Quiz scores from Statistics 103 . . . . . . . . . . . . . . . . . . . . . 112
2.8 QQ plots of water temperatures (°C) at 1000m depth . . . . . . . . . 114
2.9 Mosaic plot of UCBAdmissions . . . . . . . . . . . . . . . . . . . . . . 118
2.10 Mosaic plot of UCBAdmissions . . . . . . . . . . . . . . . . . . . . . . 119
2.11 Old Faithful data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
2.12 Waiting time versus duration in the Old Faithful dataset . . . . . . . 123
2.13 Time series of duration and waiting time at Old Faithful . . . . . . . 124
2.14 Time series of duration and waiting time at Old Faithful . . . . . . . 125
2.15 Temperature versus latitude for different values of longitude . . . . . 128
2.16 Temperature versus longitude for different values of latitude . . . . . 129
2.17 Spike train from a neuron during a taste experiment. The dots show
the times at which the neuron fired. The solid lines show times at
which the rat received a drop of a .3 M solution of NaCl. . . . . . . . 130
2.18 Likelihood function for the proportion of red cars . . . . . . . . . . . 134
2.19 $\ell(\theta)$ after $\sum y_i = 40$ in 60 quadrats. . . . . . . . . . . . . . . . . . . . 137
2.20 Likelihood for Slater School . . . . . . . . . . . . . . . . . . . . . . . 138
2.21 Marginal and exact likelihoods for Slater School . . . . . . . . . . . . 141
2.22 Marginal likelihood for mean CEO salary . . . . . . . . . . . . . . . . 143
2.23 FACE Experiment: data and likelihood . . . . . . . . . . . . . . . . . 146
2.24 Likelihood function for Quiz Scores . . . . . . . . . . . . . . . . . . . 149
2.25 Log of the likelihood function for $(\lambda, \theta_f)$ in Example 2.13 . . . . . . . 152
2.26 Likelihood function for the probability of winning craps . . . . . . . 157
2.27 Sampling distribution of the sample mean and median . . . . . . . . 160
2.28 Histograms of the sample mean for samples from Bin(n,.1) . . . . . . 162
2.29 Prior, likelihood and posterior in the seedlings example . . . . . . . . 169
2.30 Prior, likelihood and posterior densities for λ with n = 1,4,16 . . . . 171
2.31 Prior, likelihood and posterior densities for λ with n = 60 . . . . . . . 172
2.32 Prior, likelihood and posterior density for Slater School . . . . . . . . 173
2.33 Plug-in predictive distribution for seedlings . . . . . . . . . . . . . . 176
2.34 Predictive distributions for seedlings after n = 0,1,60 . . . . . . . . . 179
2.35 pdf of the Bin(100,.5) distribution . . . . . . . . . . . . . . . . . . . . 184
2.36 pdfs of the Bin(100,.5) (dots) and N(50,5) (line) distributions . . . . 185
2.37 Approximate density of summary statistic t . . . . . . . . . . . . . . . 186
2.38 Number of times baboon father helps own child . . . . . . . . . . . . 190
2.39 Histogram of simulated values of w.tot . . . . . . . . . . . . . . . . . 191
3.1 Four regression examples . . . . . . . . . . . . . . . . . . . . . . . . 203
3.2 1970 draft lottery. Draft number vs. day of year . . . . . . . . . . . . 206
3.3 Draft number vs. day of year with smoothers . . . . . . . . . . . . . . 207
3.4 Total number of New seedlings 1993 – 1997, by quadrat. . . . . . . . 209
3.5 Calorie content of hot dogs . . . . . . . . . . . . . . . . . . . . . . . 211
3.6 Density estimates of calorie contents of hot dogs . . . . . . . . . . . . 213
3.7 The PlantGrowth data . . . . . . . . . . . . . . . . . . . . . . . . . . 215
3.8 Ice cream consumption versus mean temperature . . . . . . . . . . . 222
3.9 Likelihood functions for $(\mu, \delta_M, \delta_P)$ in the Hot Dog example. . . . . . 228
3.10 pairs plot of the mtcars data . . . . . . . . . . . . . . . . . . . . . . 230
3.11 mtcars — various plots . . . . . . . . . . . . . . . . . . . . . . . . . . 233
3.12 likelihood functions for $\beta_1$, $\gamma_1$, $\delta_1$ and $\delta_2$ in the mtcars example. . . . 235
3.13 Pine cones and O-rings . . . . . . . . . . . . . . . . . . . . . . . . . . 238
3.14 Pine cones and O-rings with regression curves . . . . . . . . . . . . . 239
3.15 Likelihood function for the pine cone data . . . . . . . . . . . . . . . 242
3.16 Actual vs. fitted and residuals vs. fitted for the seedling data . . . . . 247
3.17 Diagnostic plots for the seedling data . . . . . . . . . . . . . . . . . . 249
3.18 Actual mpg and fitted values from three models . . . . . . . . . . . . 251
3.19 Happiness Quotient of bankers and poets . . . . . . . . . . . . . . . . 256
4.1 The $(X_1, X_2)$ plane and the $(Y_1, Y_2)$ plane . . . . . . . . . . . . . . . . 270
4.2 pmf’s, pdf’s, and cdf’s . . . . . . . . . . . . . . . . . . . . . . . . . . 272
5.1 The Binomial pmf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
5.2 The Negative Binomial pmf . . . . . . . . . . . . . . . . . . . . . . . 289
5.3 Poisson pmf for λ = 1,4,16,64 . . . . . . . . . . . . . . . . . . . . . . 295
5.4 Rutherford and Geiger’s Figure 1 . . . . . . . . . . . . . . . . . . . . 300
5.5 Numbers of firings of a neuron in 150 msec after five different
tastants. Tastants: 1=MSG .1M; 2=MSG .3M; 3=NaCl .1M; 4=NaCl
.3M; 5=water. Panels: A: A stripchart. Each circle represents one
delivery of a tastant. B: A mosaic plot. C: Each line represents one
tastant. D: Likelihood functions. Each line represents one tastant. . 302
5.6 The line shows Poisson probabilities for λ = 0.2; the circles show the
fraction of times the neuron responded with 0, 1, ..., 5 spikes for
each of the five tastants. . . . . . . . . . . . . . . . . . . . . . . . . . 304
5.7 Gamma densities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
5.8 Exponential densities . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
5.9 Beta densities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
5.10 Water temperatures (°C) at 1000m depth . . . . . . . . . . . . . . . 316
5.11 Bivariate Normal density . . . . . . . . . . . . . . . . . . . . . . . . . 323
5.12 Bivariate Normal density . . . . . . . . . . . . . . . . . . . . . . . . . 325
5.13 t densities for four degrees of freedom and the N(0,1) density . . . . 333
6.1 Graphical representation of hierarchical model for fMRI . . . . . . . 343
6.2 Some time series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
6.3 $Y_{t+1}$ vs. $Y_t$ for the Beaver and Presidents data sets . . . . . . . . . . . 347
6.4 $Y_{t+k}$ vs. $Y_t$ for the Beaver data set and lags 0–5 . . . . . . . . . . . . 348
6.5 coplot of $Y_{t+1} \sim Y_{t-1} \mid Y_t$ for the Beaver data set . . . . . . . . . . . . 350
6.6 Fit of CO$_2$ data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
6.7 DAX closing prices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
6.8 DAX returns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
7.1 The Be(.39,.01) density . . . . . . . . . . . . . . . . . . . . . . . . . . 370
7.2 Densities of $\bar{Y}_{in}$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
7.3 Densities of $Z_{in}$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373