Data Analysis Using Hierarchical Generalized Linear Models with R, 1st Edition
Youngjo Lee
Lars Rönnegård
Maengseok Noh
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
Contents
List of notations
Preface
1 Introduction
1.1 Motivating examples
1.2 Regarding criticisms of the h-likelihood
1.3 R code
1.4 Exercises
3.4 Extended likelihood principle
3.5 Laplace approximations for the integrals
3.6 Street magician
3.7 H-likelihood and empirical Bayes
3.8 Exercises
5 HGLMs modeling in R
5.1 Examples
5.2 R code
5.3 Exercises
References
Symbol    Description
y         Response vector
X         Model matrix for fixed effects
Z         Model matrix for random effects
β         Fixed effects
v         Random effect on canonical scale
u         Random effect on the original scale
n         Number of observations
m         Number of levels in the random effect
h(·)      Hierarchical log-likelihood
L(·; ·)   Likelihood, with notation L(parameters; data)
fθ(y)     Density function for y having parameters θ
θ         A generic parameter indicating any fixed effect to be estimated
φ         Dispersion component for the mean model
λ         Dispersion component for the random effects
g(·)      Link function for the linear predictor
r(·)      Link function for random effects
η         Linear predictor in a GLM
µ         Expectation of y
s         Linearized working response in IWLS
V         Marginal variance matrix used in linear mixed models
V(·)      GLM variance function
I(·)      Information matrix
pd        Estimated number of parameters
T         Transpose
δ         Augmented effect vector
γ         Regression coefficient for dispersion
Preface
follow different distributions, which can also fit factor and structural
equation models. The frailtyHL package is used for survival analysis
using frailty models, which extend Cox's proportional hazards model to
allow random effects. The jointdhglm package allows joint models for
HGLMs and survival time and competing risk models. In Chapter 10, we
introduce variable selection methods via random-effect models.
Furthermore, in Chapter 10 we study random-effect models with discrete
random effects, show that hypothesis testing can be expressed in terms
of prediction of discrete random effects (e.g., null or alternative),
and show how the h-likelihood gives a general extension of the
likelihood-ratio test to multiple testing. Model-checking techniques and
model-selection tools based on the h-likelihood modeling approach add
further insight to the data analysis.
It is an advantage to have studied linear mixed models and GLMs before
reading this book. Nevertheless, GLMs are briefly introduced in Chapter
2 together with a short review of GLM theory, for a reader who wishes
to freshen up on the topic. The majority of data sets used in the book
are available at URL
http://cran.r-project.org/package=mdhglm
for the R package mdhglm (Lee et al., 2016b). Several different examples
are presented throughout the book. The longitudinal epilepsy data
presented by Thall and Vail (1990) is used recurrently from Chapter 1
to Chapter 6, allowing the reader to follow the model development from
a basic GLM to a more advanced DHGLM.
We are grateful to Prof. Emmanuel Lesaffre, Dr. Smart Sarpong, Dr.
Ildo Ha, Dr. Moudud Alam, Dr. Xia Shen, Mr. Jengseop Han, Mr. Dae-
han Kim, Mr. Hyunseong Park and the late Dr. Marek Molas for their
numerous useful comments and suggestions.
1 Introduction
iii) we can use model-checking tools for linear regression and generalized
linear models (GLMs), making assumptions in all parts of an HGLM
checkable.
The marginal likelihood is used for inference on the fixed effects in
both the classical frequentist and h-likelihood approaches, but it
involves multiple integration over the random effects, which is most
often not feasible to compute. In such cases the h-likelihood approach
uses the adjusted profile h-likelihood, a Laplace approximation of the
marginal likelihood. Because the random effects are integrated out of a
marginal likelihood, the classical frequentist method does not allow
any direct inference on random effects.
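To make the idea of a Laplace approximation of the marginal likelihood concrete, the following is a minimal sketch in base R and not the algorithm used by any of the packages discussed in this book. It assumes a toy model with a single cluster: Poisson counts with log mean beta0 + v and one normal random effect v with variance lambda. The data values, beta0, and lambda are made-up illustrative numbers. With only one random effect the integral can also be computed numerically, so the two results can be compared.

```r
# Toy data: four counts from one cluster (made-up numbers, not from the book).
y      <- c(3, 5, 2, 6)
beta0  <- 1.0   # fixed intercept, treated as known for this illustration
lambda <- 0.5   # variance of the random effect, treated as known

# Joint log-density of the data and the random effect, as a function of v:
# log f(y | v) + log f(v). Vectorized over v because integrate() needs that.
h <- function(v) {
  sapply(v, function(vi) {
    sum(dpois(y, lambda = exp(beta0 + vi), log = TRUE)) +
      dnorm(vi, mean = 0, sd = sqrt(lambda), log = TRUE)
  })
}

# Maximize h over v.
v_hat <- optimize(h, interval = c(-5, 5), maximum = TRUE)$maximum
h_max <- h(v_hat)

# Laplace approximation: h at its maximum, adjusted by the curvature.
# For this model minus the second derivative of h is available in closed form.
curvature <- length(y) * exp(beta0 + v_hat) + 1 / lambda
laplace   <- h_max - 0.5 * log(curvature / (2 * pi))

# Log marginal likelihood by one-dimensional numerical integration.
# The integrand is rescaled by exp(-h_max) to keep it of order one.
exact <- h_max +
  log(integrate(function(v) exp(h(v) - h_max), lower = -5, upper = 5)$value)

c(exact = exact, laplace = laplace)
```

With several correlated random effects the numerical integral is no longer one-dimensional, which is where an approximation of this kind becomes useful.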
Bayesians assume priors for parameters, and for inference they often
rely on Markov Chain Monte Carlo (MCMC) computations (Lesaffre and
Lawson, 2012). The h-likelihood allows complex models to be fitted by
maximizing likelihoods for fixed unknown parameters. So for a person
who does not wish to express prior beliefs, there are both philosophical
and computational advantages of using HGLMs. However, this book does
not focus on the philosophical advantages of using the h-likelihood but
rather on its practical advantages for applied users who have a
reasonable statistical background in linear models and GLMs and want to
enhance their data analysis skills for more general types of data.
In this chapter we introduce a few examples to show the strength of
the h-likelihood approach. Table 1.1 lists the classes of models, the
chapters where they are first introduced, and the available R packages.
Various packages have been developed to cover these model classes. From
Table 1.1, we see that the dhglm package has been developed to cover a
wide class of models, from GLMs to DHGLMs. A detailed description of
the dhglm package is presented in Chapter 7, where the full structure
of the model classes is described.
Figure 1.1 shows the evolution of the model classes presented in this
book together with their acronyms. Figure 1.1 also shows the
building-block structure of the h-likelihood; once you have captured
the ideas at one level you can go to a deeper level of modeling. This
book aims to show how complicated statistical model classes can be
built by combining interconnected GLMs and augmented GLMs, and how
inference can be made in the single framework of the h-likelihood. For
readers who want a more detailed description of theories and algorithms
on HGLMs and on survival analysis, we suggest the monographs by Lee,
Nelder, and Pawitan (2017) and Ha, Jeong, and Lee (2017). This book
shows how to analyze examples in these two books using available R
packages, and we also include new examples.
[Figure 1.1: diagram of the model classes. Recoverable box labels:
Generalized linear model; Joint GLM (Ch 4), a generalized linear model
including a dispersion model with fixed effects; Generalized linear
mixed model (GLMM, Ch 4-5), a generalized linear model including
Gaussian random effects; Factor analysis (Ch 7); Multiple testing
(Ch 10); Multivariate DHGLM (MDHGLM, Ch 7), a DHGLM including outcomes
from several distributions.]
Table 1.1 Model classes presented in the book including chapter number and available R packages

Model class (reference)                              R package        Developer                     Chapter
GLM                                                  glm() function                                 2
  (Nelder and Wedderburn, 1972)                      dhglm            Lee and Noh (2016)
Joint GLM (Nelder and Lee, 1991)                     dhglm            Lee and Noh (2016)            4
GLMM                                                 dhglm            Lee and Noh (2016)            4, 5
  (Breslow and Clayton, 1993)                        lme4             Bates and Maechler (2009)
                                                     hglm             Alam et al. (2015)
HGLM                                                 dhglm            Lee and Noh (2016)            2, 3, 4, 5
  (Lee and Nelder, 1996)                             hglm             Alam et al. (2015)
Spatial HGLM                                         dhglm            Lee and Noh (2016)            5
  (Lee and Nelder, 2001b)                            spaMM            Rousset et al. (2016)
DHGLM (Lee and Nelder, 2006; Noh and Lee, 2017)      dhglm            Lee and Noh (2016)            6
Multivariate DHGLM                                   mdhglm           Lee, Molas, and Noh (2016b)   7
  (Lee, Molas, and Noh, 2016a;                       mixAK            Komarek (2015)
  Lee, Nelder, and Pawitan, 2017)                    mmm              Asar and Ilk (2014)
Frailty HGLM                                         frailtyHL        Ha et al. (2012)              8
  (Ha, Lee, and Song, 2001)                          coxme            Therneau (2015)
                                                     survival         Therneau and Lumley (2015)
Joint DHGLM                                          jointdhglm       Ha, Lee, and Noh (2015)       8
  (Henderson et al., 2000)                           JM               Rizopoulos (2015)
1.1 Motivating examples
Thall and Vail (1990) presented longitudinal data from a clinical trial of
59 epileptics, who were randomized to a new drug or a placebo (T=1 or
T=0). Baseline data were available at the start of the trial; these
included the logarithm of the average number of epileptic seizures
recorded in the 8-week period preceding the trial (B) and the logarithm
of age (A). The clinic visit number (V) enters as a linear trend, coded
(-3,-1,1,3). A multivariate response variable (y) consists of the
seizure counts during the 2-week periods before each of four visits to
the clinic.
The data can be retrieved from the R package dhglm (see R code at the
end of the chapter). It is a good idea at this stage to have a look at
and get acquainted with the data. The boxplot of the number of seizures
(Figure 1.2) shows no clear difference between the two treatment
groups. In Figure 1.3 the number of seizures per visit is plotted for
each patient, where the lines in this spaghetti plot indicate
longitudinal patient effects. Investigate the data further before
running the models below.
A simple first preliminary analysis could be to analyze the data (ignoring
that there are repeated measurements on each patient) with a GLM
having a Poisson distributed response using the R function glm. Let yij
be the corresponding response variable for patient i(= 1, · · · , 59) and
visit j(= 1, · · · , 4). We consider a Poisson GLM with log-link function
modeled as
log(µij ) = β0 + xBi βB + xTi βT + xAi βA + xVj βV + xBi Ti βBT , (1.1)
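where β0, βB, βT, βA, βV, and βBT are fixed effects for the intercept,
the log baseline seizure count, treatment, the logarithm of age, visit,
and the baseline-by-treatment interaction, respectively.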
[Figure 1.2: x-axis Treatment (0, 1); y-axis log(Seizure counts + 1)]
Figure 1.2 Boxplot of the logarithm of seizure counts (new drug = 1, placebo
= 0).
Call:
glm(formula=y~B+T+A+B:T+V,family=poisson(link=log),
data=epilepsy)
[Figure 1.3: x-axis Visit; y-axis log(Seizure counts + 1)]
Figure 1.3 Number of seizures per visit for each patient. There are 59 patients
and each line shows the logarithm of seizure counts + 1 for each patient.
Deviance Residuals:
Min 1Q Median 3Q Max
-5.0677 -1.4468 -0.2655 0.8164 11.1387
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.79763 0.40729 -6.869 6.47e-12 ***
B 0.94952 0.04356 21.797 < 2e-16 ***
T -1.34112 0.15674 -8.556 < 2e-16 ***
A 0.89705 0.11644 7.704 1.32e-14 ***
V -0.02936 0.01014 -2.895 0.00379 **
B:T 0.56223 0.06350 8.855 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Random effects:
Groups Name Variance Std.Dev.
patient (Intercept) 0.2515 0.5015
Number of obs: 236, groups: patient, 59
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.37817 1.17746 -1.170 0.24181
B 0.88442 0.13075 6.764 1.34e-11 ***
T -0.93291 0.39883 -2.339 0.01933 *
A 0.48450 0.34584 1.401 0.16123
V -0.02936 0.01009 -2.910 0.00362 **
B:T 0.33827 0.20247 1.671 0.09477 .
The output gives a variance for the random patient effect equal to 0.25.
In GLMMs the random effects are always assumed normally distributed.
The same model can be fitted within the h-likelihood framework using
the R package hglm, which gives the output (essential parts of the output
shown):
Call:
hglm2.formula(meanmodel=y~B+T+A+B:T+V+(1|patient),
data=epilepsy,family=poisson(link = log), fix.disp = 1)
----------
MEAN MODEL
----------
----------------
DISPERSION MODEL
----------------
The output gives a variance for the random patient effect equal to 0.27,
similar to that from the lme4 package. In Chapter 4, we compare
similarities and differences between HGLM estimates and lme4. Unlike
with the lme4 package, however, it is also possible to add non-Gaussian
random effects to further model the over-dispersion. With
marginal-likelihood inferences, subject-specific inferences cannot be
made. For subject-specific inferences, we need to estimate the random
patient effects vi via the h-likelihood.
With the dhglm package it is possible to allow different distributions
for different random effects. For example, in the previous GLMM, a
gamma distributed random effect can be included for each observation,
which gives a conditional distribution of the response that can be
shown to be negative binomial, while patient effects are modeled as
normally distributed:
E(yij | vi, vij) = µij and var(yij | vi, vij) = µij,
with log-link function modeled as
log(µij) = β0 + xBi βB + xTi βT + xAi βA + xVj βV + xBi Ti βBT + vi + vij, (1.3)
where vi ∼ N(0, λ1), uij = exp(vij) ∼ G(λ2) and G(λ2) is a gamma
distribution with E(uij) = 1 and var(uij) = λ2. Then, this model is
equivalent to the negative binomial HGLM such that the conditional
distribution of yij | vi is the negative binomial distribution with the
probability function

$$
\binom{y_{ij} + 1/\lambda_2 - 1}{1/\lambda_2 - 1}
\left(\frac{\lambda_2}{1 + \mu^*_{ij}\lambda_2}\right)^{y_{ij}}
\left(\frac{1}{1 + \mu^*_{ij}\lambda_2}\right)^{1/\lambda_2}
\left(\mu^*_{ij}\right)^{y_{ij}},
$$
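The Poisson-gamma construction can be checked by simulation. The sketch below uses base R only and assumes that µ*ij denotes the conditional mean of yij given the patient effect vi (that definition is not shown in the text here). The values of lambda2 and mu_star are illustrative, not estimates from the epilepsy data. It draws gamma random effects with mean 1 and variance λ2, draws Poisson counts given those effects, and compares the empirical distribution with the negative binomial probability function as computed by dnbinom().

```r
set.seed(1)
lambda2 <- 0.5     # variance of the gamma random effect u (illustrative value)
mu_star <- 3       # conditional mean of y given the patient effect (illustrative)
n_sim   <- 200000  # number of simulated observations

# u ~ gamma with E(u) = 1 and var(u) = lambda2:
# shape = 1 / lambda2 and scale = lambda2.
u <- rgamma(n_sim, shape = 1 / lambda2, scale = lambda2)

# Conditional on u, the response is Poisson with mean mu_star * u.
y <- rpois(n_sim, lambda = mu_star * u)

# Empirical probabilities of y = 0, ..., 10 versus the negative binomial
# probability function with size 1 / lambda2 and mean mu_star.
k <- 0:10
empirical   <- sapply(k, function(j) mean(y == j))
theoretical <- dnbinom(k, size = 1 / lambda2, mu = mu_star)
round(cbind(k, empirical, theoretical), 4)

# The marginal variance exceeds the Poisson variance: mu + lambda2 * mu^2.
c(sample_var = var(y), nb_var = mu_star + lambda2 * mu_star^2)
```

The last line shows the over-dispersion directly: the variance is larger than the mean by the term lambda2 * mu_star^2.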
Call:
hglm2.formula(meanmodel=y~B+T+A+B:T+V+(1|id)+
(1|patient),data=epilepsy,family=poisson(link=log),
rand.family=list(Gamma(link=log),gaussian()),
fix.disp = 1)
----------
MEAN MODEL
----------
----------------
DISPERSION MODEL
----------------
The variance component for the random patient effect is now 0.24, with
the additional saturated random effects picking up some of the
over-dispersion. By adding gamma saturated random effects we can model
over-dispersion (extra-Poisson variation). The example shows how
modeling with GLMMs can be further extended using HGLMs. This can be
viewed as an extension of Poisson GLMMs to negative-binomial GLMMs,
where repeated observations on each patient follow the negative-binomial
distribution rather than the Poisson distribution.
Dispersion modeling is an important but often challenging task in
statistics. Even for simple models without random effects, iterative
algorithms are needed to compute maximum likelihood (ML) estimates. An
example is the heteroscedastic linear model
y ∼ N(Xβ, exp(Xd βd))
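One way to see why iteration is needed is the following minimal sketch in base R. It is an assumption-laden illustration on simulated data, not necessarily the algorithm used in the book or in the dhglm package. It alternates between two fits: weighted least squares for β given the current variances, and a gamma GLM with log link for βd fitted to the squared residuals. The gamma log-link estimating equations coincide with the ML score equations for βd in this normal model, so at convergence the scheme gives plain ML estimates without any REML-type adjustment. All variable names and true parameter values are hypothetical.

```r
set.seed(2)
n  <- 2000
x  <- runif(n)          # covariate in the mean model
xd <- runif(n)          # covariate in the dispersion model
beta   <- c(1, 2)       # true mean coefficients (illustrative)
beta_d <- c(-1, 1.5)    # true log-variance coefficients (illustrative)
y <- beta[1] + beta[2] * x +
  rnorm(n, sd = sqrt(exp(beta_d[1] + beta_d[2] * xd)))

phi <- rep(1, n)        # starting values for the variances
for (iter in 1:50) {
  # Mean model: weighted least squares with weights 1 / variance.
  fit_mean <- lm(y ~ x, weights = 1 / phi)

  # Dispersion model: gamma GLM with log link on the squared residuals.
  d <- (y - fitted(fit_mean))^2
  fit_disp <- glm(d ~ xd, family = Gamma(link = "log"))

  # Stop when the fitted log-variances no longer change.
  phi_new   <- fitted(fit_disp)
  converged <- max(abs(log(phi_new) - log(phi))) < 1e-8
  phi <- phi_new
  if (converged) break
}

coef(fit_mean)   # estimates of beta
coef(fit_disp)   # estimates of beta_d
```

Each step is an ordinary GLM fit, which is the sense in which interconnected GLMs can serve as building blocks for a joint model of mean and dispersion.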
[Figure 1.4: two histograms; y-axis Frequency; x-axis of the first panel
Litter Size]
Figure 1.4 Distributions of observed litter sizes and the number of observations
per sow.
Likelihood is used in both the frequentist and Bayesian worlds, and it
has been the central concept in statistical modeling and inference for
almost a century. However, classical likelihood and frequentist methods
cannot make inferences about random unknowns, whereas Bayesian methods
do not make inferences about fixed unknowns. The h-likelihood aims to
allow inferences for both fixed and random unknowns and could cover
both worlds (Figure 1.5).
The concept of the h-likelihood has received criticism since the first
publication of Lee and Nelder (1996). The criticism was partly motivated
by the fact that the theory in Lee and Nelder (1996) was not fully
developed. However, these question marks have been clarified in later
papers by Lee, Nelder and co-workers. One of the main concerns in the
1990s was the similarity with penalized quasi-likelihood (PQL) for
GLMMs of Breslow and Clayton (1993), which has large biases for binary
data. PQL estimation for GLMMs is implemented in, e.g., the R function
glmmPQL in the MASS package. Early h-likelihood methodology was
criticized because of non-ignorable biases in binary data. These biases
can be eliminated by improved approximations of the marginal likelihood
through higher-order Laplace approximations (Lee, Nelder, and Pawitan,
2017).
Meng (2009, 2010) established Bartlett-like identities for h-likelihood.
That is, the score for parameters and unobservables has zero
expectation, and the variance of the score is the expected negative
Hessian under easily verifiable conditions. However, Meng noted
difficulties in inferences about unobservables: neither the consistency
nor the asymptotic normality for parameter estimation generally holds
for unobservables. Thus, Meng (2009, 2010) conjectured that an attempt
to make probability statements about unobservables without using a
prior would be in vain. Paik et al. (2015) studied the summarizability
of h-likelihood estimators, and Lee and Kim (2016) showed how to make
probability statements about unobservables in general without assuming
a prior, as we shall see.
The h-likelihood approach is a genuine approach based on the extended
likelihood principle. This likelihood principle is mathematical theory,
so there should be no controversy about its validity. However, it does
not tell how to use the extended likelihood for statistical inference.
Another important question, which has been asked and answered, is: For
which family of models can the joint maximization of the h-likelihood
be applied? This is a well-motivated question, since there are numerous
examples where joint maximization of an extended likelihood containing
both fixed effects β and random effects v gives nonsense estimates (see
Lee and Nelder, 2009). Such examples use the extended likelihood for
joint maximization of both β and v. If the h-likelihood h, defined in
Chapter 2, is jointly maximized for estimating both β and v, such
nonsense estimates disappear. However, consistent estimates for the
fixed effects can only be guaranteed for a rather limited class of
models, including linear mixed models and Poisson HGLMs having gamma
random effects on a log scale. Thus, as long as the marginal likelihood
is used to estimate β and the h-likelihood h is maximized to estimate
the random effects, these examples do not give contradictory results.
Lee and Nelder have shown in a series of papers that the iterative
weighted least squares (IWLS) algorithm for GLMs can be extended to a
general class of models including HGLMs. It is computationally efficient
and therefore potentially very useful in statistical applications,
allowing analysis of more and more complex models (Figure 1.1). For
linear mixed models, there is nothing controversial about this algorithm
because it can be shown to give BLUP. Neither is it controversial for
GLMs with random effects in general, because the adjusted profile
h-likelihoods (defined in Chapter 2) simply are approximations of
marginal and restricted likelihoods for estimating fixed effects and
variance components, and as such
1.3 R code
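# Epilepsy data (Thall and Vail, 1990), retrieved from the dhglm package
library(dhglm)
data(epilepsy)
# Poisson GLM (1.1), ignoring the repeated measurements on each patient
model1 <- glm(y~B+T+A+B:T+V,family=poisson(link=log),
data=epilepsy)
summary(model1)
# Poisson GLMM with a normally distributed random effect for each patient
library(lme4)
model2 <- glmer(y~B+T+A+B:T+V+(1|patient),
family=poisson(link=log),data=epilepsy)
summary(model2)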