DATA ANALYSIS USING
HIERARCHICAL GENERALIZED
LINEAR MODELS WITH R

Youngjo Lee
Lars Rönnegård
Maengseok Noh
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2017 by Taylor & Francis Group, LLC


CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed on acid-free paper


Version Date: 20170502

International Standard Book Number-13: 978-1-138-62782-6 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and
publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc.
(CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization
that provides licenses and registration for a variety of users. For organizations that have been granted
a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and
are used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Contents

List of notations

Preface

1 Introduction
1.1 Motivating examples
1.2 Regarding criticisms of the h-likelihood
1.3 R code
1.4 Exercises

2 GLMs via iterative weighted least squares
2.1 Examples
2.2 R code
2.3 Fisher’s classical likelihood
2.4 Iterative weighted least squares
2.5 Model checking using residual plots
2.6 Hat values
2.7 Exercises

3 Inference for models with unobservables
3.1 Examples
3.2 R code
3.3 Likelihood inference for random effects
3.4 Extended likelihood principle
3.5 Laplace approximations for the integrals
3.6 Street magician
3.7 H-likelihood and empirical Bayes
3.8 Exercises

4 HGLMs: from method to algorithm
4.1 Examples
4.2 R code
4.3 IWLS algorithm for interconnected GLMs
4.4 IWLS algorithm for augmented GLM
4.5 IWLS algorithm for HGLMs
4.6 Estimation algorithm for a Poisson GLMM
4.7 Exercises

5 HGLMs modeling in R
5.1 Examples
5.2 R code
5.3 Exercises

6 Double HGLMs - using the dhglm package
6.1 Model description for DHGLMs
6.2 Examples
6.3 An extension of linear mixed models via DHGLM
6.4 Implementation in the dhglm package
6.5 Exercises

7 Fitting multivariate HGLMs
7.1 Examples
7.2 Implementation in the mdhglm package
7.3 Exercises

8 Survival analysis
8.1 Examples
8.2 Competing risk models
8.3 Comparison with alternative R procedures
8.4 H-likelihood theory for the frailty model
8.5 Running the frailtyHL package
8.6 Exercises

9 Joint models
9.1 Examples
9.2 H-likelihood approach to joint models
9.3 Exercises

10 Further topics
10.1 Examples
10.2 Variable selections
10.3 Examples
10.4 Hypothesis testing

References

Data Index

Author Index

Subject Index


List of notations

Symbol       Description
y            Response vector
X            Model matrix for fixed effects
Z            Model matrix for random effects
β            Fixed effects
v            Random effect on canonical scale
u            Random effect on the original scale
n            Number of observations
m            Number of levels in the random effect
h(·)         Hierarchical log-likelihood
L(·; ·)      Likelihood with notation L(parameters; data)
f_θ(y)       Density function for y having parameters θ
θ            A generic parameter indicating any fixed effect to be estimated
φ            Dispersion component for the mean model
λ            Dispersion component for the random effects
g(·)         Link function for the linear predictor
r(·)         Link function for random effects
η            Linear predictor in a GLM
µ            Expectation of y
s            Linearized working response in IWLS
V            Marginal variance matrix used in linear mixed models
V(·)         GLM variance function
I(·)         Information matrix
p_d          Estimated number of parameters
T            Transpose (used as a superscript)
δ            Augmented effect vector
γ            Regression coefficient for dispersion
Preface

Since the first paper on hierarchical generalized linear models (HGLMs)
in 1996, interest in the topic grew to produce a monograph in 2006 (Lee,
Nelder, and Pawitan, 2006). Ten years later this rather advanced monograph
has been developed into a second edition (Lee, Nelder, and Pawitan, 2017)
and two separate books, one on survival analysis (Ha, Jeong, and Lee, 2017)
and this book, which shows how wide and deep the subject is. We have seen
a need to write a short monograph as a guide for both students and
researchers in different fields to help them grasp the basic ideas about
how to model and how to make inferences using the h-likelihood. With data
examples, we illustrate how to analyze various kinds of data using R. This
book is aimed primarily toward senior undergraduates and first-year
graduates, especially those searching for a bridge between Bayesian and
frequentist statistics.
We are convinced that the h-likelihood can be of great practical use in
data analysis and have therefore developed R packages to enhance the
use of h-likelihood methods. This book aims to demonstrate its merits.
The book includes several chapters divided into three parts. The first 5
chapters present various examples of data analysis using HGLM classes
of models followed by the h-likelihood theory. For the examples in
Chapters 2–5, the R code is presented after the examples in each chapter.
Most of these examples use the dhglm package, which is a very flexible
package. Since there are numerous options, the code might seem technical
at first sight, but we introduce it using a few simple examples. The
details on how to use the dhglm package are found in Chapter 6, where we
explain the code through additional examples.
In Chapters 6–9, the R packages dhglm, mdhglm, frailtyHL and jointdhglm
are introduced. We explain how to use these packages by using example
data sets, and the R code is given within the main text. The dhglm
package fits several classes of models including: generalized linear
models (GLMs), joint GLMs, GLMs with random effects (known as HGLMs) and
HGLMs including models for the dispersion parameters, including double
HGLMs (DHGLMs) introduced later. The mdhglm package fits multivariate
DHGLMs where the response variables follow different distributions, and
can also fit factor and structural equation models. The frailtyHL package
is used for survival analysis using frailty models, which are an
extension of Cox’s proportional hazards model to allow random effects.
The jointdhglm package allows joint models for HGLMs and survival time
and competing risk models. In Chapter 10, we introduce variable selection
methods via random-effect models. Furthermore, in Chapter 10 we study
random-effect models with discrete random effects, show that hypothesis
testing can be expressed in terms of prediction of discrete random
effects (e.g., null or alternative), and show how the h-likelihood gives
a general extension of the likelihood-ratio test to multiple testing.
Model-checking techniques and model-selection tools based on the
h-likelihood modeling approach add further insight to the data analysis.
It is an advantage to have studied linear mixed models and GLMs before
reading this book. Nevertheless, GLMs are briefly introduced in Chapter
2 together with a short review of GLM theory, for a reader who wishes
to freshen up on the topic. The majority of data sets used in the book
are available at URL
http://cran.r-project.org/package=mdhglm
for the R package mdhglm (Lee et al., 2016b). Several different examples
are presented throughout the book, while the longitudinal epilepsy data
presented by Thall and Vail (1990) is an example dataset used recurrently
from Chapter 1 to Chapter 6, allowing the reader to follow the model
development from a basic GLM to a more advanced DHGLM.
We are grateful to Prof. Emmanuel Lesaffre, Dr. Smart Sarpong, Dr.
Ildo Ha, Dr. Moudud Alam, Dr. Xia Shen, Mr. Jengseop Han, Mr. Daehan
Kim, Mr. Hyunseong Park and the late Dr. Marek Molas for their
numerous useful comments and suggestions.

Youngjo Lee, Lars Rönnegård and Maengseok Noh


Seoul, Dalarna, and Busan
CHAPTER 1

Introduction

The objective of statistical inference is to draw conclusions about the
study population following the sampling of observations. Different study
problems involve specific sampling techniques and a statistical model to
describe the analyzed situation. In this book we present HGLMs, a class
of statistical models that allow flexible modeling of data from a wide
range of applications, and we describe the theory for their inferences.
Further, we present the methods of statistical testing based on HGLMs
and prediction problems which can be tackled by this class. We also
present extensions of classical HGLMs in various ways with a focus on
dispersion modeling.
The advantage of the HGLM framework for specific statistical problems
will be shown through examples including data and R code. The statistical
inference and estimation are based on the hierarchical likelihood
(h-likelihood) of Lee and Nelder (1996). The h-likelihood approach is
different from both classical frequentist and Bayesian methods, but at
the same time unites the two (Figure 1.5) because it includes inference
of both fixed and random unknowns. An advantage compared to classical
frequentist methods is that inference is possible for unobservables, such
as random effects, and consequently subject-specific predictions can be
made. Once a statistical model is decided for the analysis of the data at
hand, the likelihood leads to a way of making statistical inferences.
This book covers statistical models and likelihood-based inferences for
various problems.
Lee and Nelder (1996) introduced inference for models including
unobservable random variables, which include future outcomes, missing
data, latent variables, factors, potential outcomes, etc. There are three
major benefits of using the h-likelihood:
i) we can develop computationally fast algorithms for fitting advanced
models,
ii) we can make inferences for unobservables and thereby make predictions
of future outcomes from models including unobservables, and
iii) we can use model-checking tools for linear regression and generalized
linear models (GLMs), making assumptions in all parts of an HGLM
checkable.
The marginal likelihood is used for inference on the fixed effects in
both the classical frequentist and h-likelihood approaches, but it
involves multiple integration over the random effects, which is most
often not feasible to compute. In such cases the h-likelihood approach
uses the adjusted profile h-likelihood, a Laplace approximation of the
marginal likelihood. Because the random effects are integrated out in a
marginal likelihood, the classical frequentist method does not allow
any direct inference about random effects.
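As a hedged illustration of the Laplace approximation mentioned above, the following base R sketch considers a single cluster of Poisson counts sharing one normal random effect, where the integral over the random effect is one-dimensional and can still be evaluated numerically for comparison. The toy data y, the values of beta and lambda, and all object names are assumptions made for illustration; they are not taken from the book or from any package, and the sketch is not the adjusted profile h-likelihood algorithm itself.

```r
# Toy data for one cluster (hypothetical values, for illustration only)
y      <- c(3, 5, 2, 4)
beta   <- 1      # fixed intercept on the log scale (treated as known here)
lambda <- 0.25   # variance of the random effect v (treated as known here)

# Log joint density of (y, v): Poisson log-density of y given v
# plus the normal log-density of v. Vectorized over v for integrate().
h <- function(v) {
  sapply(v, function(vi) {
    sum(dpois(y, lambda = exp(beta + vi), log = TRUE)) +
      dnorm(vi, mean = 0, sd = sqrt(lambda), log = TRUE)
  })
}

# Log marginal likelihood by one-dimensional numerical integration
log_marginal <- log(integrate(function(v) exp(h(v)), -Inf, Inf)$value)

# Laplace approximation: maximize h over v, then correct with the curvature
v_hat     <- optimize(h, interval = c(-5, 5), maximum = TRUE)$maximum
curvature <- length(y) * exp(beta + v_hat) + 1 / lambda  # minus 2nd derivative of h at v_hat
log_laplace <- h(v_hat) - 0.5 * log(curvature / (2 * pi))

print(c(numerical = log_marginal, laplace = log_laplace))
```

With many clusters or multi-dimensional random effects the numerical integral quickly becomes impractical, whereas the maximization and curvature correction still apply, which is the motivation for using an approximation of this kind.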
Bayesians assume priors for parameters, and for inference they often rely
on Markov Chain Monte Carlo (MCMC) computations (Lesaffre and
Lawson, 2012). The h-likelihood allows complex models to be fitted by
maximizing likelihoods for fixed unknown parameters. So for a person
who does not wish to express prior beliefs, there are both philosophical
and computational advantages of using HGLMs. However, this book does
not focus on the philosophical advantages of using the h-likelihood but
rather on the practical advantages for applied users who have a
reasonable statistical background in linear models and GLMs and want to
enhance their data analysis skills for more general types of data.
In this chapter we introduce a few examples to show the strength of
the h-likelihood approach. Table 1.1 contains classes of models, the
chapters where they are first introduced, and available R packages.
Various packages have been developed to cover these model classes. From
Table 1.1, we see that the dhglm package has been developed to cover a
wide class of models from GLMs to DHGLMs. Detailed descriptions of the
dhglm package are presented in Chapter 6, where the full structure of
the model classes is described.
Figure 1.1 shows the evolution of the model classes presented in this
book together with their acronyms. Figure 1.1 also shows the
building-block structure of the h-likelihood; once you have captured the
ideas at one level you can go to a deeper level of modeling. This book
aims to show how complicated statistical model classes can be built by
combining interconnected GLMs and augmented GLMs, and how inference can
be made in the single framework of the h-likelihood. For readers who want
a more detailed description of theories and algorithms on HGLMs and
on survival analysis, we suggest the monographs by Lee, Nelder, and
Pawitan (2017) and Ha, Jeong, and Lee (2017). This book shows how to
analyze examples in these two books using available R packages, and we
also include new examples.
Figure 1.1 A map describing the development of HGLMs. The boxes in the
map, from the simplest to the most general model classes, are:

Linear model (Ch 2)
Linear mixed model (LMM, Ch 3)
Generalized linear model (GLM, Ch 2)
Factor analysis (Ch 7)
Generalized linear mixed model (GLMM, Ch 4-5): generalized linear model
    including Gaussian random effects
Joint GLM (Ch 4): generalized linear model including dispersion model
    with fixed effects
Multiple testing (Ch 10)
Structural Equation Models (SEM, Ch 7)
Hierarchical GLM (HGLM, Ch 2-5): generalized linear model including
    Gaussian and/or non-Gaussian random effects; dispersion can be
    modeled using fixed effects
HGLMs with correlated random effects (Ch 5-6): including spatial,
    temporal correlations, splines, GAM
Frailty HGLM (Ch 8): HGLMs for survival analysis including competing
    risk models
Variable selection (Ch 10): ridge regression, LASSO, and extensions
Double HGLM (DHGLM, Ch 6): HGLM including dispersion model with both
    fixed and random effects
Multivariate DHGLM (MDHGLM, Ch 7): DHGLM including outcomes from
    several distributions
Double SEM (Ch 7)
Joint model (Ch 9)


Table 1.1 Model classes presented in the book including chapter number and available R packages

Model class                        R package       Developer                     Chapter
GLM                                glm() function                                2
  Nelder and Wedderburn (1972)     dhglm           Lee and Noh (2016)
Joint GLM                          dhglm           Lee and Noh (2016)            4
  Nelder and Lee (1991)
GLMM                               dhglm           Lee and Noh (2016)            4, 5
  Breslow and Clayton (1993)       lme4            Bates and Maechler (2009)
                                   hglm            Alam et al. (2015)
HGLM                               dhglm           Lee and Noh (2016)            2, 3, 4, 5
  Lee and Nelder (1996)            hglm            Alam et al. (2015)
Spatial HGLM                       dhglm           Lee and Noh (2016)            5
  Lee and Nelder (2001b)           spaMM           Rousset et al. (2016)
DHGLM                              dhglm           Lee and Noh (2016)            6
  Lee and Nelder (2006);
  Noh and Lee (2017)
Multivariate DHGLM                 mdhglm          Lee, Molas, and Noh (2016b)   7
  Lee, Molas, and Noh (2016a)      mixAK           Komarek (2015)
  Lee, Nelder, and Pawitan (2017)  mmm             Asar and Ilk (2014)
Frailty HGLM                       frailtyHL       Ha et al. (2012)              8
  Ha, Lee, and Song (2001)         coxme           Therneau (2015)
                                   survival        Therneau and Lumley (2015)
Joint DHGLM                        jointdhglm      Ha, Lee, and Noh (2015)       8
  Henderson et al. (2000)          JM              Rizopoulos (2015)
1.1 Motivating examples

GLMs, introduced in Chapter 2, have been widely used in practice, based
on classical likelihood theory. However, these models cannot handle
repeatedly observed data; therefore various multivariate models have been
suggested. Among others, HGLMs are useful for the analysis of such
data. Using the following data examples from Lee, Nelder, and Pawitan
(2017), it is shown that the computationally intractable classical
marginal likelihood estimators can be obtained by using the Laplace
approximation based on the h-likelihood for a Poisson model with random
effects. With the h-likelihood approach, inferences can be made for
random effects and the analysis can be further developed by fitting a
wider range of models, which is not possible using only a classical
marginal likelihood.

1.1.1 Epilepsy data

Thall and Vail (1990) presented longitudinal data from a clinical trial of
59 epileptics, who were randomized to a new drug or a placebo (T=1 or
T=0). Baseline data were available at the start of the trial. The
covariates included the logarithm of the average number of epileptic
seizures recorded in the 8-week period preceding the trial (B), the
logarithm of age (A), and the clinic visit (V: a linear trend, coded
(-3,-1,1,3)). A multivariate response variable (y) consists of the
seizure counts during 2-week periods before each of four visits to the
clinic.
The data can be retrieved from the R package dhglm (see R code at the
end of the chapter). It is a good idea at this stage to have a look at
and get acquainted with the data. From the boxplot of the number of
seizures (Figure 1.2) there is no clear difference between the two
treatment groups. In Figure 1.3 the number of seizures per visit are
plotted for each patient, where the lines in this spaghetti plot indicate
longitudinal patient effects. Investigate the data further before running
the models below.
Figure 1.2 Boxplot of the logarithm of seizure counts (new drug = 1, placebo
= 0). (x-axis: Treatment; y-axis: log(Seizure counts + 1).)

Figure 1.3 Number of seizures per visit for each patient. There are 59 patients
and each line shows the logarithm of seizure counts + 1 for each patient.
(x-axis: Visit; y-axis: log(Seizure counts + 1).)

A simple first preliminary analysis could be to analyze the data (ignoring
that there are repeated measurements on each patient) with a GLM
having a Poisson distributed response using the R function glm. Let
$y_{ij}$ be the corresponding response variable for patient
$i\ (= 1, \ldots, 59)$ and visit $j\ (= 1, \ldots, 4)$. We consider a
Poisson GLM with log-link function modeled as
$$\log(\mu_{ij}) = \beta_0 + x_{Bi}\beta_B + x_{Ti}\beta_T + x_{Ai}\beta_A + x_{Vj}\beta_V + x_{Bi\,Ti}\beta_{BT}, \tag{1.1}$$
where $\beta_0$, $\beta_B$, $\beta_T$, $\beta_A$, $\beta_V$, and
$\beta_{BT}$ are fixed effects for the intercept, logarithm of number of
seizures preceding the trial (B), treatment (drug T=1, placebo T=0),
logarithm of patient age (A), order of visit (V) and interaction effect
(B:T). The glm function returns the following output

Call:
glm(formula=y~B+T+A+B:T+V,family=poisson(link=log),
data=epilepsy)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-5.0677  -1.4468  -0.2655   0.8164  11.1387

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  -2.79763    0.40729  -6.869 6.47e-12 ***
B             0.94952    0.04356  21.797  < 2e-16 ***
T            -1.34112    0.15674  -8.556  < 2e-16 ***
A             0.89705    0.11644   7.704 1.32e-14 ***
V            -0.02936    0.01014  -2.895  0.00379 **
B:T           0.56223    0.06350   8.855  < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 2521.8 on 235 degrees of freedom
Residual deviance: 869.9 on 230 degrees of freedom
AIC: 1647.9

Number of Fisher Scoring iterations: 5

The output indicates that there is severe over-dispersion, because the
residual deviance (869.9) greatly exceeds the degrees of freedom (230).
The over-dispersion could be due to the fact that we have not modeled
the repeated measurements or due to some unobserved effect affecting
the dispersion directly. We will discuss over-dispersion further in
Section 4.1.1.
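The over-dispersion check described above can be sketched in base R on simulated data. The simulation below is an assumption made for illustration (it does not use the epilepsy data): counts are generated with a patient-level random effect, the random effect is then ignored in a Poisson GLM, and the residual deviance and Pearson statistic are compared with the residual degrees of freedom. All object names and parameter values are hypothetical.

```r
set.seed(123)

# Simulated repeated counts: 59 patients with 4 visits each (hypothetical setup)
n_patients <- 59
n_visits   <- 4
patient    <- rep(seq_len(n_patients), each = n_visits)

x <- rnorm(n_patients)[patient]              # a patient-level covariate
v <- rnorm(n_patients, sd = 0.7)[patient]    # patient random effect (not modeled below)
y <- rpois(length(patient), lambda = exp(1.5 + 0.5 * x + v))

# Poisson GLM that ignores the repeated measurements
fit <- glm(y ~ x, family = poisson(link = "log"))

# Ratios near 1 are expected under a correctly specified Poisson model;
# values well above 1 point to over-dispersion
deviance_ratio <- deviance(fit) / df.residual(fit)
pearson_ratio  <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

print(c(deviance_ratio = deviance_ratio, pearson_ratio = pearson_ratio))
```

Because the ignored patient effect inflates the variance of the counts beyond the Poisson mean, both ratios should come out well above 1 in this setup, mirroring the pattern in the glm output for the epilepsy data.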
There are repeated observations on each patient and therefore including
patient as a random effect in a GLMM is necessary. For $y_{ij}$ given
random effects $v_i$, we consider a Poisson distribution as the
conditional distribution:
$$E(y_{ij} \mid v_i) = \mu_{ij} \quad \text{and} \quad \operatorname{var}(y_{ij} \mid v_i) = \mu_{ij},$$
with log-link function modeled as
$$\log(\mu_{ij}) = \beta_0 + x_{Bi}\beta_B + x_{Ti}\beta_T + x_{Ai}\beta_A + x_{Vj}\beta_V + x_{Bi\,Ti}\beta_{BT} + v_i, \tag{1.2}$$
where $v_i \sim N(0, \lambda)$.
Likelihood inferences are possible in a classical frequentist framework
using the glmer function in the R package lme4.

Family: poisson ( log )
Formula: y ~ B + T + A + B:T + V + (1 | patient)
Data: epilepsy

   AIC     BIC  logLik deviance df.resid
1345.3  1369.5  -665.6   1331.3      229

Scaled residuals:
    Min      1Q  Median      3Q     Max
-3.2832 -0.8875 -0.0842  0.6415  7.2819

Random effects:
Groups  Name        Variance Std.Dev.
patient (Intercept) 0.2515   0.5015
Number of obs: 236, groups: patient, 59

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.37817    1.17746  -1.170  0.24181
B            0.88442    0.13075   6.764 1.34e-11 ***
T           -0.93291    0.39883  -2.339  0.01933 *
A            0.48450    0.34584   1.401  0.16123
V           -0.02936    0.01009  -2.910  0.00362 **
B:T          0.33827    0.20247   1.671  0.09477 .

The output gives a variance for the random patient effect equal to 0.25.
In GLMMs the random effects are always assumed normally distributed.
The same model can be fitted within the h-likelihood framework using
the R package hglm, which gives the output (essential parts of the output
shown):

Call:
hglm2.formula(meanmodel=y~B+T+A+B:T+V+(1|patient),
data=epilepsy,family=poisson(link = log), fix.disp = 1)

----------
MEAN MODEL
----------

Summary of the fixed effects estimates:

            Estimate Std. Error t-value Pr(>|t|)
(Intercept) -1.29517    1.22155  -1.060  0.29040
B            0.87197    0.13590   6.416 1.13e-09 ***
T           -0.91685    0.41292  -2.220  0.02760 *
A            0.47184    0.35878   1.315  0.19008
V           -0.02936    0.01014  -2.895  0.00425 **
B:T          0.33137    0.21015   1.577  0.11654
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Note: P-values are based on 186 degrees of freedom

----------------
DISPERSION MODEL
----------------

Dispersion parameter for the random effects:
[1] 0.2747

The output gives a variance for the random patient effect equal to 0.27,
similar to the estimate from the lme4 package. In Chapter 4, we compare
similarities and differences between HGLM estimates and lme4. Unlike with
the lme4 package, it is also possible to add non-Gaussian random effects
to further model the over-dispersion. Moreover, with marginal-likelihood
inferences alone, subject-specific inferences cannot be made. For
subject-specific inferences, we need to estimate the random patient
effects $v_i$ via the h-likelihood.
With the dhglm package it is possible to allow different distributions
for different random effects. For example, in the previous GLMM, a gamma
distributed random effect can be included for each observation, which
gives a conditional distribution of the response that can be shown to
be negative binomial, while patient effects are modeled as normally
distributed:
$$E(y_{ij} \mid v_i, v_{ij}) = \mu_{ij} \quad \text{and} \quad \operatorname{var}(y_{ij} \mid v_i, v_{ij}) = \mu_{ij},$$
with log-link function modeled as
$$\log(\mu_{ij}) = \beta_0 + x_{Bi}\beta_B + x_{Ti}\beta_T + x_{Ai}\beta_A + x_{Vj}\beta_V + x_{Bi\,Ti}\beta_{BT} + v_i + v_{ij}, \tag{1.3}$$
where $v_i \sim N(0, \lambda_1)$, $u_{ij} = \exp(v_{ij}) \sim G(\lambda_2)$
and $G(\lambda_2)$ is a gamma distribution with $E(u_{ij}) = 1$ and
$\operatorname{var}(u_{ij}) = \lambda_2$. Then, this model is equivalent
to the negative binomial HGLM such that the conditional distribution of
$y_{ij} \mid v_i$ is the negative binomial distribution with the
probability function
$$\binom{y_{ij} + 1/\lambda_2 - 1}{1/\lambda_2 - 1} \left(\frac{\lambda_2}{1 + \mu^*_{ij}\lambda_2}\right)^{y_{ij}} \left(\frac{1}{1 + \mu^*_{ij}\lambda_2}\right)^{1/\lambda_2} \left(\mu^*_{ij}\right)^{y_{ij}},$$
where $\mu^*_{ij} = E(y_{ij} \mid v_i) = \mu_{0ij} u_i$ and
$u_i = \exp(v_i)$. Fitting this model with the hglm package gives the
output (essential parts shown):

Call:
hglm2.formula(meanmodel=y~B+T+A+B:T+V+(1|id)+
(1|patient),data=epilepsy,family=poisson(link=log),
rand.family=list(Gamma(link=log),gaussian()),
fix.disp = 1)

----------
MEAN MODEL
----------

Summary of the fixed effects estimates:

            Estimate Std. Error t-value Pr(>|t|)
(Intercept) -1.27757    1.21959  -1.048   0.2971
B            0.87262    0.13560   6.435 3.01e-09 ***
T           -0.91389    0.41202  -2.218   0.0285 *
A            0.46674    0.35827   1.303   0.1953
V           -0.02652    0.01633  -1.624   0.1072
B:T          0.32983    0.20966   1.573   0.1184
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Note: P-values are based on 114 degrees of freedom

----------------
DISPERSION MODEL
----------------

Dispersion parameter for the random effects:
[1] 0.1258 0.2430

The variance component for the random patient effect is now 0.24, with
the additional saturated random effects picking up some of the
over-dispersion. By adding gamma saturated random effects we can model
over-dispersion (extra-Poisson variation). The example shows how modeling
with GLMMs can be further extended using HGLMs. This can be viewed
as an extension of Poisson GLMMs to negative-binomial GLMMs, where
repeated observations on each patient follow the negative-binomial
distribution rather than the Poisson distribution.
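The link between a gamma random effect and the negative-binomial distribution can be checked numerically with base R. The sketch below assumes a single conditional mean mu and a gamma variance lambda2 (both hypothetical values) and compares the Poisson-gamma mixture probabilities, obtained by numerical integration, with dnbinom using size 1/lambda2 and mean mu.

```r
mu      <- 3     # conditional mean given the patient effect (hypothetical value)
lambda2 <- 0.5   # variance of the gamma random effect u, with E(u) = 1

# Probability of a count y under the mixture: Poisson(mu * u) averaged over
# u ~ Gamma(shape = 1 / lambda2, scale = lambda2), so E(u) = 1, var(u) = lambda2
mixture_pmf <- function(y) {
  integrate(function(u) {
    dpois(y, lambda = mu * u) * dgamma(u, shape = 1 / lambda2, scale = lambda2)
  }, lower = 0, upper = Inf)$value
}

y_values <- 0:10
mixture  <- sapply(y_values, mixture_pmf)
negbin   <- dnbinom(y_values, size = 1 / lambda2, mu = mu)

# Largest discrepancy between the two sets of probabilities
max_diff <- max(abs(mixture - negbin))
print(round(cbind(y = y_values, mixture = mixture, negbin = negbin), 5))
print(max_diff)

# Implied variance mu + lambda2 * mu^2 exceeds the Poisson variance mu
nb_variance <- mu + lambda2 * mu^2
```

The two columns should agree up to numerical integration error, and the implied variance mu + lambda2 * mu^2 shows how the extra gamma effect captures extra-Poisson variation.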
Dispersion modeling is an important, but often challenging task in
statistics. Even for simple models without random effects, iterative
algorithms are needed to compute maximum likelihood (ML) estimates. An
example is the heteroscedastic linear model
$$y \sim N(X\beta, \exp(X_d \beta_d)),$$
which requires the restricted maximum likelihood (REML) for unbiased
estimation of fixed effects in the variance, $\beta_d$. Not surprisingly,
estimation becomes even more challenging for models that include random
effects, especially correlated random effects, but with double
hierarchical generalized linear models (DHGLM) the problem becomes rather
straightforward (Lee and Nelder, 2006; Lee, Nelder, and Pawitan, 2017).

Figure 1.4 Distributions of observed litter sizes and the number of observations
per sow. (Two panels: "Histogram of Litter Size", x-axis: Litter Size;
"Histogram of Observations per Sow", x-axis: Observation per Sow.)
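To illustrate why the heteroscedastic linear model above needs an iterative algorithm, here is a minimal base R sketch on simulated data. It alternates between a weighted least squares fit for the mean and a gamma GLM with log link for the squared residuals. The leverage adjustment used here (dividing the squared residuals by 1 - q and using prior weights (1 - q)/2) is a commonly used device for REML-type estimation in joint GLMs; treating it as the book's exact algorithm would be an assumption, and all data and object names are hypothetical.

```r
set.seed(1)

# Simulated data with mean 1 + 2x and log-variance -1 + 1.5 * xd (hypothetical)
n  <- 2000
x  <- runif(n)
xd <- runif(n)
y  <- 1 + 2 * x + rnorm(n, sd = sqrt(exp(-1 + 1.5 * xd)))

w <- rep(1, n)   # start from a homoscedastic fit
for (iter in 1:20) {
  # Mean model: weighted least squares with weights 1 / variance
  mean_fit <- lm(y ~ x, weights = w)

  # Dispersion model: gamma GLM on leverage-adjusted squared residuals
  q      <- hatvalues(mean_fit)
  d_star <- residuals(mean_fit)^2 / (1 - q)
  disp_fit <- glm(d_star ~ xd, family = Gamma(link = "log"),
                  weights = (1 - q) / 2)

  # Update the weights from the fitted variances
  w <- 1 / fitted(disp_fit)
}

print(list(mean = coef(mean_fit), dispersion = coef(disp_fit)))
```

A fixed number of iterations keeps the sketch short; a practical implementation would instead stop when the coefficient estimates change by less than a tolerance.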

1.1.2 An animal breeding study

An important field of application for random effect estimation is animal
breeding, where the animals are ranked by their genetic potential given
by the estimated random effects in a linear mixed model
$$y = X\beta + Za + e.$$
The outcome $y$ can for instance be litter size in pigs or milk yield in
dairy cows, and the random effects $a \sim N(0, \sigma_a^2 A)$ have a
correlation matrix $A$ computed from pedigree information (see e.g.,
Chapter 17 of Pawitan (2001)). The ranking is based on the best linear
unbiased predictor (BLUP) $\hat{a}$, referred to as “estimated breeding
values” (for the mean), and by selecting animals with high estimated
breeding values the mean of the response variable is increased for each
generation of selection.
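The ranking by BLUP described above can be sketched in base R by solving Henderson's mixed model equations for the model y = Xβ + Za + e. This is a swapped-in textbook technique rather than the book's own code. The sketch assumes known variance components and, for simplicity, unrelated animals (A equal to the identity matrix), whereas a real application would build A from the pedigree; the data and all object names are simulated and hypothetical.

```r
set.seed(2)

# Simulated records: 30 animals with 5 observations each (hypothetical setup)
m        <- 30
n_per    <- 5
sigma_a2 <- 1    # variance of the random effects (treated as known)
sigma_e2 <- 2    # residual variance (treated as known)

X <- matrix(1, nrow = m * n_per, ncol = 1)               # intercept only
Z <- kronecker(diag(m), matrix(1, nrow = n_per, ncol = 1))
A <- diag(m)                                             # assumption: unrelated animals

a <- rnorm(m, sd = sqrt(sigma_a2))                       # true random effects
y <- as.vector(10 + Z %*% a + rnorm(m * n_per, sd = sqrt(sigma_e2)))

# Henderson's mixed model equations
k   <- sigma_e2 / sigma_a2
lhs <- rbind(cbind(crossprod(X),    crossprod(X, Z)),
             cbind(crossprod(Z, X), crossprod(Z) + k * solve(A)))
rhs <- rbind(crossprod(X, y), crossprod(Z, y))
sol <- solve(lhs, rhs)

beta_hat <- sol[1]     # estimated fixed intercept
a_hat    <- sol[-1]    # BLUPs, the "estimated breeding values"

# Rank animals from highest to lowest estimated breeding value
ranking <- order(a_hat, decreasing = TRUE)
print(head(data.frame(animal = ranking, a_hat = a_hat[ranking])))
```

In practice the variance components are estimated rather than known, and A is large and sparse, which is where the sparse matrix techniques mentioned later in this section matter.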

The model traditionally also assumes homoscedastic residuals,
$$e \sim N(0, \sigma_e^2 I),$$
but there is a concern from the animal breeding industry that the
residual variance might increase with selection. Therefore, a model
including a heterogeneous residual variance has been suggested (Sorensen
and Waagepetersen, 2003) with residuals distributed as
$$e \sim N(0, \exp(X_d \beta_d + Z_d a_d)). \tag{1.4}$$
Here $a_d \sim N(0, \sigma_d^2 A)$, and $\hat{a}_d$ are estimated
breeding values for the variance. The uniformity of a trait can be
increased in a population by selecting animals with low estimated
breeding values for the variance. Hence, it is desirable to have animals
with large $a$ but small $a_d$. To implement this we often estimate
$\operatorname{cor}(a, a_d)$. If it is negative, we can select animals
with large $a$ to reduce $a_d$, and estimation of $a_d$ can be ignored.
However, if the correlation is positive we need to select specific
animals that have large $\hat{a}$ but small $\hat{a}_d$. Thus, we need
estimation methods for both $a$ and $a_d$.
By implementing a DHGLM using sparse matrix techniques (Rönnegård
et al., 2010; Felleki et al., 2012) the estimation for this model becomes
very fast even for a large number of observations and a large number
of levels in the random effects. In Felleki et al. (2012), the model was
fitted on data from 4,149 related sows having in total over 10,000
observations on litter size; see Figure 1.4. The computation time was
reduced from days using MCMC to a few minutes using DHGLM; several
papers have been devoted to the computational issues of the MCMC
approach to this problem (Waagepetersen et al., 2008; Ibanez et al.,
2010). In Rönnegård et al. (2013), a DHGLM was also used to estimate
variance components and breeding values in a very large dataset for
177,411 related Swedish Holstein cows having a total of 1,693,154
observations on milk yield. Hence, the h-likelihood framework opens up
completely new possibilities for analysis of large data.

A possible future Bayesian alternative for this kind of application might
be the use of Integrated Nested Laplace Approximations (INLA) (Rue
et al., 2009) for fast deterministic computation of posterior distributions.
At the time of writing, however, this possibility has not been implemented
in the INLA software (www.r-inla.org) because of the complexity of
extending its computations to include advanced dispersion modeling,
which further highlights the simplicity and power of the h-likelihood
approach.

1.2 Regarding criticisms of the h-likelihood

Likelihood is used in both the frequentist and Bayesian worlds, and it has
been the central concept in statistical modeling and inference for almost
a century. However, classical likelihood and frequentist methods cannot
make inferences about random unknowns, whereas Bayesian methods do not
make inferences about fixed unknowns. The h-likelihood aims to allow
inferences for both fixed and random unknowns and could thereby cover
both worlds (Figure 1.5).
The concept of the h-likelihood has received criticism since the first
publication of Lee and Nelder (1996). The criticism was partly motivated
by the fact that the theory in Lee and Nelder (1996) was not fully
developed; these open questions have been clarified in later papers by Lee,
Nelder and co-workers. One of the main concerns in the 1990s was the
similarity with penalized quasi-likelihood (PQL) for GLMMs of Breslow
and Clayton (1993), which has large biases for binary data. PQL estimation
for GLMMs is implemented in, e.g., the function glmmPQL in the R package
MASS. Early h-likelihood methodology was criticized because of
non-ignorable biases in binary data. These biases can be eliminated by
improved approximations of the marginal likelihood through higher-order
Laplace approximations (Lee, Nelder, and Pawitan, 2017).
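
For readers who want to see what a PQL fit looks like in practice, the following is a minimal sketch using glmmPQL from the MASS package on the binary `bacteria` data that ship with MASS. The choice of dataset and model formula is an assumption made for illustration (it follows the example in the MASS documentation) and is not taken from this book.

```r
library(MASS)  # provides glmmPQL() and the bacteria data; needs nlme installed

# Binary response y (presence of bacteria), random intercept per child (ID)
fit_pql <- glmmPQL(y ~ trt + I(week > 2), random = ~ 1 | ID,
                   family = binomial, data = bacteria, verbose = FALSE)

# Fixed-effect estimates from the PQL fit
summary(fit_pql)$tTable
```

Because the response is binary with few observations per child, this is the kind of setting in which PQL estimates are known to be biased, so the output is best read as an illustration of the interface rather than as a recommended analysis.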
Meng (2009, 2010) established Bartlett-like identities for h-likelihood.
That is, the score for parameters and unobservables has zero expectation,
and the variance of the score is the expected negative Hessian under
easily verifiable conditions. However, Meng noted difficulties in
inferences about unobservables: neither the consistency nor the asymptotic
normality that holds for parameter estimation generally holds for
unobservables. Thus, Meng (2009, 2010) conjectured that an attempt to make
probability statements about unobservables without using a prior would be
in vain. Paik et al. (2015) studied the summarizability of h-likelihood
estimators, and Lee and Kim (2016) showed how to make probability
statements about unobservables in general without assuming a prior, as
we shall see.
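
As a rough sketch of what the Bartlett-like identities state, suppose $h$ denotes the h-likelihood on the log scale and let $\xi = (\theta, v)$ stack the fixed parameters $\theta$ and the unobservables $v$. This notation is an assumption introduced here for illustration; the formal definition of $h$ is given in Chapter 2. With expectations taken over the joint distribution of the data and the unobservables, and under the regularity conditions referred to above, the two identities read:

```latex
\mathrm{E}\!\left[\frac{\partial h}{\partial \xi}\right] = 0,
\qquad
\mathrm{Var}\!\left[\frac{\partial h}{\partial \xi}\right]
  = \mathrm{E}\!\left[-\frac{\partial^{2} h}{\partial \xi\,\partial \xi^{\top}}\right].
```

These mirror the first and second Bartlett identities of classical likelihood, with the score now including derivatives with respect to the unobservables.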
The h-likelihood approach is a genuine approach based on the extended
likelihood principle. This likelihood principle is mathematical theory, so
there should be no controversy about its validity. However, it does not tell
us how to use the extended likelihood for statistical inference. Another
important question, which has been asked and answered, is: for which family
of models can joint maximization of the h-likelihood be applied?
This is a legitimate question, since there are numerous examples where
joint maximization of an extended likelihood containing both fixed
effects β and random effects v gives nonsense estimates (see Lee and Nelder
(2009)). Such examples use the extended likelihood for joint maximization
of both β and v. If the h-likelihood h, defined in Chapter 2, is
jointly maximized for estimating both β and v, such nonsense estimates
disappear. However, consistent estimates for the fixed effects can only be
guaranteed for a rather limited class of models, including linear mixed
models and Poisson HGLMs having gamma random effects on a log scale.
Thus, as long as the marginal likelihood is used to estimate β and the
h-likelihood h is maximized to estimate the random effects, these examples
do not give contradictory results.

In close connection to the development of the h-likelihood, terminology
has been used in which a number of different likelihoods are
referred to. Some are objective functions for approximate estimation of
GLMs and GLMMs, e.g., quasi-likelihood and extended quasi-likelihood,
whereas some are used to explain the connection to classical frequentist
inference and Bayesian inference, e.g., marginal likelihood and predictive
probability. Other terms are joint likelihood, extended likelihood, and
adjusted profile likelihood. For initiated statisticians acquainted with
GLM and mixed model terminology, these terms make sense. For an
uninitiated student or researcher, the h-likelihood might seem simply
another addition to this long list of likelihoods, but the central part
that the h-likelihood plays in statistics is presented in the book. In later
chapters we also show that the h-likelihood is the fundamental likelihood
from which the marginal and REML likelihoods, and predictive probabilities,
are derived.

Figure 1.5 The h-likelihood world.

Lee and Nelder have shown in a series of papers that the iterative
weighted least squares (IWLS) algorithm for GLMs can be extended to a
general class of models including HGLMs. The algorithm is computationally
efficient and therefore potentially very useful in statistical applications,
allowing analysis of more and more complex models (Figure 1.1). For linear
mixed models, there is nothing controversial about this algorithm because
it can be shown to give BLUP. Neither is it controversial for GLMs with
random effects in general, because the adjusted profile h-likelihoods
(defined in Chapter 2) are simply approximations of marginal and restricted
likelihoods for estimating fixed effects and variance components, and as
such are easily acceptable for a frequentist. The h-likelihood is
proportional to a Bayesian posterior distribution for models including
random effects only with a flat prior, and as such is not controversial for
a Bayesian statistician. The concept of predictive probability can easily be
accepted both by Bayesians and frequentists (Lee and Kim, 2016). It allows
both Bayesian credible interval and frequentist confidence interval
interpretations. Consequently, the h-likelihood approach attempts to combine
the two worlds of frequentist and Bayesian statistics, which might be
controversial to some. The aim of this book, however, is to put these
controversies aside and to highlight the computational and inferential
advantages that the h-likelihood method can give.
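
To make the IWLS idea concrete, here is a minimal base R sketch of the algorithm for an ordinary Poisson GLM with log link, checked against glm(). It covers only the plain GLM case, not the HGLM extension discussed above; the function name `iwls_poisson` and the simulated data are hypothetical and introduced purely for illustration.

```r
# IWLS for a Poisson GLM with log link (illustrative sketch, base R only)
iwls_poisson <- function(X, y, tol = 1e-10, max_iter = 50) {
  beta <- rep(0, ncol(X))
  for (iter in seq_len(max_iter)) {
    eta <- drop(X %*% beta)        # linear predictor
    mu  <- exp(eta)                # inverse of the log link
    z   <- eta + (y - mu) / mu     # working (adjusted) response
    w   <- mu                      # working weights for Poisson with log link
    # Weighted least squares step: solve (X'WX) beta = X'Wz
    beta_new <- drop(solve(crossprod(X, w * X), crossprod(X, w * z)))
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  beta
}

# Simulated data (assumed values, for illustration only)
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rpois(n, lambda = exp(0.5 + 0.3 * x))
X <- cbind(1, x)

beta_iwls <- unname(iwls_poisson(X, y))
beta_glm  <- unname(coef(glm(y ~ x, family = poisson(link = "log"))))
cbind(beta_iwls, beta_glm)  # the two sets of estimates should agree closely
```

Each iteration is a weighted least squares fit to a working response, which is why the same machinery can be reused when the linear predictor is augmented with random effects.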

1.3 R code

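# Poisson GLM for the epilepsy data (loaded from the dhglm package)
library(dhglm)
data(epilepsy)
model1 <- glm(y ~ B + T + A + B:T + V, family = poisson(link = log),
              data = epilepsy)
summary(model1)

# Poisson GLMM: same fixed effects plus a random intercept per patient
library(lme4)
model2 <- glmer(y ~ B + T + A + B:T + V + (1 | patient),
                family = poisson(link = log), data = epilepsy)
summary(model2)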