Multilinear Subspace Learning: Dimensionality Reduction of Multidimensional Data, 1st Edition, Haiping Lu
Chapman & Hall/CRC
Machine Learning & Pattern Recognition Series
Multilinear
Subspace Learning
Dimensionality Reduction of
Multidimensional Data
Haiping Lu
Konstantinos N. Plataniotis
Anastasios N. Venetsanopoulos
SERIES EDITORS
This series reflects the latest advances and applications in machine learning and pattern recognition through the publication of a broad range of reference works, textbooks, and handbooks. The inclusion of concrete examples, applications, and methods is highly encouraged. The scope of the series includes, but is not limited to, titles in the areas of machine learning, pattern recognition, computational intelligence, robotics, computational/statistical learning theory, natural language processing, computer vision, game AI, game theory, neural networks, computational neuroscience, and other relevant topics, such as machine learning applied to bioinformatics or cognitive science, which might be proposed by potential contributors.
PUBLISHED TITLES
MULTILINEAR SUBSPACE LEARNING: DIMENSIONALITY REDUCTION OF MULTIDIMENSIONAL DATA
Haiping Lu, Konstantinos N. Plataniotis, and Anastasios N. Venetsanopoulos
MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does not warrant the accuracy of the text or exercises in this book. This book's use or discussion of MATLAB® software or related products does not constitute endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular use of the MATLAB® software.
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2014 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and
publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
To Hongxia, Dailian, and Daizhen
To Ilda
Contents
Preface xxv
1 Introduction 1
1.1 Tensor Representation of Multidimensional Data . . . . . . . 2
1.2 Dimensionality Reduction via Subspace Learning . . . . . . . 5
1.3 Multilinear Mapping for Subspace Learning . . . . . . . . . . 9
1.4 Roadmap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Bibliography 231
Index 263
List of Figures
2.1  PCA, LDA, CCA, and PLS can all be viewed as solving the generalized eigenvalue problem Av = λBv.  39
8.1  Multilinear ICA, CCA, and PLS algorithms under the MSL framework.  165
8.2  The structured data in (a) are all mixtures generated from the source data in (b) with a multilinear mixing model.  167
8.3  Blind source separation by MMICA on synthetic data.  172
8.4  Schematic of multilinear CCA for paired (second-order) tensor datasets with two architectures.  178
B.1  Characteristics of the gait data from the USF Gait Challenge datasets version 1.7.  224
List of Algorithms
Acronyms and Symbols
Acronym Description
AdaBoost Adaptive boosting
ALS Alternating least squares
APP Alternating partial projections
BSS Blind source separation
CANDECOMP Canonical decomposition
CCA Canonical correlation analysis
CRR Correct recognition rate
DATER Discriminant analysis with tensor representation
EMP Elementary multilinear projection
FPT Full projection truncation
GPCA Generalized PCA
GTDA General tensor discriminant analysis
HOPLS Higher-order PLS
HOSVD Higher-order SVD
IC Independent component
ICA Independent component analysis
LDA Linear discriminant analysis
LSL Linear subspace learning
MCCA Multilinear CCA
MMICA Multilinear modewise ICA
MPCA Multilinear PCA
MSL Multilinear subspace learning
NIPALS Nonlinear iterative partial least squares
NMF Nonnegative matrix factorization
N -PLS N -way PLS
NTF Nonnegative tensor factorization
PARAFAC Parallel factors
PC Principal component
PCA Principal component analysis
PLS Partial least squares
R-UMLDA Regularized UMLDA
R-UMLDA-A Regularized UMLDA with aggregation
SMT Sequential mode truncation
SSS Small sample size
SVD Singular value decomposition
SVM Support vector machine
TR1DA Tensor rank-one discriminant analysis
TROD Tensor rank-one decomposition
TTP Tensor-to-tensor projection
TVP Tensor-to-vector projection
UMLDA Uncorrelated multilinear discriminant analysis
UMPCA Uncorrelated MPCA
VVP Vector-to-vector projection
Symbol Description
|A|          Determinant of matrix A
||·||_F      Frobenius norm
a or A       A scalar
a            A vector
A            A matrix
A            A tensor
Ā            The mean of the samples {Am}
A^T          Transpose of matrix A
A^{-1}       Inverse of matrix A
A(i1, i2)    Entry at the i1th row and i2th column of A
A(n)         Mode-n unfolding of tensor A
<A, B>       Scalar product of A and B
A ×n U       Mode-n product of A by U
a ◦ b        Outer (tensor) product of a and b
C            Number of classes
c            Class index
cm           Class label for the mth training sample; the mth element of the class vector c
δpq          Kronecker delta, δpq = 1 iff p = q and 0 otherwise
∂f(x)/∂x     Partial derivative of f with respect to x
gp           The pth coordinate vector
gpm          gp(m), the mth element of gp; see ymp
Hy           Number of selected features in MSL
I            An identity matrix
In           Mode-n dimension, or mode-n dimension for the first set in CCA/PLS extensions
Jn           Mode-n dimension for the second set in CCA/PLS extensions
K            Maximum number of iterations
k            Iteration step index
L            Number of training samples for each class
M            Number of training samples
m            Index of training sample
Mc           Number of training samples in class c
N            Order of a tensor, the number of indices/modes
n            Mode index of a tensor
P            Dimension of the output vector; also the number of EMPs in a TVP, or the number of latent factors in PLS
Pn           Mode-n dimension in the projected (output) space of a TTP
p            Index of the output vector; also the index of the EMP in a TVP, or the index of the latent factor in PLS
ΨB           Between-class scatter (measure)
ΨT           Total scatter (measure)
ΨW           Within-class scatter (measure)
Q            Ratio of total scatter kept in each mode
R            The set of real numbers
rm           The (TVP) projection of the mth sample
Preface
recent prevalence of big data applications has increased the demand for tech-
nical developments in this emerging research field. Thus, we found that there
is a strong need for a new book devoted to the fundamentals and foundations
of MSL, as well as MSL algorithms and their applications.
The primary goal of this book is to give a comprehensive introduction to
both theoretical and practical aspects of MSL for dimensionality reduction of
multidimensional data. It aims not only to detail recent advances in MSL,
but also to trace the history and explore future developments and emerging
applications. In particular, the emphasis is on the fundamental concepts and
system-level perspectives. This book provides a foundation upon which we can
build solutions for many of today’s most interesting and challenging problems
in big multidimensional data processing. Specifically, it includes the follow-
ing important topics in MSL: multilinear algebra fundamentals, multilinear
projections, MSL framework formulation, MSL optimality criterion construc-
tion, and MSL algorithms, solutions, and applications. The MSL framework
enables us to develop MSL algorithms systematically with various optimality
criteria. Under this unifying MSL framework, a number of MSL algorithms
are discussed and analyzed in detail. This book covers their applications in
various fields, and provides their pseudocodes and implementation tips to help
practitioners in further development, evaluation, and application. MATLAB® software for this book is available at
http://www.dsp.toronto.edu/~haiping/MSL.html
or
https://sites.google.com/site/tensormsl/
We will update these websites with open source software, possible corrections,
and any other useful materials to distribute after publication of this book.
The authors would like to thank the Edward S. Rogers Sr. Department
of Electrical and Computer Engineering, University of Toronto, for support-
ing this research work. H. Lu would like to thank the Institute for Infocomm
Research, the Agency for Science, Technology and Research (A*STAR), in
particular, How-Lung Eng, Cuntai Guan, Joo-Hwee Lim, and Yiqun Li, for
hosting him for almost four years. H. Lu would also like to thank the De-
partment of Computer Science, Hong Kong Baptist University, in particular,
Pong C. Yuen, and Jiming Liu for supporting this work. We thank Dimitrios
Hatzinakos, Raymond H. Kwong, and Emil M. Petriu for their help in our
work on this topic. We thank Kar-Ann Toh, Constantine Kotropoulos, An-
drew Teoh, and Althea Liang for reading through the draft and offering useful
comments and suggestions. We would also like to thank the many anonymous
reviewers of our papers who have given us tremendous help in advancing this
field. This book would not have been possible without the contributions from
other researchers in this field. In particular, we want to thank the following
researchers whose works have been particularly inspiring and helpful to us:
Lieven De Lathauwer, Tamara G. Kolda, Amnon Shashua, Jian Yang, Jieping
Ye, Xiaofei He, Deng Cai, Dacheng Tao, Shuicheng Yan, Dong Xu, and Xue-
long Li. We also thank editor Randi Cohen and the staff at CRC Press, Taylor
& Francis Group, for their support during the writing of this book.
Haiping Lu
Hong Kong
Konstantinos N. Plataniotis
Anastasios N. Venetsanopoulos
Toronto
Chapter 1
Introduction
With the advances in sensor, storage, and networking technologies, bigger and
bigger data are being generated daily in a wide range of applications. Figures
1.1 through 1.4 show some examples in computer vision, audio processing,
neuroscience, remote sensing, and data mining. To succeed in this era of big
data [Howe et al., 2008], it becomes more and more important to learn compact
features for efficient processing. Most big data are multidimensional and they
can often be represented as multidimensional arrays, which are referred to
as tensors in mathematics [Kolda and Bader, 2009]. Thus, tensor-based com-
putation is emerging, especially with the growth of mobile Internet [Lenhart
et al., 2010], cloud computing [Armbrust et al., 2010], and big data such as
the MapReduce model [Dean and Ghemawat, 2008; Kang et al., 2012].
This book deals with tensor-based learning of compact features from mul-
tidimensional data. In particular, we focus on multilinear subspace learning
(MSL) [Lu et al., 2011], a dimensionality reduction [Burges, 2010] method de-
veloped for tensor data. The objective of MSL is to learn a direct mapping from
high-dimensional tensor representations to low-dimensional vector/tensor rep-
resentations.
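The direct multilinear mapping can be made concrete with a small sketch. The book's accompanying software is in MATLAB; the NumPy version below is our own illustrative sketch, and the sizes and names in it are assumptions. It projects a third-order tensor to a much smaller tensor through one mode-n product per mode (a tensor-to-tensor projection), without any vectorization:

```python
import numpy as np

def mode_n_product(X, U, n):
    """Multiply tensor X by matrix U along mode n, i.e., X x_n U."""
    # Move mode n to the front, contract with U, then move it back.
    Xn = np.moveaxis(X, n, 0)                 # shape: (I_n, ...)
    Y = np.tensordot(U, Xn, axes=([1], [0]))  # shape: (P_n, ...)
    return np.moveaxis(Y, 0, n)

# A third-order data tensor, e.g., a small gray-level video clip
# (height x width x frames); sizes here are arbitrary for illustration.
X = np.random.rand(32, 32, 10)

# Tensor-to-tensor projection: one small factor matrix per mode
U1 = np.random.rand(4, 32)   # mode-1: 32 -> 4
U2 = np.random.rand(4, 32)   # mode-2: 32 -> 4
U3 = np.random.rand(2, 10)   # mode-3: 10 -> 2

Y = mode_n_product(mode_n_product(mode_n_product(X, U1, 0), U2, 1), U3, 2)
print(Y.shape)  # (4, 4, 2): 32*32*10 = 10240 entries reduced to 32
```

Note that each factor matrix acts on only one mode, so the parameters to learn number 4·32 + 4·32 + 2·10 rather than 10240·32 as in a direct vector-to-vector mapping.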
other data as well) can be represented as a third-order tensor where the first two modes are column and row, and the third mode indexes different features, so that the tensor serves as a feature combination/fusion scheme. For example,
local descriptors such as the Scale-Invariant Feature Transform (SIFT) [Lowe,
2004] and Histogram of Oriented Gradients (HOG) [Dalal and Triggs, 2005]
form a local descriptor tensor in [Han et al., 2012], which is shown to be more
efficient than the bag-of-feature (BOF) model [Sivic and Zisserman, 2003].
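As a minimal sketch of this feature combination/fusion idea (NumPy is our choice here, not the book's, and the specific feature channels are hypothetical), several per-pixel feature maps of one image can be stacked along a third mode to form a feature tensor:

```python
import numpy as np

# Hypothetical dense per-pixel feature maps for one image; the number and
# meaning of the channels are illustrative assumptions, not the book's
# exact descriptors.
h, w = 64, 48
feature_maps = [np.random.rand(h, w) for _ in range(4)]

# Third-order feature tensor: mode 1 = column, mode 2 = row,
# mode 3 indexes the different features.
T = np.stack(feature_maps, axis=2)
print(T.shape)  # (64, 48, 4)
```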
Local binary patterns [Ojala et al., 2002] on a Gaussian pyramid [Lindeberg,
1994] are employed to form feature tensors in [Ruiz-Hernandez et al., 2010a,b].
Gradient-based appearance cues are combined in a tensor form in [Wang et al.,
2011a], and wavelet transform [Antonini et al., 1992] and Gabor filters [Jain
and Farrokhnia, 1991] are used to generate higher-order tensors in [Li et al.,
2009a; Barnathan et al., 2010], and [Tao et al., 2007b], respectively. Figure 1.4
duction [Burges, 2010]. Here, we adopt the name most commonly known in the machine
learning and pattern recognition literature.
2001], linear discriminant analysis (LDA) [Duda et al., 2001], canonical cor-
relation analysis (CCA) [Hotelling, 1936], and partial least squares (PLS)
analysis [Wold et al., 2001]. To apply these linear subspace learning (LSL)
methods on tensor data of order higher than one, such as images and videos,
we have to reshape (vectorize) tensors into vectors first, that is, to convert
N-dimensional arrays (N > 1) to one-dimensional arrays, as depicted in Figure 1.6(a). Thus, LSL only partly alleviates the curse of dimensionality, while such reshaping (i.e., vectorization) has two fundamental limitations:
• Vectorization breaks the natural structure and correlation in the original
data, reduces redundancies and/or higher order dependencies present
in the original dataset, and loses potentially more compact or useful
representations that can be obtained in the original tensor forms. For
example, Figure 1.7 shows a 2D face image of size 32 × 32 and its corresponding reshaped vector with size 1024 × 1 on the same scale. From
the 2D image (matrix) representation above, we can tell it is a face.
However, from the 1D vector representation below, we cannot tell what
it is.
• For higher-order tensor data of large size such as video sequences, the
reshaped vectors are very high dimensional. Analysis of these vectors
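The first limitation can be seen directly in code. This NumPy sketch (an illustration of ours, assuming row-major layout; the 32 × 32 size follows the Figure 1.7 example in the text) shows how vectorization scatters vertically adjacent pixels and how quickly the reshaped dimension grows:

```python
import numpy as np

img = np.random.rand(32, 32)           # 2D face image
vec = img.reshape(-1)                  # reshaped 1024-dimensional vector
print(vec.shape)                       # (1024,)

# Row-major layout walks along rows, so two pixels that are vertical
# neighbors in the image end up 32 positions apart in the vector: the
# natural 2D neighborhood structure is no longer explicit.
assert vec[1] == img[0, 1]             # horizontal neighbor stays adjacent
assert vec[32] == img[1, 0]            # vertical neighbor is 32 apart

# For a modest third-order tensor (e.g., 128 x 88 x 20), the reshaped
# vector is already very high dimensional.
video = np.zeros((128, 88, 20))
print(video.reshape(-1).shape)         # (225280,)
```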
Therefore, there has been a surging interest in more effective and efficient
dimensionality reduction schemes for tensor data. Methods working directly
on tensor representations have emerged as a promising approach. When we
use tensor-based analysis, tensors are processed directly without vectoriza-
tion. Figure 1.8(b) shows tensor-based analysis of the same 3D object in Fig-
ure 1.8(a), leading to three covariance matrices of size 128 × 128, 88 × 88, and
20 × 20. The total size of these three covariance matrices will be only about
95.8KB, which is several orders of magnitude smaller than the size of the co-
variance matrix in vector-based processing (95.8KB/189GB≈ 4.8 × 10−7 ). In
other words, vector-based processing will need about 2 million times more
memory for the covariance matrix than that of tensor-based processing in this
case.
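The figures quoted above can be reproduced with a few lines. This sketch assumes single-precision (4-byte) storage per entry, which is the assumption under which the 95.8 KB and 189 GB numbers match:

```python
# Sizes from the 3D example in the text: a 128 x 88 x 20 object.
I1, I2, I3 = 128, 88, 20
bytes_per_entry = 4  # assumed single-precision (float32) storage

# Vector-based processing: one covariance matrix over the reshaped
# 225,280-dimensional vectors.
d = I1 * I2 * I3
vector_cov_bytes = d * d * bytes_per_entry

# Tensor-based processing: three mode-wise covariance matrices of sizes
# 128 x 128, 88 x 88, and 20 x 20.
tensor_cov_bytes = (I1**2 + I2**2 + I3**2) * bytes_per_entry

print(tensor_cov_bytes / 1024)               # about 95.8 KB
print(vector_cov_bytes / 1024**3)            # about 189 GB
print(tensor_cov_bytes / vector_cov_bytes)   # about 4.8e-7
```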
FIGURE 1.9: The field of matrix computations seems to “kick up” its level of
thinking about every 20 years. (Adapted from the report of the NSF Workshop
on “Future Directions in Tensor-Based Computation and Modeling,” 2009
[NSF, 2009].)
• It can handle big tensor data more efficiently with computations in much
lower dimensions than linear methods.
1.4 Roadmap
This book aims to provide a systematic and unified treatment of multilinear
subspace learning. As Hamming [1986] suggested in his inspiring talk “You
and Your Research,” we want to provide the essence of this specific field.
This first chapter has provided the motivation for MSL and a brief introduc-
tion to it. We started with the tensor representation of multidimensional data
and the need for dimensionality reduction to deal with the curse of dimen-
sionality. Then, we discussed how conventional linear subspace learning for
dimensionality reduction becomes inadequate for big tensor data as it needs
to reshape tensors into high-dimensional vectors. From tensor-level computa-
tional thinking, MSL was next introduced to learn compact representations
through direct multilinear mapping of tensors to alleviate those difficulties
encountered by their linear counterparts.
The rest of this book consists of two parts:
Part I covers the fundamentals and foundations of MSL. Chapter
2 reviews five basic LSL algorithms: PCA, ICA, LDA, CCA, and PLS. It
1.5 Summary
• Most big data are multidimensional.
Fundamentals and
Foundations
Chapter 2
Linear Subspace Learning for
Dimensionality Reduction
S_T = \sum_{m=1}^{M} (x_m - \bar{x})(x_m - \bar{x})^T, \qquad (2.1)
1 One of the rationales behind capturing the maximal variation is that noise can be
assumed to be uniformly spread so directions of high variation tend to have a higher signal-
to-noise ratio [Bie et al., 2005].
\bar{x} = \frac{1}{M} \sum_{m=1}^{M} x_m. \qquad (2.2)
In PCA, we want to find a linear mapping of the centered samples {xm − x̄}
to a low-dimensional representation {ym ∈ RP }, P < I, through a projection
matrix U ∈ RI×P as
ym = UT (xm − x̄). (2.3)
The objective of PCA is to maximize the variation captured by the projected
samples (or extracted features) {ym }. The projection matrix U can be consid-
ered to consist of P projection directions {u1 , u2 , ..., uP }. Then the projection
in each direction is
ymp = uTp (xm − x̄), (2.4)
where p = 1, ..., P. Let us define a coordinate vector gp ∈ RM for each p, where the mth element of gp is defined as

g_p(m) = y_{mp} = u_p^T (x_m - \bar{x}). \qquad (2.5)

PCA requires that each direction maximizes u_p^T S_T u_p, and that gp and gq are uncorrelated for q ≠ p, p, q = 1, ..., P.
S_T = \sum_{m=1}^{M} x_m x_m^T = X X^T, \qquad (2.6)
u_1^T u_1 = 1, since the maximum will not be achieved for finite u_1. The objective then becomes

\tilde{u}_1 = \arg\max_{u_1} u_1^T S_T u_1 \quad \text{subject to} \quad u_1^T u_1 = 1. \qquad (2.7)
3 Please refer to Section 2.4 of [Jolliffe, 2002] for the case of repeated eigenvalues.
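Equations (2.1) through (2.7) can be exercised numerically. The sketch below is our own NumPy illustration (the book's reference code is MATLAB): it computes the total scatter of centered samples, takes the leading eigenvectors as projection directions, and checks that the resulting coordinate vectors g_p are mutually uncorrelated:

```python
import numpy as np

rng = np.random.default_rng(0)
M, I, P = 100, 10, 3                   # samples, input dim, output dim
X = rng.normal(size=(I, M))            # columns are the samples x_m

x_bar = X.mean(axis=1, keepdims=True)  # Eq. (2.2): sample mean
Xc = X - x_bar                         # centered samples x_m - x_bar
S_T = Xc @ Xc.T                        # Eq. (2.1)/(2.6): total scatter

# Eq. (2.7): maximizers of u^T S_T u subject to u^T u = 1 are the leading
# eigenvectors of S_T; successive directions give uncorrelated features.
eigvals, eigvecs = np.linalg.eigh(S_T)  # eigh returns ascending order
U = eigvecs[:, ::-1][:, :P]             # top-P projection directions

Y = U.T @ Xc                            # Eq. (2.3): y_m = U^T (x_m - x_bar)

# Rows of Y are the coordinate vectors g_p; their Gram matrix is diagonal,
# i.e., the g_p are uncorrelated, as PCA requires.
G = Y @ Y.T
assert np.allclose(G, np.diag(np.diag(G)), atol=1e-6)
```

The diagonal of G holds the captured variation per direction, in decreasing order, matching the eigenvalues of S_T.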