Intelligent Systems Reference Library 242

Richi Nayak
Khanh Luong

Multi-aspect
Learning
Methods and Applications
Intelligent Systems Reference Library

Volume 242

Series Editors
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
Lakhmi C. Jain, KES International, Shoreham-by-Sea, UK
The aim of this series is to publish a Reference Library, including novel advances and developments in all aspects of Intelligent Systems in an easily accessible and well structured form. The series includes reference works, handbooks, compendia, textbooks, well-structured monographs, dictionaries, and encyclopedias. It contains well integrated knowledge and current information in the field of Intelligent Systems. The series covers the theory, applications, and design methods of Intelligent Systems. Virtually all disciplines such as engineering, computer science, avionics, business, e-commerce, environment, healthcare, physics and life science are included. The list of topics spans all the areas of modern intelligent systems such as: Ambient intelligence, Computational intelligence, Social intelligence, Computational neuroscience, Artificial life, Virtual society, Cognitive systems, DNA and immunity-based systems, e-Learning and teaching, Human-centred computing and Machine ethics, Intelligent control, Intelligent data analysis, Knowledge-based paradigms, Knowledge management, Intelligent agents, Intelligent decision making, Intelligent network security, Interactive entertainment, Learning paradigms, Recommender systems, Robotics and Mechatronics including human-machine teaming, Self-organizing and adaptive systems, Soft computing including Neural systems, Fuzzy systems, Evolutionary computing and the Fusion of these paradigms, Perception and Vision, Web intelligence and Multimedia.
Indexed by SCOPUS, DBLP, zbMATH, SCImago.
All books published in the series are submitted for consideration in Web of Science.
Richi Nayak · Khanh Luong

Multi-aspect Learning
Methods and Applications
Richi Nayak
Faculty of Science
School of Computer Science
Centre for Data Science
Queensland University of Technology
Brisbane, Australia

Khanh Luong
Faculty of Science
School of Computer Science
Centre for Data Science
Queensland University of Technology
Brisbane, Australia

ISSN 1868-4394 ISSN 1868-4408 (electronic)


Intelligent Systems Reference Library
ISBN 978-3-031-33559-4 ISBN 978-3-031-33560-0 (eBook)
https://doi.org/10.1007/978-3-031-33560-0

© Springer Nature Switzerland AG 2023

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Contents

1 Multi-aspect Data Learning: Overview, Challenges and Approaches
  1.1 Introduction
  1.2 Multi-type Relational Data
    1.2.1 Object Type, Objects
    1.2.2 Relationships
    1.2.3 Bi-type Data
  1.3 Multi-view Data
    1.3.1 Partial Multi-view Data
    1.3.2 Consensus and Complementary Information in Multi-view Data
  1.4 Relationship Between MTRD and Multi-view Data
  1.5 Real-World Multi-aspect Data Applications
    1.5.1 Text Mining
    1.5.2 Image Mining
    1.5.3 Bio-informatics
    1.5.4 Social-Network Mining
  1.6 Multi-aspect Data Clustering
    1.6.1 Why Design Customized Multi-aspect Clustering Methods?
    1.6.2 Multi-aspect Data Clustering: Approaches
  1.7 Chapter Conclusion
  References

2 Non-negative Matrix Factorization-Based Multi-aspect Data Clustering
  2.1 Introduction
  2.2 NMF Framework: Introduction
  2.3 NMF Framework: Basic Concepts and Definitions
    2.3.1 NMF Formulation on Traditional Data
    2.3.2 NMF-Based Clustering Process
    2.3.3 Constraints
    2.3.4 Benefits of NMF
  2.4 NMF-Based Clustering Methods on One-Aspect Data
  2.5 NMF-Based Clustering Methods on Multi-aspect Data
    2.5.1 NMF-Based Clustering Methods on Multi-view Data
    2.5.2 NMF-Based Clustering Methods on MTRD Data
  2.6 Chapter Conclusion
  References

3 NMF and Manifold Learning for Multi-aspect Data
  3.1 Introduction
  3.2 Introduction to Manifold Learning
  3.3 NMF and Manifold Learning-Based Clustering Methods Applied to Traditional One-Aspect Data
    3.3.1 Manifold Learning Formulation
    3.3.2 Manifold Learning Based NMF Methods of Traditional One-Aspect Data
    3.3.3 Challenges in Manifold Learning
  3.4 NMF and Manifold Learning-Based Clustering Methods Applied to Multi-aspect Data
    3.4.1 Learning the Manifold on Each Aspect of the Multi-aspect Data
    3.4.2 Learning the Accurate Manifold on Each Aspect
    3.4.3 Learning the Intrinsic Consensus Manifold on Multi-view Data
    3.4.4 Discussion: Manifold Learning Approaches for Multi-view Data
  3.5 Chapter Conclusion
  References

4 Subspace Learning for Multi-aspect Data
  4.1 Introduction
  4.2 Subspace Clustering: Basics
  4.3 One-Aspect Subspace Clustering
    4.3.1 Problem Definition
    4.3.2 Subspace-Based Clustering Methods
  4.4 Multi-aspect Subspace Clustering
    4.4.1 Multi-view Clustering Definition
    4.4.2 Multi-view Subspace Clustering Using Early Integration
    4.4.3 Multi-view Subspace Clustering Using Late Integration
    4.4.4 Multi-view Subspace Clustering Using Intermediate Integration
    4.4.5 Multi-view Subspace Clustering Using a Shared Unified Representation
  4.5 Discussion
  4.6 Chapter Conclusion
  References

5 Spectral Clustering on Multi-aspect Data
  5.1 Introduction
  5.2 Spectral Clustering on Traditional One-Aspect Data
    5.2.1 Fundamental Concepts
    5.2.2 Spectral Clustering Approach on One-Aspect Data
  5.3 Spectral Clustering Methods on Multi-aspect Data
    5.3.1 Multi-view Spectral Clustering Methods
    5.3.2 MTRD Spectral Clustering
  5.4 Chapter Conclusion
  References

6 Learning Consensus and Complementary Information for Multi-aspect Data Clustering
  6.1 Introduction
  6.2 Overview of Learning Consensus and Complementary Information
  6.3 Learning Consensus and Complementary Information Using NMF
    6.3.1 NMF-Based Methods Focused on Learning the Consensus Information
    6.3.2 NMF-Based Methods Focused on Enhancing the Complementary Information
    6.3.3 NMF-Based Methods Focused on Enhancing Consensus and Complementary Information Both
  6.4 Learning Consensus and Complementary Information Using Subspace
    6.4.1 Subspace-Based Methods Learning the Consensus Information
    6.4.2 Subspace-Based Methods Learning the Complementary Information
    6.4.3 Subspace-Based Methods Learning Both Consensus and Complementary Information
  6.5 Learning Consensus and Complementary Information Using Spectral Clustering
    6.5.1 Spectral Methods Learning Consensus and Complementary Information
  6.6 Summary of Constraints and Regularizations Designed for Learning the Consensus and Complementary Information
    6.6.1 For Learning the Consensus Information
    6.6.2 For Learning the Complementary Information
  6.7 Chapter Conclusion
  References

7 Deep Learning-Based Methods for Multi-aspect Data Clustering
  7.1 Introduction
  7.2 Autoencoder-Based Multi-view Data Clustering
    7.2.1 Introduction to Autoencoder
    7.2.2 AE-Based Clustering for One-View Data
    7.2.3 AE-Based Clustering for Multi-view Data
  7.3 GAN-Based Multi-view Data Clustering
    7.3.1 Introduction to GAN
    7.3.2 GAN-Based Clustering for One-View Data
    7.3.3 GAN-Based Clustering for Multi-view Data
  7.4 Deep Matrix Factorization-Based Multi-view Clustering
    7.4.1 Deep Matrix Factorization (DMF)-Based Framework Definition
    7.4.2 DMF-Based Clustering on One-View Data
    7.4.3 DMF-Based Clustering on Multi-view Data
    7.4.4 Remarks: Deep Semi-NMF and Deep-NMF
  7.5 Chapter Conclusion
  References

Chapter 1
Multi-aspect Data Learning: Overview,
Challenges and Approaches

Abstract Multi-aspect data, which represents information from multiple perspectives, is becoming increasingly common and important. This is because such data has the ability to incorporate diverse information, enabling machine learning algorithms to accurately learn the relationships within the data and provide informative outcomes. This chapter provides an overview of multi-aspect data and discusses the various ways these aspects can be represented, such as different types or views. Several examples of such datasets are presented. Additionally, the chapter briefly introduces the multi-aspect data clustering problem and explores common approaches for finding clustering solutions.

Keywords Multi-type relational data · Multi-view data · Multi-type relational data clustering · Multi-view data clustering

1.1 Introduction

Multi-aspect data represents information about entities/objects (such as customers, events, etc.) from various perspectives, encompassing multiple types of relationships or features. When a dataset contains multiple types of relationships to represent information, it is referred to as Multi-type Relational Data (MTRD). On the other hand, when a dataset represents information through multiple views, it is known as Multi-view data. This chapter provides an overview of multi-aspect data and the multi-aspect data clustering problem, with a focus on essential concepts. It analyzes the relationship between MTRD and multi-view data, highlighting the challenges associated with each dataset type.

In MTRD, there are various types of objects and multiple relationships among these interrelated objects. Each object type is defined by a set of features, and there are several instances that consist of different combinations of these features. Each instance is referred to as an entity or object, such as a customer or an event. Entities within the same object type exhibit intra-type relationships, while entities of different object types are linked by inter-type relationships.

© Springer Nature Switzerland AG 2023
R. Nayak and K. Luong, Multi-aspect Learning, Intelligent Systems Reference Library 242, https://doi.org/10.1007/978-3-031-33560-0_1

An example of MTRD is information presented on a social network like Twitter. It involves three types of objects: Users, Tweets and Terms. Objects within the “Users” type have intra-type relationships based on connections such as retweets, replies and mentions. Similarly, objects within the “Tweets” type have intra-type relationships based on representation, such as common topics or mentions of a common location, and so on. Additionally, objects across these object types share inter-type relationships, such as posting (between Users and Tweets), using (between Users and Terms), or containing (between Tweets and Terms), as illustrated in Fig. 1.1.

Another example of MTRD is a web search system that includes four object types: Webpages, Users, Queries and Words, along with different types of relationships. These relationships include issuing (between Users and Queries), viewing (between Users and Webpages), referring (between Queries and Webpages) and containing (between Queries and Words or between Webpages and Words), as illustrated in Fig. 1.2.
Fig. 1.1 An example of multi-type relational data with three object types: Users, Tweets and Terms and various relationships between them

Fig. 1.2 Another example of multi-type relational data with four object types: Users, Webpages, Queries and Words and various relationships between them (viewing, issuing, referencing, containing). In the figure, the relationships between objects of each type are encoded in an affinity matrix, e.g., the pairwise similarity matrix of users, and each inter-type relationship between two different object types is encoded in an inter-type relationship matrix, e.g., the webpages-words co-occurrence matrix

Fig. 1.3 Examples of multi-view data [72]. Panel b shows Image and Text views; panel d shows Edge, Fourier and Texture views

Examples of multi-view data are provided in Fig. 1.3, where data can be collected from multiple sources and each source can be considered a view. For example, each newspaper forms a view in Fig. 1.3a (e.g., news from BBC, Yahoo, etc.), or each source can come in a different form, such as text or image, as in Fig. 1.3b. The multilingual dataset shown in Fig. 1.3c is a typical multi-view dataset where each language corresponds to a view and each document is represented by its translations in the different languages. An image can be described by many different features such as Edge, Fourier or Texture, as in Fig. 1.3d. Each type of feature represents a unique view in the dataset.
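
The notion of a view described above can be made concrete with a small sketch. The example below is only illustrative: the view names follow the Edge, Fourier and Texture features mentioned in the text, while the numbers, the helper names `shared_sample_count` and `concatenate_views`, and the choice of a dictionary of row lists are assumptions made for this sketch, not a representation prescribed by the book.

```python
# Hypothetical toy data: three views of the same four images.
# Row i in every view describes the same image; feature counts differ per view.
views = {
    "edge":    [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.3]],
    "fourier": [[1.0, 0.0, 0.5], [0.9, 0.1, 0.4], [0.1, 1.0, 0.2], [0.0, 0.9, 0.3]],
    "texture": [[0.3], [0.4], [0.7], [0.6]],
}


def shared_sample_count(views):
    """Return the number of samples, checking that every view agrees on it."""
    counts = {name: len(rows) for name, rows in views.items()}
    if len(set(counts.values())) != 1:
        raise ValueError(f"views disagree on sample count: {counts}")
    return next(iter(counts.values()))


def concatenate_views(views):
    """Naive single-view baseline: join each sample's features across views."""
    n = shared_sample_count(views)
    return [[x for rows in views.values() for x in rows[i]] for i in range(n)]


n_samples = shared_sample_count(views)
flat = concatenate_views(views)
print(n_samples, len(flat[0]))
```

Concatenation is shown only as a baseline: it produces one ordinary feature matrix but no longer records which view each feature came from, whereas keeping the views separate, as in the dictionary, preserves that information for methods that treat views individually.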
With the advancements in data collection and storage technologies, multi-aspect data has become prevalent. Both MTRD and multi-view datasets share the characteristic of providing complementary information for the learning process, but they focus on different aspects, relationship types, or views. This book treats these two dataset types as multi-aspect data, where different aspects in multi-view data correspond to different views, and different object types along with their associated relationships constitute different aspects in MTRD data.

Due to the incorporation of rich information, customized machine learning algorithms are required to leverage multiple relationships and types present in multi-aspect data, resulting in informative outcomes for supervised or unsupervised learning tasks. This book primarily focuses on the unsupervised machine learning task of clustering, which aims to extract useful information from unlabelled data by identifying natural groupings based on similarities.

Clustering has been extensively studied in various fields, including data mining, text mining, image processing, web analysis, and bioinformatics. Clustering methods designed for traditional data, where samples are represented by a single type of feature or considered from a single view, need to be modified. Customized multi-aspect data clustering methods should accurately explore and identify the underlying structure while extracting useful and meaningful knowledge. Learning methods need to exploit the latent relatedness between samples and different feature objects in MTRD, as well as the latent properties of data in each view representation in multi-view data [47, 52, 73].

Fig. 1.4 Content structure of the chapter

Given the necessity of considering all available information to derive meaningful outcomes and the inherent complexity of multi-aspect data, clustering on MTRD and multi-view data remains a challenging problem that deserves careful attention. A plethora of MTRD and multi-view clustering methods have been developed based on concepts such as graph partitions (e.g., spectral clustering) [19, 38, 39, 47, 64], subspace learning [12, 28, 75, 80], and nonnegative matrix factorization (NMF) [33, 37, 73, 74, 94]. These approaches can identify underlying structures in multi-aspect data and have demonstrated improved performance compared to traditional clustering methods. Each approach has its own strengths and limitations, which will be detailed in this book for readers’ understanding.
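
To give a feel for one of the families named above, the following is a minimal sketch of plain NMF on a single non-negative matrix, using the widely known multiplicative update rules for the squared-error objective. It is an assumption-laden illustration rather than any of the multi-aspect methods cited here: the toy matrix `V`, the helper names, the iteration count and the rule of assigning each row to its largest factor are all hypothetical choices made for this sketch.

```python
import random


def matmul(A, B):
    """Plain matrix product of two lists of lists."""
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]


def transpose(A):
    """Transpose a list-of-lists matrix."""
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh))


def nmf(V, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate non-negative V (n x m) as W (n x k) times H (k x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H


# Hypothetical sample-by-feature matrix with two visible blocks.
V = [
    [5, 4, 0, 0],
    [4, 5, 0, 1],
    [0, 0, 5, 4],
    [0, 1, 4, 5],
]
W, H = nmf(V, k=2)
# Assumed clustering rule: each row joins the factor with the largest weight.
labels = [max(range(2), key=lambda c: row[c]) for row in W]
print(labels, round(frobenius_error(V, W, H), 3))
```

On a block-structured matrix like this one, the first two rows and the last two rows would be expected to land in different groups, although the numbering of the groups and the exact factors depend on the random initialization.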
The structure of the chapter is depicted in Fig. 1.4.

1.2 Multi-type Relational Data

An MTRD dataset comprises various types of objects and multiple relationships among these interconnected objects. Each aspect can be represented by an object type in the dataset, along with the associated relationships between the object types. For instance, the MTRD dataset shown in Fig. 1.5 has three aspects: (1) Webpages and the relationships between Webpages and other object types; (2) Hyperlinks and the relationships between Hyperlinks and other object types; and (3) Terms and the relationships between Terms and other object types.

Fig. 1.5 An example of MTRD with three object types: Webpages (w1 to w9), Terms (t1 to t9), Hyperlinks (h1 to h7). The intra-type relationships are represented as solid lines and inter-type relationships are represented as dotted lines

1.2.1 Object Type, Objects

An object type refers to a collection of objects of the same data type. For example, the
Webpages object type consists of several webpage objects or instances, denoted as
w1, w2, etc., in the MTRD web search system dataset presented in Fig. 1.5. MTRD
encompasses multiple object types. In the given example, there are three distinct
object types corresponding to three different aspects: Webpages, Hyperlinks, and
Terms. The Webpages object type is referred to as the sample object type, while
Hyperlinks and Terms are considered as feature object types.

1.2.2 Relationships
MTRD involves three types of relationships: inter-type, intra-type and association
relationships.

a. Inter-type relationship and Intra-type relationship

The inter-type relationship in MTRD describes the relationships between objects of two different object types. Meanwhile, the intra-type relationship models the relationships between objects of the same type. In the example depicted in Fig. 1.5, the inter-type relationships are the connections between objects of Webpages and Terms, between objects of Webpages and Hyperlinks, and between objects of Terms and Hyperlinks, indicated by dashed lines. On the other hand, the intra-type relationships are the connections between objects within Webpages, within Hyperlinks, or within Terms, represented by solid lines. Each inter-type or intra-type relationship can be encoded in a matrix, which becomes an input for the MTRD learning process.

Fig. 1.6 Association relationship
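
The matrix encoding described above can be sketched on toy data. In this hypothetical example, the inter-type matrix `R` between Webpages and Terms holds term counts, and the intra-type affinity matrix `A` among Webpages is taken to be the cosine similarity of the rows of `R`. Both choices, together with the page names and term lists, are assumptions for illustration; the text only states that each relationship is encoded as a matrix.

```python
import math

# Hypothetical toy data: the terms that occur in each webpage.
webpages = {
    "w1": ["data", "mining", "data"],
    "w2": ["data", "cluster"],
    "w3": ["football", "league"],
}
terms = sorted({t for words in webpages.values() for t in words})

# Inter-type matrix R (Webpages x Terms): R[i][j] counts term j in webpage i.
R = [[words.count(t) for t in terms] for words in webpages.values()]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Intra-type affinity matrix A (Webpages x Webpages), derived here from R.
A = [[cosine(ri, rj) for rj in R] for ri in R]

for name, row in zip(webpages, A):
    print(name, [round(x, 2) for x in row])
```

An intra-type matrix does not have to be derived from an inter-type one. Hyperlinks between pages or any other pairwise similarity could fill `A` instead; the derivation from `R` is used here only because it keeps the example self-contained.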

b. Association relationship

Intra-type and inter-type relationships capture the similarity between individual objects within the object types. The association relationship captures the relationships between groups of objects [47]. Identifying and utilizing the association relationship or the interactions between groups to learn a higher-order to lower-order mapping can enhance clustering performance. An illustration of association relationships between groups is shown in Fig. 1.6, which represents the relations between different groups of different object types. In the figure, the red clusters represent document groups and the green clusters represent term groups. An association relationship can exist between a document cluster and a term cluster, as indicated by the dashed line in the figure.
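
One simple way to picture a group-level relationship is to aggregate an inter-type matrix over given cluster assignments. The sketch below does this for a hypothetical document-term matrix `R` with fixed, hand-assigned document and term groups. It is an illustrative proxy under these assumptions and not the formulation used in [47]; the function name `association_matrix` and all values are invented for the example.

```python
# Hypothetical inter-type matrix R (4 documents x 3 terms) and cluster labels.
R = [
    [3, 1, 0],
    [2, 2, 0],
    [0, 0, 4],
    [0, 1, 5],
]
doc_cluster = [0, 0, 1, 1]   # document i belongs to document group doc_cluster[i]
term_cluster = [0, 0, 1]     # term j belongs to term group term_cluster[j]


def association_matrix(R, row_labels, col_labels):
    """Sum the entries of R that fall between each row group and column group."""
    n_row_groups = max(row_labels) + 1
    n_col_groups = max(col_labels) + 1
    S = [[0] * n_col_groups for _ in range(n_row_groups)]
    for i, row in enumerate(R):
        for j, value in enumerate(row):
            S[row_labels[i]][col_labels[j]] += value
    return S


S = association_matrix(R, doc_cluster, term_cluster)
print(S)  # [[8, 0], [1, 9]]
```

Here the large diagonal entries suggest that document group 0 is associated mainly with term group 0, and document group 1 with term group 1. With hard cluster indicator matrices for rows and columns, the same aggregation can be written as a product of the transposed row indicator, `R`, and the column indicator.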

1.2.3 Bi-type Data

A special case of MTRD is known as bi-type data, where the dataset contains only two
object types: the sample object type and the feature object type. The clustering task
on bi-type data is referred to as co-clustering or bi-clustering, where the clustering
task is performed simultaneously on the rows and columns of the input data matrix.
While the data samples are grouped based on their features, the features can be
clustered using their distributions across the data samples. There are various real-world applications that exhibit the duality between data samples and features, which
Discovering Diverse Content Through
Random Scribd Documents
*** END OF THE PROJECT GUTENBERG EBOOK LE GARDIEN
DU FEU ***

Updated editions will replace the previous one—the old editions


will be renamed.

Creating the works from print editions not protected by U.S.


copyright law means that no one owns a United States copyright
in these works, so the Foundation (and you!) can copy and
distribute it in the United States without permission and without
paying copyright royalties. Special rules, set forth in the General
Terms of Use part of this license, apply to copying and
distributing Project Gutenberg™ electronic works to protect the
PROJECT GUTENBERG™ concept and trademark. Project
Gutenberg is a registered trademark, and may not be used if
you charge for an eBook, except by following the terms of the
trademark license, including paying royalties for use of the
Project Gutenberg trademark. If you do not charge anything for
copies of this eBook, complying with the trademark license is
very easy. You may use this eBook for nearly any purpose such
as creation of derivative works, reports, performances and
research. Project Gutenberg eBooks may be modified and
printed and given away—you may do practically ANYTHING in
the United States with eBooks not protected by U.S. copyright
law. Redistribution is subject to the trademark license, especially
commercial redistribution.

START: FULL LICENSE


THE FULL PROJECT GUTENBERG LICENSE
PLEASE READ THIS BEFORE YOU DISTRIBUTE OR USE THIS WORK

To protect the Project Gutenberg™ mission of promoting the


free distribution of electronic works, by using or distributing this
work (or any other work associated in any way with the phrase
“Project Gutenberg”), you agree to comply with all the terms of
the Full Project Gutenberg™ License available with this file or
online at www.gutenberg.org/license.

Section 1. General Terms of Use and


Redistributing Project Gutenberg™
electronic works
1.A. By reading or using any part of this Project Gutenberg™
electronic work, you indicate that you have read, understand,
agree to and accept all the terms of this license and intellectual
property (trademark/copyright) agreement. If you do not agree to
abide by all the terms of this agreement, you must cease using
and return or destroy all copies of Project Gutenberg™
electronic works in your possession. If you paid a fee for
obtaining a copy of or access to a Project Gutenberg™
electronic work and you do not agree to be bound by the terms
of this agreement, you may obtain a refund from the person or
entity to whom you paid the fee as set forth in paragraph 1.E.8.

1.B. “Project Gutenberg” is a registered trademark. It may only
be used on or associated in any way with an electronic work by
people who agree to be bound by the terms of this agreement.
There are a few things that you can do with most Project
Gutenberg™ electronic works even without complying with the
full terms of this agreement. See paragraph 1.C below. There
are a lot of things you can do with Project Gutenberg™
electronic works if you follow the terms of this agreement and
help preserve free future access to Project Gutenberg™
electronic works. See paragraph 1.E below.

1.C. The Project Gutenberg Literary Archive Foundation (“the
Foundation” or PGLAF), owns a compilation copyright in the
collection of Project Gutenberg™ electronic works. Nearly all the
individual works in the collection are in the public domain in the
United States. If an individual work is unprotected by copyright
law in the United States and you are located in the United
States, we do not claim a right to prevent you from copying,
distributing, performing, displaying or creating derivative works
based on the work as long as all references to Project
Gutenberg are removed. Of course, we hope that you will
support the Project Gutenberg™ mission of promoting free
access to electronic works by freely sharing Project
Gutenberg™ works in compliance with the terms of this
agreement for keeping the Project Gutenberg™ name
associated with the work. You can easily comply with the terms
of this agreement by keeping this work in the same format with
its attached full Project Gutenberg™ License when you share it
without charge with others.

1.D. The copyright laws of the place where you are located also
govern what you can do with this work. Copyright laws in most
countries are in a constant state of change. If you are outside
the United States, check the laws of your country in addition to
the terms of this agreement before downloading, copying,
displaying, performing, distributing or creating derivative works
based on this work or any other Project Gutenberg™ work. The
Foundation makes no representations concerning the copyright
status of any work in any country other than the United States.

1.E. Unless you have removed all references to Project
Gutenberg:

1.E.1. The following sentence, with active links to, or other
immediate access to, the full Project Gutenberg™ License must
appear prominently whenever any copy of a Project
Gutenberg™ work (any work on which the phrase “Project
Gutenberg” appears, or with which the phrase “Project
Gutenberg” is associated) is accessed, displayed, performed,
viewed, copied or distributed:

This eBook is for the use of anyone anywhere in the United
States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it
away or re-use it under the terms of the Project Gutenberg
License included with this eBook or online at
www.gutenberg.org. If you are not located in the United
States, you will have to check the laws of the country where
you are located before using this eBook.

1.E.2. If an individual Project Gutenberg™ electronic work is
derived from texts not protected by U.S. copyright law (does not
contain a notice indicating that it is posted with permission of the
copyright holder), the work can be copied and distributed to
anyone in the United States without paying any fees or charges.
If you are redistributing or providing access to a work with the
phrase “Project Gutenberg” associated with or appearing on the
work, you must comply either with the requirements of
paragraphs 1.E.1 through 1.E.7 or obtain permission for the use
of the work and the Project Gutenberg™ trademark as set forth
in paragraphs 1.E.8 or 1.E.9.

1.E.3. If an individual Project Gutenberg™ electronic work is
posted with the permission of the copyright holder, your use and
distribution must comply with both paragraphs 1.E.1 through
1.E.7 and any additional terms imposed by the copyright holder.
Additional terms will be linked to the Project Gutenberg™
License for all works posted with the permission of the copyright
holder found at the beginning of this work.

1.E.4. Do not unlink or detach or remove the full Project
Gutenberg™ License terms from this work, or any files
containing a part of this work or any other work associated with
Project Gutenberg™.

1.E.5. Do not copy, display, perform, distribute or redistribute
this electronic work, or any part of this electronic work, without
prominently displaying the sentence set forth in paragraph 1.E.1
with active links or immediate access to the full terms of the
Project Gutenberg™ License.

1.E.6. You may convert to and distribute this work in any binary,
compressed, marked up, nonproprietary or proprietary form,
including any word processing or hypertext form. However, if
you provide access to or distribute copies of a Project
Gutenberg™ work in a format other than “Plain Vanilla ASCII” or
other format used in the official version posted on the official
Project Gutenberg™ website (www.gutenberg.org), you must, at
no additional cost, fee or expense to the user, provide a copy, a
means of exporting a copy, or a means of obtaining a copy upon
request, of the work in its original “Plain Vanilla ASCII” or other
form. Any alternate format must include the full Project
Gutenberg™ License as specified in paragraph 1.E.1.

1.E.7. Do not charge a fee for access to, viewing, displaying,
performing, copying or distributing any Project Gutenberg™
works unless you comply with paragraph 1.E.8 or 1.E.9.

1.E.8. You may charge a reasonable fee for copies of or
providing access to or distributing Project Gutenberg™
electronic works provided that:

• You pay a royalty fee of 20% of the gross profits you derive from
the use of Project Gutenberg™ works calculated using the
method you already use to calculate your applicable taxes. The
fee is owed to the owner of the Project Gutenberg™ trademark,
but he has agreed to donate royalties under this paragraph to
the Project Gutenberg Literary Archive Foundation. Royalty
payments must be paid within 60 days following each date on
which you prepare (or are legally required to prepare) your
periodic tax returns. Royalty payments should be clearly marked
as such and sent to the Project Gutenberg Literary Archive
Foundation at the address specified in Section 4, “Information
about donations to the Project Gutenberg Literary Archive
Foundation.”

• You provide a full refund of any money paid by a user who
notifies you in writing (or by e-mail) within 30 days of receipt that
s/he does not agree to the terms of the full Project Gutenberg™
License. You must require such a user to return or destroy all
copies of the works possessed in a physical medium and
discontinue all use of and all access to other copies of Project
Gutenberg™ works.

• You provide, in accordance with paragraph 1.F.3, a full refund of
any money paid for a work or a replacement copy, if a defect in
the electronic work is discovered and reported to you within 90
days of receipt of the work.

• You comply with all other terms of this agreement for free
distribution of Project Gutenberg™ works.

1.E.9. If you wish to charge a fee or distribute a Project
Gutenberg™ electronic work or group of works on different
terms than are set forth in this agreement, you must obtain
permission in writing from the Project Gutenberg Literary
Archive Foundation, the manager of the Project Gutenberg™
trademark. Contact the Foundation as set forth in Section 3
below.

1.F.

1.F.1. Project Gutenberg volunteers and employees expend
considerable effort to identify, do copyright research on,
transcribe and proofread works not protected by U.S. copyright
law in creating the Project Gutenberg™ collection. Despite
these efforts, Project Gutenberg™ electronic works, and the
medium on which they may be stored, may contain “Defects,”
such as, but not limited to, incomplete, inaccurate or corrupt
data, transcription errors, a copyright or other intellectual
property infringement, a defective or damaged disk or other
medium, a computer virus, or computer codes that damage or
cannot be read by your equipment.

1.F.2. LIMITED WARRANTY, DISCLAIMER OF DAMAGES -
Except for the “Right of Replacement or Refund” described in
paragraph 1.F.3, the Project Gutenberg Literary Archive
Foundation, the owner of the Project Gutenberg™ trademark,
and any other party distributing a Project Gutenberg™ electronic
work under this agreement, disclaim all liability to you for
damages, costs and expenses, including legal fees. YOU
AGREE THAT YOU HAVE NO REMEDIES FOR NEGLIGENCE,
STRICT LIABILITY, BREACH OF WARRANTY OR BREACH
OF CONTRACT EXCEPT THOSE PROVIDED IN PARAGRAPH
1.F.3. YOU AGREE THAT THE FOUNDATION, THE
TRADEMARK OWNER, AND ANY DISTRIBUTOR UNDER
THIS AGREEMENT WILL NOT BE LIABLE TO YOU FOR
ACTUAL, DIRECT, INDIRECT, CONSEQUENTIAL, PUNITIVE
OR INCIDENTAL DAMAGES EVEN IF YOU GIVE NOTICE OF
THE POSSIBILITY OF SUCH DAMAGE.

1.F.3. LIMITED RIGHT OF REPLACEMENT OR REFUND - If
you discover a defect in this electronic work within 90 days of
receiving it, you can receive a refund of the money (if any) you
paid for it by sending a written explanation to the person you
received the work from. If you received the work on a physical
medium, you must return the medium with your written
explanation. The person or entity that provided you with the
defective work may elect to provide a replacement copy in lieu
of a refund. If you received the work electronically, the person or
entity providing it to you may choose to give you a second
opportunity to receive the work electronically in lieu of a refund.
If the second copy is also defective, you may demand a refund
in writing without further opportunities to fix the problem.

1.F.4. Except for the limited right of replacement or refund set
forth in paragraph 1.F.3, this work is provided to you ‘AS-IS’,
WITH NO OTHER WARRANTIES OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR
ANY PURPOSE.

1.F.5. Some states do not allow disclaimers of certain implied
warranties or the exclusion or limitation of certain types of
damages. If any disclaimer or limitation set forth in this
agreement violates the law of the state applicable to this
agreement, the agreement shall be interpreted to make the
maximum disclaimer or limitation permitted by the applicable
state law. The invalidity or unenforceability of any provision of
this agreement shall not void the remaining provisions.

1.F.6. INDEMNITY - You agree to indemnify and hold the
Foundation, the trademark owner, any agent or employee of the
Foundation, anyone providing copies of Project Gutenberg™
electronic works in accordance with this agreement, and any
volunteers associated with the production, promotion and
distribution of Project Gutenberg™ electronic works, harmless
from all liability, costs and expenses, including legal fees, that
arise directly or indirectly from any of the following which you do
or cause to occur: (a) distribution of this or any Project
Gutenberg™ work, (b) alteration, modification, or additions or
deletions to any Project Gutenberg™ work, and (c) any Defect
you cause.

Section 2. Information about the Mission of
Project Gutenberg™
Project Gutenberg™ is synonymous with the free distribution of
electronic works in formats readable by the widest variety of
computers including obsolete, old, middle-aged and new
computers. It exists because of the efforts of hundreds of
volunteers and donations from people in all walks of life.

Volunteers and financial support to provide volunteers with the
assistance they need are critical to reaching Project
Gutenberg™’s goals and ensuring that the Project Gutenberg™
collection will remain freely available for generations to come. In
2001, the Project Gutenberg Literary Archive Foundation was
created to provide a secure and permanent future for Project
Gutenberg™ and future generations. To learn more about the
Project Gutenberg Literary Archive Foundation and how your
efforts and donations can help, see Sections 3 and 4 and the
Foundation information page at www.gutenberg.org.

Section 3. Information about the Project
Gutenberg Literary Archive Foundation
The Project Gutenberg Literary Archive Foundation is a non-profit
501(c)(3) educational corporation organized under the
laws of the state of Mississippi and granted tax exempt status by
the Internal Revenue Service. The Foundation’s EIN or federal
tax identification number is 64-6221541. Contributions to the
Project Gutenberg Literary Archive Foundation are tax
deductible to the full extent permitted by U.S. federal laws and
your state’s laws.

The Foundation’s business office is located at 809 North 1500
West, Salt Lake City, UT 84116, (801) 596-1887. Email contact
links and up to date contact information can be found at the
Foundation’s website and official page at
www.gutenberg.org/contact

Section 4. Information about Donations to
the Project Gutenberg Literary Archive
Foundation
Project Gutenberg™ depends upon and cannot survive without
widespread public support and donations to carry out its mission
of increasing the number of public domain and licensed works
that can be freely distributed in machine-readable form
accessible by the widest array of equipment including outdated
equipment. Many small donations ($1 to $5,000) are particularly
important to maintaining tax exempt status with the IRS.

The Foundation is committed to complying with the laws
regulating charities and charitable donations in all 50 states of
the United States. Compliance requirements are not uniform
and it takes a considerable effort, much paperwork and many
fees to meet and keep up with these requirements. We do not
solicit donations in locations where we have not received written
confirmation of compliance. To SEND DONATIONS or
determine the status of compliance for any particular state visit
www.gutenberg.org/donate.

While we cannot and do not solicit contributions from states
where we have not met the solicitation requirements, we know
of no prohibition against accepting unsolicited donations from
donors in such states who approach us with offers to donate.

International donations are gratefully accepted, but we cannot
make any statements concerning tax treatment of donations
received from outside the United States. U.S. laws alone swamp
our small staff.

Please check the Project Gutenberg web pages for current
donation methods and addresses. Donations are accepted in a
number of other ways including checks, online payments and
credit card donations. To donate, please visit:
www.gutenberg.org/donate.

Section 5. General Information About Project
Gutenberg™ electronic works
Professor Michael S. Hart was the originator of the Project
Gutenberg™ concept of a library of electronic works that could
be freely shared with anyone. For forty years, he produced and
distributed Project Gutenberg™ eBooks with only a loose
network of volunteer support.

Project Gutenberg™ eBooks are often created from several
printed editions, all of which are confirmed as not protected by
copyright in the U.S. unless a copyright notice is included. Thus,
we do not necessarily keep eBooks in compliance with any
particular paper edition.

Most people start at our website which has the main PG search
facility: www.gutenberg.org.

This website includes information about Project Gutenberg™,
including how to make donations to the Project Gutenberg
Literary Archive Foundation, how to help produce our new
eBooks, and how to subscribe to our email newsletter to hear
about new eBooks.
