Download Complete SQL and NoSQL Databases: Modeling, Languages, Security and Architectures for Big Data Management Michael Kaufmann PDF for All Chapters
Download Complete SQL and NoSQL Databases: Modeling, Languages, Security and Architectures for Big Data Management Michael Kaufmann PDF for All Chapters
com
https://ebookmeta.com/product/sql-and-nosql-databases-
modeling-languages-security-and-architectures-for-big-data-
management-michael-kaufmann/
OR CLICK BUTTON
DOWNLOAD NOW
https://ebookmeta.com/product/python-data-persistence-with-sql-and-
nosql-databases-1st-edition-lathkar/
ebookmeta.com
https://ebookmeta.com/product/nosql-and-sql-data-modeling-bringing-
together-data-semantics-and-software-first-edition-hills/
ebookmeta.com
https://ebookmeta.com/product/the-divorce-from-hell-margie-majors-
middle-aged-vampire-slayer-2-2nd-edition-dewylde-saranna/
ebookmeta.com
Optical Communications in the 5G Era 1st Edition Xiang Liu
https://ebookmeta.com/product/optical-communications-in-
the-5g-era-1st-edition-xiang-liu/
ebookmeta.com
https://ebookmeta.com/product/the-atheist-manifesto-2nd-edition-
christopher-hitchens/
ebookmeta.com
https://ebookmeta.com/product/monster-girl-safari-3-1st-edition-
roland-carlsson/
ebookmeta.com
Loveless Osemanverse 10 1st Edition Alice Oseman
https://ebookmeta.com/product/loveless-osemanverse-10-1st-edition-
alice-oseman/
ebookmeta.com
Michael Kaufmann
Andreas Meier
Second Edition
Michael Kaufmann Andreas Meier
Informatik Institute of Informatics
Hochschule Luzern Universität Fribourg
Rotkreuz, Switzerland Fribourg, Switzerland
# The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland
AG 2023
The first edition of this book was published by Springer Vieweg in 2019
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or
dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Foreword
The term database has long since become part of people’s everyday vocabulary, for
managers and clerks as well as students of most subjects. They use it to describe a
logically organized collection of electronically stored data that can be directly
searched and viewed. However, they are generally more than happy to leave the
whys and hows of its inner workings to the experts.
Users of databases are rarely aware of the immaterial and concrete business
values contained in any individual database. This applies as much to a car importer’s
spare parts inventory as the IT solution containing all customer depots at a bank or
the patient information system of a hospital. Yet failure of these systems, or even
cumulative errors, can threaten the very existence of the respective company or
institution. For that reason, it is important for a much larger audience than just the
“database specialists” to be well-informed about what is going on. Anyone involved
with databases should understand what these tools are effectively able to do and
which conditions must be created and maintained for them to do so.
Probably the most important aspect concerning databases involves (a) the dis-
tinction between their administration and the data stored in them (user data) and
(b) the economic magnitude of these two areas. Database administration consists of
various technical and administrative factors, from computers, database systems, and
additional storage to the experts setting up and maintaining all these components—
the aforementioned database specialists. It is crucial to keep in mind that the
administration is by far the smaller part of standard database operation, constituting
only about a quarter of the entire efforts.
Most of the work and expenses concerning databases lie in gathering,
maintaining, and utilizing the user data. This includes the labor costs for all
employees who enter data into the database, revise it, retrieve information from
the database, or create files using this information. In the above examples, this means
warehouse employees, bank tellers, or hospital personnel in a wide variety of
fields—usually for several years.
In order to be able to properly evaluate the importance of the tasks connected with
data maintenance and utilization on the one hand and database administration on the
other hand, it is vital to understand and internalize this difference in the effort
required for each of them. Database administration starts with the design of the
database, which already touches on many specialized topics such as determining the
v
vi Foreword
consistency checks for data manipulation or regulating data redundancies, which are
as undesirable on the logical level as they are essential on the storage level. The
development of database solutions is always targeted on their later use, so
ill-considered decisions in the development process may have a permanent impact
on everyday operations. Finding ideal solutions, such as the golden mean between
too strict and too flexible when determining consistency conditions, may require
some experience. Unduly strict conditions will interfere with regular operations,
while excessively lax rules will entail a need for repeated expensive data repairs.
To avoid such issues, it is invaluable for anyone concerned with database
development and operation, whether in management or as a database specialist, to
gain systematic insight into this field of computer sciences. The table of contents
gives an overview of the wide variety of topics covered in this book. The title already
shows that, in addition to an in-depth explanation of the field of conventional
databases (relational model, SQL), the book also provides highly educational infor-
mation about current advancements and related fields, the keywords being NoSQL
and Big Data. I am confident that the newest edition of this book will once again be
well-received by both students and professionals—its authors are quite familiar with
both groups.
It is remarkable how stable some concepts are in the field of databases. Information
technology is generally known to be subject to rapid development, bringing forth
new technologies at an unbelievable pace. However, this is only superficially the
case. Many aspects of computer science do not essentially change. This includes not
only the basics, such as the functional principles of universal computing machines,
processors, compilers, operating systems, databases and information systems, and
distributed systems, but also computer language technologies such as C, TCP/IP, or
HTML that are decades old but in many ways provide a stable fundament of the
global, earth-spanning information system known as the World Wide Web. Like-
wise, the SQL language (Structured Query Language) has been in use for almost five
decades and will remain so in the foreseeable future. The theory of relational
database systems was initiated in the 1970s by Codd (relation model) and
Chamberlin and Boyce (SEQUEL). However, these technologies have a major
impact on the practice of data management today. Especially, with the Big Data
revolution and the widespread use of data science methods for decision support,
relational databases and the use of SQL for data analysis are actually becoming more
important. Even though sophisticated statistics and machine learning are enhancing
the possibilities for knowledge extraction from data, many if not most data analyses
for decision support rely on descriptive statistics using SQL for grouped aggrega-
tion. SQL is also used in the field of Big Data with MapReduce technology. In this
sense, although SQL database technology is quite mature, it is more relevant today
than ever.
Nevertheless, the developments in the Big Data ecosystem brought new
technologies into the world of databases, to which we pay enough attention too.
Non-relational database technologies, which find more and more fields of applica-
tion under the generic term NoSQL, differ not only superficially from the classical
relational databases but also in the underlying principles. Relational databases were
developed in the twentieth century with the purpose of tightly organized, operational
forms of data management, which provided stability but limited flexibility. In
contrast, the NoSQL database movement emerged in the beginning of the new
century, focusing on horizontal partitioning, schema flexibility, and index-free
neighborhood with the goal of solving the Big Data problems of volume, variety,
and velocity, especially in Web-scale data systems. This has far-reaching
vii
viii Preface
1 Database Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Information Systems and Databases . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 SQL Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.1 Relational Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.2 Structured Query Language SQL . . . . . . . . . . . . . . . . . . . 6
1.2.3 Relational Database Management System . . . . . . . . . . . . . 8
1.3 Big Data and NoSQL Databases . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3.1 Big Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3.2 NoSQL Database Management System . . . . . . . . . . . . . . . 12
1.4 Graph Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.4.1 Graph-Based Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.4.2 Graph Query Language Cypher . . . . . . . . . . . . . . . . . . . . 15
1.5 Document Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.5.1 Document Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.5.2 Document-Oriented Database Language MQL . . . . . . . . . . 19
1.6 Organization of Data Management . . . . . . . . . . . . . . . . . . . . . . . . 21
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2 Database Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.1 From Requirements Analysis to Database . . . . . . . . . . . . . . . . . . . 25
2.2 The Entity-Relationship Model . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2.1 Entities and Relationships . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2.2 Associations and Association Types . . . . . . . . . . . . . . . . . 29
2.2.3 Generalization and Aggregation . . . . . . . . . . . . . . . . . . . . 32
2.3 Implementation in the Relational Model . . . . . . . . . . . . . . . . . . . . 35
2.3.1 Dependencies and Normal Forms . . . . . . . . . . . . . . . . . . . 35
2.3.2 Mapping Rules for Relational Databases . . . . . . . . . . . . . . 42
2.4 Implementation in the Graph Model . . . . . . . . . . . . . . . . . . . . . . . 47
2.4.1 Graph Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.4.2 Mapping Rules for Graph Databases . . . . . . . . . . . . . . . . . 51
2.5 Implementation in the Document Model . . . . . . . . . . . . . . . . . . . . 55
2.5.1 Document-Oriented Database Modeling . . . . . . . . . . . . . . 55
xi
xii Contents
Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
Database Management
1
The evolution from the industrial society via the service society to the information
and knowledge society is represented by the assessment of information as a factor in
production. The following characteristics distinguish information from material
goods:
These properties clearly show that digital goods (information, software, multime-
dia, etc.), i.e., data, are vastly different from material goods in both handling and
economic or legal evaluation. A good example is the loss in value that physical
products often experience when they are used—the shared use of information, on the
other hand, may increase its value. Another difference lies in the potentially high
production costs for material goods, while information can be multiplied easily and
at significantly lower costs (only computing power and storage medium). This
causes difficulties in determining property rights and ownership, even though digital
watermarks and other privacy and security measures are available.
Information System
Communication
Database System Application Software network
or WWW
User
User guidance
Database Dialog design Request
Management
Business logic
Data querying Response
Data manipulation
Database Access permissions
Storage Data protection
1.E.6. You may convert to and distribute this work in any binary,
compressed, marked up, nonproprietary or proprietary form,
including any word processing or hypertext form. However, if
you provide access to or distribute copies of a Project
Gutenberg™ work in a format other than “Plain Vanilla ASCII” or
other format used in the official version posted on the official
Project Gutenberg™ website (www.gutenberg.org), you must, at
no additional cost, fee or expense to the user, provide a copy, a
means of exporting a copy, or a means of obtaining a copy upon
request, of the work in its original “Plain Vanilla ASCII” or other
form. Any alternate format must include the full Project
Gutenberg™ License as specified in paragraph 1.E.1.
• You pay a royalty fee of 20% of the gross profits you derive from
the use of Project Gutenberg™ works calculated using the
method you already use to calculate your applicable taxes. The
fee is owed to the owner of the Project Gutenberg™ trademark,
but he has agreed to donate royalties under this paragraph to
the Project Gutenberg Literary Archive Foundation. Royalty
payments must be paid within 60 days following each date on
which you prepare (or are legally required to prepare) your
periodic tax returns. Royalty payments should be clearly marked
as such and sent to the Project Gutenberg Literary Archive
Foundation at the address specified in Section 4, “Information
about donations to the Project Gutenberg Literary Archive
Foundation.”
• You comply with all other terms of this agreement for free
distribution of Project Gutenberg™ works.
1.F.
Most people start at our website which has the main PG search
facility: www.gutenberg.org.