Bill Smyth
McMaster University
Curtin University of Technology

Pearson Education Limited
Edinburgh Gate
Harlow
Essex CM20 2JE
England
and Associated Companies throughout the world
Visit us on the World Wide Web at: www.pearsoneduc.com

First published 2003
© Pearson Education Limited 2003
The right of William F. Smyth to be identified as author of this work
has been asserted by him in accordance with the Copyright, Designs and
Patents Act 1988.
All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted in any form or by any
means, electronic, mechanical, photocopying, recording or otherwise,
without either the prior written permission of the publisher or a
licence permitting restricted copying in the United Kingdom issued by
the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London
W1T 4LP.
All trademarks used herein are the property of their respective
owners. The use of any trademark in this text does not vest in the
author or publisher any trademark ownership rights in such trademarks,
nor does the use of such trademarks imply any affiliation with or
endorsement of this book by such owners.
ISBN 0 201 39839 7
British Library Cataloguing-in-Publication Data
A catalog record for this book is available from the British Library
Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress
10 9 8 7 6 5 4 3 2 1
07 06 05 04 03
Typeset by 68
Printed and bound in Great Britain by Biddles Ltd, Guildford and
King's Lynn
Computing Patterns in Strings
PEARSON Education We work with leading authors to develop the
strongest educational materials in computer science, bringing
cutting-edge thinking and best learning practice to a global market.
Under a range of well-known imprints, including Addison-Wesley, we
craft high-quality print and electronic publications which help readers
to understand and apply their content, whether studying or at work.
To find out more about the complete range of our publishing, please
visit us on the World Wide Web at: www.pearsoneduc.com
Contents
Preface
Part I Strings and Algorithms
Chapter 1 Properties of Strings
  1.1 Strings of Pearls
  1.2 Linear Strings
  1.3 Periodicity
  1.4 Necklaces
Chapter 2 Patterns? What Patterns?
  2.1 Intrinsic Patterns (Part II)
  2.2 Specific Patterns (Part III)
  2.3 Generic Patterns (Part IV)
Chapter 3 Strings Famous and Infamous
  3.1 Avoidance Problems and Morphisms
  3.2 Thue Strings (2,3)
  3.3 Thue Strings (3,2)
  3.4 Fibostrings (2,4)
Chapter 4 Good Algorithms and Good Test Data
  4.1 Good Algorithms
  4.2 Distinct Patterns
  4.3 Distinct Borders
Part II Computing Intrinsic Patterns
Chapter 5 Trees Derived from Strings
  5.1 Border Trees
  5.2 Suffix Trees
    5.2.1 Preliminaries
    5.2.2 McCreight's Algorithm
    5.2.3 Ukkonen's Algorithm
    5.2.4 Farach's Algorithm
    5.2.5 Application and Implementation
  5.3 Alternative Suffix-Based Structures
    5.3.1 Directed Acyclic Word Graphs
    5.3.2 Suffix Arrays: Saving the Best till Last?
Chapter 6 Decomposing a String
  6.1 Lyndon Decomposition: Duval's Algorithm
  6.2 Lyndon Applications
  6.3 s-Factorization: Lempel-Ziv
Part III Computing Specific Patterns
Chapter 7 Basic Algorithms
  7.1 Knuth-Morris-Pratt
  7.2 Boyer-Moore
  7.3 Karp-Rabin
  7.4 Dömölki-(Baeza-Yates)-Gonnet
  7.5 Summary
Chapter 8 Son of BM Rides Again!
  8.1 The BM Skip Loop
  8.2 BM-Horspool
  8.3 Frequency Considerations and BM-Sunday
  8.4 BM-Galil
  8.5 Turbo-BM
  8.6 Daughter of KMP Rides Too!
  8.7 Mix Your Own Algorithm
  8.8 The Exact Complexity of Exact Pattern-Matching
Chapter 9 String Distance Algorithms
  9.1 The Basic Recurrence
  9.2 Wagner-Fischer et al.
  9.3 Hirschberg
  9.4 Hunt-Szymanski
  9.5 Ukkonen-Myers
  9.6 Summary
Chapter 10 Approximate Pattern-Matching
  10.1 A General Distance-Based Algorithm
  10.2 An Algorithm for k-Mismatches
  10.3 Algorithms for k-Differences
    10.3.1 Ukkonen's Algorithm
    10.3.2 Myers' Algorithm
  10.4 A Fast and Flexible Algorithm: Wu and Manber
  10.5 The Complexity of Approximate Pattern-Matching
Chapter 11 Regular Expressions and Multiple Patterns
  11.1 Regular Expression Algorithms
    11.1.1 Non-Deterministic FA
    11.1.2 Deterministic FA
    11.1.3 Algorithm WM Revisited
  11.2 Multiple Pattern Algorithms
    11.2.1 Aho-Corasick FA: KMP Revisited
    11.2.2 Commentz-Walter FA: BM Revisited
    11.2.3 Approximate Patterns: WM Revisited Again!
    11.2.4 Approximate Patterns: (Baeza-Yates)-Navarro
Part IV Computing Generic Patterns
Chapter 12 Repetitions (Periodicity)
  12.1 All Repetitions
    12.1.1 Crochemore
    12.1.2 Main and Lorentz
  12.2 Runs
    12.2.1 Leftmost Runs: Main
    12.2.2 All Runs: Kolpakov and Kucherov
Chapter 13 Extensions of Periodicity
  13.1 All Covers of a String: Algorithm LS
  13.2 All Repeats: Algorithm FST
    13.2.1 Computing the NE Tree
    13.2.2 Computing the NE Array
  13.3 k-Approximate Repeats: Schmidt
  13.4 k-Approximate Periods: SIPS
Bibliography
Index
Preface
In the beginning was the Word, and the Word was with God, and the
Word was God. - John 1:1
The computation of patterns in strings is
a fundamental requirement in many areas of science and information
processing. The operation of a text editor, the lexical analysis of a
computer program, the functioning of a finite automaton, the
retrieval of information from a database — these are all activities
which may require that patterns be located and/or computed. In
other areas of science, the algorithms that compute patterns have
applications in such diverse fields as data compression, cryptology,
speech recognition, computer vision, computational geometry, and
molecular biology. And computing patterns in strings is not a topic
whose importance lies only in its current practical applications: it is a
branch of combinatorics that includes many simply-stated problems
which often turn out to have solutions — and often more solutions
than one — of great subtlety and elegance. It is surprising therefore
that academic Departments of Mathematics or Computer Science do
not generally include in their undergraduate or graduate curricula
courses which provide an introduction to this interesting, important,
and heavily researched topic. It is perhaps even more surprising that
so few texts have been written with the purpose of putting together
in a uniform way some of the basic results and algorithms that have
appeared over the past quarter-century. I know of five books [St94,
CR94, G97, SM97, CHL01] and three fairly long survey articles
[BY89a, A90, Nav01] whose subject matter overlaps significantly with
that of this volume. Of these, the survey articles, [St94], and [CR94]
are written more as summaries of research than as texts for
students; while [G97] and [SM97] focus heavily on the (important)
applications in molecular biology. The final monograph [CHL01] does
indeed function as both a monograph and a textbook on string
algorithms, and is moreover both clearly and elegantly written;
unfortunately, it is currently available only in French. The purpose of
this book then is to begin to fill a gap: to provide a general
introduction to algorithms for computing patterns in strings that is
useful to experienced researchers
in this and other fields, but that also is accessible to senior
undergraduate and graduate students. Let us linger a moment over
three of the words used in the preceding sentence: "accessible",
"algorithms", and "patterns". An overriding objective in this book is
to make the material accessible to students who have completed or
nearly completed a mathematics or computer science undergraduate
curriculum that has included some emphasis on discrete structures
and the algorithms that operate upon them. A first consequence of
this objective is that the mathematical background required to read
this book is general rather than specific to strings. It would certainly
be provided by the standard IEEE/ACM courses in Discrete
Mathematics, Data Structures, and Analysis of Algorithms. The
reader will know what stacks, queues, linked lists and arrays are, for
example, and will have some familiarity with the analysis of
algorithms and the "asymptotic complexity" notation used for this
purpose, some experience with mathematical assertions and the
methods used to establish their correctness, and some knowledge of
important algorithms on graphs and trees. In addition, the
assumption is made that the reader is familiar with some computer
programming language, and has the ability to read and understand
algorithms expressed in such a language. A second consequence is
that no claim is made to completeness: my objective is to lure the
student and the reader into a fascinating field, not to write an
encyclopaedia of algorithms that compute patterns in strings. In
particular, I have been selective in two main ways: I focus on results
that are (I believe) important and that moreover can be explained
with reasonable economy and simplicity. Inevitably, this means that
the exposition of some interesting and valuable material is omitted.
However, I hope that, both by providing references to much of this
material and by stimulating interest, I will encourage readers to
investigate the literature for themselves. The underlying subject
matter of this book is a mathematical object called by most
computer scientists a "string" (or, in Europe and by most
mathematicians, a "word"). But the focus of this book is on
algorithms — that is, on precise methods or procedures for doing
something — and it is thus more properly thought of as a text in
computer science rather than in mathematics. This book will
therefore take quite a different approach from that of the classic
monograph [L83], and its descendants [L97, L02], that elucidates
mathematical properties above all. We shall rather be interested
primarily in algorithms that find various kinds of patterns in strings;
for the most part, only as a byproduct of that focus will interest be
displayed in the mathematical properties of the strings themselves.
This does not mean that results will not be proved rigorously, only
that the selection of those results will generally depend on their
relevance to the behaviour of some algorithm. A final remark here:
in the exposition I confine myself to sequential algorithms on strings
in one dimension, making no reference to the extensive literature on
the corresponding parallel (especially PRAM) algorithms or to the
growing literature on multi-dimensional (especially two-dimensional)
strings. Another focus of this book will be on patterns. That is, the
algorithms we discuss will virtually all be devoted to finding some
sort of a pattern in a string. I say "some sort" of pattern, because
three main kinds will be distinguished — specific, generic, and
intrinsic — that provide this book with three of its four main
divisions. A specific pattern is one that can be specified by listing
characters in their required order; for example, if we were searching
for the pattern u = abaab in the string x = abaababaabaab we
would find it (three times), but we would not find the pattern u =
ababab.
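
The search just described can be checked mechanically. The following is a minimal sketch, assuming the most naive method possible (compare u against every window of x); the function name `occurrences` is hypothetical, and this is not one of the pattern-matching algorithms presented in Part III. Positions are reported starting from 1.

```python
def occurrences(u: str, x: str) -> list[int]:
    """Return the 1-based starting positions of every occurrence of u in x.

    Naive window-by-window comparison, intended only as an illustration.
    """
    m, n = len(u), len(x)
    return [i + 1 for i in range(n - m + 1) if x[i:i + m] == u]


x = "abaababaabaab"
print(occurrences("abaab", x))   # [1, 6, 9]
print(occurrences("ababab", x))  # []
```

The first call reports the three occurrences of abaab mentioned above; the second confirms that ababab does not occur in x.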
(Sometimes the pattern that we are looking for can
contain "don't-care" symbols, and sometimes the match that we
seek need only be "approximate" in some well-defined sense, but
these are refinements to be dealt with later.) By contrast, a generic
pattern is one that is described only by structural information of
some kind, not by a specific statement of the characters in it. For
example, we might ask for all the "repetitions" in x—that is, for all
cases in which two or more adjacent substrings of x are identical.
(The response in this case would be a list of repetitions including
(aba)(aba), (abaab)(abaab), aa (three separate times), and several
others — see if you can find them all.) I call the final kind of pattern
that we search for an intrinsic pattern — one that requires no
characterization, one that is inherent in the string itself. Here I
discuss various patterns that in one way or another expose the
periodic structure of a given string x; for example, normal form,
suffix tree, Lyndon decomposition, s-factorization. These intrinsic
patterns are used so frequently in algorithms that compute specific
or generic patterns as to be almost ubiquitous. Collectively, they
form the basis for the efficient processing of strings. The variety of
these intrinsic patterns is remarkable: the normal form of our
example string x is (abaababa)(abaab) while its Lyndon
decomposition is (ab)(aabab)(aab)(aab) and its s-factorization is
(a)(b)(a)(aba)(baaba)(ab); all of these patterns are computationally
useful. The organization of this book is as follows. Part I gives basic
information about both strings and algorithms on strings. It provides
the terminology, notation and essential properties of strings
that will be used throughout; in addition, it describes the main kinds
of algorithms to be presented and illustrates some of them on
certain famous strings; these strings are also "infamous" as
examples of worst case behaviour for many string algorithms. It is
particularly Chapter 2 that provides a kind of key to the rest of the
book: it explains precisely which problems are to be solved and
directs the reader to the later sections that present the algorithms
that solve them. Thus the book may be used fairly easily by the
reader who has selective interests. Part I also discusses qualities of
"good" algorithms on strings, and raises the interesting question of
how implementation of these algorithms should in practice be
validated. Parts II-IV deal with algorithms for computing intrinsic,
specific and generic patterns, respectively, as described above.
Altogether there are 13 chapters distributed over the four parts. As
indicated in the table of contents, these are broken down into
sections, each of which ends with a collection of exercises. Where
appropriate, chapters include a summary of the topics/algorithms
covered and a discussion of related results, additional topics and
open problems. A note about the exercises, of which there are some
500 or so: they are an integral part of the book, used for four main
purposes:
* to make sure the reader has understood;
* to clarify, or to put into a different context, principles or ideas
  that have arisen in the text (in a phrase, to make connections);
* to handle extensions or modifications of algorithms or mathematical
  results that the reader should be aware of, but should not need to
  have explained to him in detail;
* to deal with details (of algorithms, of proofs) that would otherwise
  unnecessarily clutter the presentation.
Wherever possible, by explaining in the text only what
really needs to be explained, I have tried through the exercises to
involve the reader in the development or analysis of the algorithms
presented. What I myself have discovered by taking this approach is
that for the most part the algorithms, and the improvements to
algorithms, depend on a very simple new idea, an insight that is not
complicated but that has somehow previously escaped other
researchers. Once that idea is captured, what remains consists
mainly of technicalities, tedious and convoluted perhaps,
but still a direct consequence of the main idea. This observation
seems to be true of string algorithms: I wonder what fields of study
it is not true of? A note also about the dreaded index: if on page p
of the text I have cited a work authored or co-authored by person P,
then the index entry for P should include p. I am very sensible of the
fact that sometimes I can be hasty (as Treebeard [T55] would say),
consequently error-prone. I have therefore spent a great deal of
time reworking this book in order to correct errors, rectify
oversights, or smooth over inelegancies; nevertheless I cannot
imagine that there will not be many defects to be found. I will be
maintaining a website http://www.cs.curtin.edu.au/
smyth/patterns.shtml to record corrections and suggestions for
improvement, and I would be grateful if readers would contact me at
smyth@computing.edu.au or smyth@mcmaster.ca with their
comments. The material in this book is at least sufficient to cover
two one-semester (12-14 week) courses for graduate or advanced
undergraduate students. Indeed, I hope that it is more than
sufficient: I hope that it is also suitable. The initial chapters have
already been presented several times to graduate computer science
students in the Departments of Computer Science & Systems and of
Computing & Software at McMaster University, Hamilton, Ontario,
Canada; also to graduate students in the Department of Computer
Science, University of Debrecen, Hungary. These students have
contributed materially to the book's development. I wish particularly
to express my deep gratitude to the School of Computing, Curtin
University, Perth, Western Australia, and to its past and present
Heads of School, Dennis Moore, Terry Caelli, Svetha Venkatesh and
Geoff West, for generous support and encouragement, both
intellectual and practical, over a period of several years. Most of this
book has been written during my sheltered visits to Curtin. I am
grateful also to Professor Petho Attila, Chair of the Department of
Computer Science at Debrecen, for his interest and support: it
was in Debrecen late in 2001 that the last bits of LaTeX were finally
keyed in. It is a pleasure to express my debt to my friends and
colleagues, Leila Baghdadi, Jerry Chappie, Franya Franek, Costas
Iliopoulos, Thierry Lecroq, Yin Li, Dennis Moore, Pat Ryan, Jamie
Simpson and Xiangdong Xiao for their valuable contributions. And
many thanks also to two anonymous referees whose constructive
comments have contributed materially to the final form of the book.
Finally, kudos to Jocelyn Smyth for her entertaining selection and
careful verification of "string" and "word" quotations! W. F. S.
To my parents.
Part I Strings and Algorithms
A word in time saves nine. - Anonymous
Chapter 1 Properties of Strings
1.1 Strings of Pearls
Words form the thread on which we string our experiences.
- Aldous HUXLEY (1894-1963), Brave New World
Consider a string of pearls. Imagine that the string is laid out
on the table before you, so that one end is on the left and the other
on the right. Suppose that there are n pearls in the string, and that
each pearl has a tiny label pasted on it. Suppose further that the
labels are integers in the range 1..n and that they satisfy the
following rules:
* the label on the leftmost pearl is 1;
* for every integer i = 1, 2, ..., n - 1, the pearl to the right of
  the pearl labelled i has label i + 1.
These rules seem to satisfy our intuitive idea of what
makes a string of pearls a "string": the pearls all lie on a single well-
defined path, and the path can be traversed from one end to the
other by moving from the current pearl to an adjacent one.
Reflecting on the rules, however, we realize that we need not be so
specific. First of all, of course, we do not really need to speak of
"pearls": we can speak more generally of (undefined) elements. But
a second, more fundamental, observation is that the labels do not
need to be chosen in any specific order, and they do not need to be
integers: they could be colours, for example, or letters of the
alphabet. What really matters is that
(0) every element has a label that is unique;
(1) every element with some label x (except at most one, called the
    leftmost) has a unique determinable predecessor labelled p(x);
(2) every element with some label x (except at most one, called the
    rightmost) has a unique determinable successor labelled s(x);
(3) whenever an element with label x is not leftmost, x = s(p(x));
(4) whenever an element with label x is not rightmost, x = p(s(x));
(5) for any two distinct elements with labels x and y, there exists a
    positive integer k such that either x = s^k(y) or x = p^k(y).
These
rules capture the essential idea of concatenation: each element has
either a unique predecessor or a unique successor, and, except at
the extremes, actually has both. Furthermore, by following
a finite sequence of either successors or predecessors, we can reach
any element with label y from any other element with label x.
Fortified with these observations, then, we boldly state: Definition
1.1.1 A string is a collection of elements that satisfies rules 0-5. A
critical feature of this definition, not as yet discussed, is the
condition, included in rules 1 and 2, that there be at most one
leftmost or rightmost element. For consider what happens when the
clasp is fastened on the original string of pearls, forming a necklace.
Now there is no longer either a "leftmost" or a "rightmost" element
— but that turns out not to be a problem, since rules 0-5 continue to
apply. According to our definition, a necklace is also a string!
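
For a finite collection of elements, rules 0-5 can be tested directly. The sketch below is an illustration only, under stated assumptions: the names `is_string`, `succ` and `pred` are hypothetical, the successor and predecessor functions s and p are represented as dictionaries on labels, and a missing key means that the element has no such neighbour. Infinite strings are outside its scope.

```python
def is_string(labels, succ, pred):
    """Check rules 0-5 for a finite collection of labelled elements.

    labels: list of labels; succ/pred: dicts mapping a label to the label
    of its successor/predecessor (a missing key means "no such neighbour").
    """
    n = len(labels)
    if len(set(labels)) != n:                       # rule 0: labels are unique
        return False
    if sum(x not in pred for x in labels) > 1:      # rule 1: at most one leftmost
        return False
    if sum(x not in succ for x in labels) > 1:      # rule 2: at most one rightmost
        return False
    for x in labels:
        if x in pred and succ.get(pred[x]) != x:    # rule 3: x = s(p(x))
            return False
        if x in succ and pred.get(succ[x]) != x:    # rule 4: x = p(s(x))
            return False

    def reachable(step, start):
        """Labels reached from start by applying step between 1 and n times."""
        seen, cur = set(), start
        for _ in range(n):
            if cur not in step:
                break
            cur = step[cur]
            seen.add(cur)
        return seen

    for y in labels:                                # rule 5: x = s^k(y) or x = p^k(y)
        near = reachable(succ, y) | reachable(pred, y)
        if any(x != y and x not in near for x in labels):
            return False
    return True


pearls = [1, 2, 3, 4]
# Four pearls laid out on the table: 1 is leftmost, 4 is rightmost.
print(is_string(pearls, {1: 2, 2: 3, 3: 4}, {2: 1, 3: 2, 4: 3}))              # True
# The same pearls with the clasp fastened: a necklace.
print(is_string(pearls, {1: 2, 2: 3, 3: 4, 4: 1}, {2: 1, 3: 2, 4: 3, 1: 4}))  # True
# Two separate two-pearl necklaces: no path joins pearl 1 to pearl 3.
print(is_string(pearls, {1: 2, 2: 1, 3: 4, 4: 3}, {1: 2, 2: 1, 3: 4, 4: 3}))  # False
```

The first two calls agree with the observation that both the open string of pearls and the necklace satisfy the definition; the third collection fails because no sequence of successors or predecessors connects its two parts.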
Furthermore, suppose that the number of pearls in the original string
were infinite: beginning at the lefthand edge of the table but
stretching away without end toward a forever unseen edge at the
right. We see that this infinite string also is covered by the definition:
it has a leftmost element, but no rightmost one. And, perhaps most
surprising of all, we see that even a string which extends to infinity
in both directions is covered by rules 0-5: in this case there is again
neither a leftmost nor a rightmost element. In this book we will at
various times become interested in all of these different kinds of
strings. To prevent confusion, therefore, we adopt the following
conventions. A string with a finite number of elements including both
a leftmost and a rightmost element will be called a linear string. A
string with a finite but nonzero number of elements and neither a
leftmost nor a rightmost element will be called a necklace (in the
literature also called a circular string). A string with an infinite
number of elements, of which one is leftmost, will be called an
infinite string; while a string with an infinite number of elements, of
which none is either leftmost or rightmost, will be called an infinite
necklace. When the meaning is clear from the context, we will just
use the word "string" to refer to any object satisfying Definition
1.1.1. In practice it will be easy to distinguish between linear strings
and necklaces, because necklaces will normally be defined in terms
of a corresponding linear string x and written
C(x): we think of C(x) as being the necklace
formed from x by making its leftmost element the successor of its
rightmost element.
Exercises 1.1
1. Explain why concatenation rule 3 is required. Give an example of a
   mathematical object which satisfies rules 0-2 but not rule 3.
2. Can rule 4 for concatenation be derived from rules 0-3? Explain
   your answer.
3. Explain why concatenation rule 5 is required. Characterize the
   mathematical objects that satisfy rules 0-4 but not rule 5.
4. Does the infinite set {a, a^2, ...} of strings contain an infinite
   string?
5. According to Definition 1.1.1, can a string consist of a single
   element a? Could such a string be a necklace?
6. Is Definition 1.1.1 satisfied by a string with no elements in it (a
   so-called empty string)? Does the above definition of a linear
   string include the empty string? What about the definition of a
   necklace?
7. In view of the preceding exercise, how many different kinds of
   string are included in the classification of strings given in this
   section?
8. Our classification of strings omits the following cases:
   (a) a string with an infinity of elements including a rightmost one
       but no leftmost one;
   (b) a string with a finite number of elements including either a
       leftmost one or a rightmost one, but not both.
   Comment on these omissions.
9. Is there any way to prove that Definition 1.1.1 defines a string?
1.2 Linear Strings
He who has been bitten by a snake fears a piece of string.
- Persian proverb
From the discussion of the preceding section, it becomes clear that
the idea of a string, though a simple one, is also very general. A
string might be
* a word in the English language, whose elements are the upper and
  lower case English letters together with apostrophe (') and
  hyphen (-);
* a text file, whose elements are the ASCII characters;
* a book written in Chinese, whose elements are Chinese ideographs;
* a computer program, whose elements are certain "separators" (space,
  semicolon, colon, and so on) together with the "words" between
  separators;
* a DNA sequence, perhaps three billion elements long, containing only
  the letters C, G, A and T, standing for the nucleotides cytosine,
  guanine, adenine and thymine, respectively;
* a stream of bits beamed from a space vehicle;
* a list of the lengths of the sides of a convex polygon, whose values
  are drawn from the real numbers.
All of these
examples are instances of what we have called in Section 1.1 a
"linear string". Indeed, most of this book will deal with linear strings,
and so in this section we introduce notation and terminology useful
for talking about them. Much, but not all, of this terminology will
also apply to necklaces, infinite strings, and the empty string. The
examples make clear that an important feature of any string is the
nature of its elements: bits, members of {C, G, A, T}, real numbers,
as the case may be. It is in fact customary to describe a string by
identifying a set of which every element in the string is a member.
This set is called an alphabet, and so naturally its members are
referred to as letters — though, as we have seen, the term "letter"
must be interpreted much more broadly than is usual in English. We
say then that a string is defined on its alphabet. Of course, if a string
x is defined on an alphabet A, then x is also defined on any superset
of A, so an alphabet for x is not unique. A minimum alphabet
for x is just the set of all the distinct elements that actually occur in
x. Sometimes it is convenient to define the alphabet of a string as
the minimum one ("bits"), sometimes as a set that is far from
minimum ("real numbers"). Throughout this book A will denote an
alphabet and α = |A| its cardinality. In the common cases that α is
2, 3 or 4, we say that the alphabet A is binary, ternary or quaternary,
respectively; as we shall see, there are many interesting strings on a
binary alphabet, and a quaternary alphabet is of particular
importance because of applications to the analysis of DNA
sequences. In general, apart from the distinctness of the elements of
A that follows from the set property, we place no other restriction
upon them: the elements of the alphabet may be finite (even zero!)
in number, countably infinite (for example, the integers), or
uncountably infinite (for example, the reals). And the elements of
the alphabet may be totally ordered (so that a comparison of any
distinct pair of them yields the result "less" or "greater"), unordered,
or somewhere in between ("partially ordered"). For many of the
algorithms discussed in this book, it will be sufficient to use an
unordered alphabet; as discussed in Chapter 4, the nature of the
alphabet on which an algorithm operates is very important to the
selection of test data for the algorithm as well as to its
computational efficiency. For a given alphabet A, let A+ be the set of
all possible nonempty finite concatenations of the letters of A. Thus,
for example, if A = {a}, then A+ = {a, a^2, ...}, where we write a^2 for
aa, a^3 for aaa, etc.; and if A = {0,1}, then A+ consists of all distinct
nonempty finite sequences of bits, and so may be thought of as
including all the nonnegative integers. As suggested in Exercise
1.1.6, it is convenient also to introduce the idea of the empty string,
usually written ε, which we use to define the sets A' = A ∪ {ε} and
A* = A+ ∪ {ε}.
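
Since A+ and A* are infinite whenever A is nonempty, a program can only list a finite slice of them. The following is a minimal sketch, assuming a finite alphabet of single-character letters; the function name `strings_up_to` and its parameters are hypothetical. It lists every string of length at most a chosen bound, with or without the empty string.

```python
from itertools import product


def strings_up_to(alphabet, max_len, include_empty=False):
    """List the strings over `alphabet` of length at most max_len, shortest first.

    With include_empty=False the result is a finite slice of A+;
    with include_empty=True it is the matching slice of A*.
    """
    start = 0 if include_empty else 1
    letters = sorted(alphabet)  # fix an order only to make the listing reproducible
    return ["".join(t)
            for k in range(start, max_len + 1)
            for t in product(letters, repeat=k)]


print(strings_up_to({"a"}, 3))             # ['a', 'aa', 'aaa']
print(strings_up_to({"0", "1"}, 2))        # ['0', '1', '00', '01', '10', '11']
print(strings_up_to({"0", "1"}, 1, True))  # ['', '0', '1']
```

The only difference between the two slices is the single empty string, which is exactly the difference between A+ and A*.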
This terminology allows us to express another
definition of linear strings equivalent to that given in the previous
section:

Definition 1.2.1 An element of A+ is called a linear string on
alphabet A. An element of A* is called a finite string on A.

Thus A+ is the set of all linear strings on a given alphabet A. Note
that the empty string is not a linear string: after all, it has neither
a leftmost nor a rightmost element! Throughout this book strings will
consistently be denoted by boldface letters, almost always lower
case: p, t, x, and so on.

We will implement the rules 0-5 of Section
1.1 by treating strings as one-dimensional arrays; alternatively, we
might have used a linked-list or some other representation, but
arrays are a simple and natural model, as we have seen with the
string of pearls. Thus for any string, say x, containing n > 0 letters
drawn from an alphabet A, the implicit declaration will be x : array
[1..n] of A. In this case we will say that the string has length n = |x|
and positions 1, 2, ..., n. For any integer i ∈ 1..n, the letter in
position i is x[i], so that we may write x = x[1]x[2]···x[n], which we
recognize from Definition 1.1.1 as a concatenation of n strings of
length 1. In fact, in this formulation the position i plays the role of
the label introduced in Definition 1.1.1. Note that the array model
works also for the empty string x = ε, which corresponds to an
empty array and has length 0.

Digression 1.2.1 We have said that
arrays are a "simple and natural" representation of strings, a
statement that obscures a significant computational issue. Certainly
an array is a simple data structure, but whether or not it is natural
depends on assumptions about the mechanisms by which the
elements of the array are accessed. As we have seen, strings are
defined in terms of concatenation, and so it would seem to be
"natural" to access elements using the successor (next) and
predecessor (previous) operations introduced in Section 1.1.
These are mechanisms compatible with a linked-list representation,
where access to an element at position i from the "current" position j
would at least require time proportional to |j - i|. However, an
arbitrary element in a computer array can normally be accessed in
constant time simply by specifying its location i, quite independent of
the array position j most recently visited. An array representation of
strings is thus more powerful than a linked-list representation, and so
arguably not suitable, since it could justify execution time estimates
for algorithms lower than those attainable using list processing. In
practice, algorithms on strings almost always begin either at the left
(position 1) or at the right (position n), and inspect adjacent
(successor or predecessor) positions one by one. On the other hand,
the output of string algorithms often specifies string positions, the
implicit assumption being that the user can access these positions in
constant time. One way
to reconcile these different models
of string access is to suppose that strings are initially available as
linked-lists (so that their elements are accessible only one-by-one,
either left-to-right or right-to-left) but copied as they are input into
an array. This copying, if it were necessary, would require only Θ(n)
time and Θ(n) additional space, and so would not affect the
asymptotic complexity of any of the algorithms considered in this
book. We therefore adopt the rather odd convention that strings are
processed as linked-lists on input, but may in some cases be
regarded as arrays on output. We promise to alert the reader if ever
we deviate from this convention.

Equality between strings is
defined in an obvious way. A string x of length n and a string y of
length m are said to be equal (written x = y) if and only if n = m
and x[i] = y[i] for every i = 1, ..., n. Thus the empty string ε is equal
only to itself. Note further that, by this definition, prepending or
appending ε to a given string x does not change its value; thus we
may if we please write x = εxε.

Corresponding to any pair of
integers i and j that satisfy 1 ≤ i ≤ j ≤ n, we may define a
substring x[i..j] of x as follows:

    x[i..j] = x[i]x[i+1]···x[j].

We say that x[i..j] occurs at
position i of x and that it has length j - i + 1. If j - i + 1 < n, then
x[i..j] is called a proper substring of x. Two noteworthy kinds of
substring are x = x[1..n] of length n, and x[i] = x[i..i] of length 1.
For every pair of integers i and j such that i > j, we adopt the
convention that x[i..j] = ε, a substring of length zero. As we have
already seen, since x = εxε, ε may be regarded as a substring of any
element of A*.

Let k ∈ 1..n be an integer, and consider positions i_k = 1, 2, ...,
n - k + 1 of x. Each of these n - k + 1 positions represents the
starting position of a substring x[i_k..i_k + k - 1] of length k. Thus
every string of length n has n - k + 1 (not necessarily distinct)
substrings of length k. If u is a substring (respectively, proper
substring) of x, then x is said to be a superstring (respectively,
proper superstring) of u. Of course substrings and superstrings are
also strings.

If A is ordered, we may use the substring notation to
define a corresponding induced order on the elements of A* called
lexicographic order, that is, dictionary order. More precisely,
suppose we are given two strings x = x[1..n] and y = y[1..m], where
n ≥ 0, m ≥ 0. We say that x < y (x is lexicographically less than y) if
and only if one of the following (mutually exclusive) conditions
holds:

• n < m and x[1..n] = y[1..n] (as we shall see shortly, this is
  the case in which x is a "proper prefix" of y);
• x[1..i - 1] = y[1..i - 1] and x[i] < y[i] for some integer
  i ∈ 1..min{n, m} (this is the case in which there is a first
  position i in which x and y differ).

Then, for example, using the order of the English alphabet:

• ab < abc (because i = 2 = n < 3 = m);
• ε < a (because i = 0 = n < 1 = m);
• ab < aca (because i = 2 and b < c).
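The two conditions in the definition of x < y can be checked with one left-to-right scan. The sketch below is a minimal illustration, assuming finite strings stored as Python `str` values (0-based, unlike the 1-based positions of the text) and letters ordered as Python orders characters; the name `lex_less` is introduced here for illustration only.

```python
def lex_less(x, y):
    """Return True if x < y under the two-case definition of lexicographic order."""
    n, m = len(x), len(y)
    # Second condition: there is a first position at which x and y differ,
    # and the letter of x at that position is the smaller one.
    for i in range(min(n, m)):
        if x[i] != y[i]:
            return x[i] < y[i]
    # First condition: x and y agree on their common length, so x < y
    # exactly when x is a proper prefix of y.
    return n < m

if __name__ == "__main__":
    for x, y in [("ab", "abc"), ("", "a"), ("ab", "aca")]:
        print(repr(x), "<", repr(y), ":", lex_less(x, y))
```

The three pairs in the usage lines are the examples given in the text, with the empty Python string standing for ε.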
Observe that this definition is valid also in
cases where one or both of n, m is infinite; that is, also for infinite
strings. Based on this definition, the other order relations (≤, >, ≥)
are defined in the usual way: x ≤ y if and only if x = y or x < y, x >
y if and only if y < x, x ≥ y if and only if y ≤ x.

Writing x = u_1 u_2 ··· u_k, where the u_i are nonempty substrings,
i ∈ 1..k, is called a factorization or decomposition of x into factors
u_i (see Section 1.4 and Chapter 6). Thus a factor is just a nonempty
substring.

There
are two special kinds of substring x[i..j] which are of particular
importance, and to which we give special names. For any integer
j ∈ 0..n, we say that x[1..j] is a prefix of x, sometimes written pref(x);
if in fact j < n, we say that x[1..j] is a proper prefix of x. Similarly, for
any integer i ∈ 1..n + 1, we say that x[i..n] is a suffix of x, written
suff(x), and a proper suffix if i > 1. Note that, in accordance with
the identity x = εxε, these definitions allow us to include the empty
string ε as both a proper prefix and a proper suffix. Thus, for
example, the string f = abaababaabaab has prefixes ε, a, ab, aba, ..., f =
abaababaabaab and suffixes ε, b, ab, aab, ..., f. The proper prefixes
and suffixes of f are obtained simply by omitting f itself from these
lists.

A concept whose value is not immediately apparent, but which
we will find to be useful in many different contexts, is that of a
"border".

Definition 1.2.2 A border b of x is any proper prefix of x that
equals a suffix of x.

We see that, according to this definition, x
always has an empty border b = ε of length β = 0, but that x itself is
not a border of x. In general, we use the symbol β to denote the
length |b| of b. Often we will be particularly interested in the longest
border, denoted by b*, with length β* = |b*|, where 0 ≤ β* ≤ n - 1.
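Definition 1.2.2 can be tested directly by comparing each proper prefix with the suffix of the same length. The Python sketch below does exactly that; it is a naive quadratic-time illustration, not the Θ(n) method that the text defers to Section 1.3, and the function names are assumptions made here.

```python
def borders(x):
    """Return every border of x, shortest first (naive check of each length)."""
    n = len(x)
    # A border is a proper prefix that equals a suffix, so its length k
    # ranges over 0..n-1; k = 0 gives the empty border.
    return [x[:k] for k in range(n) if x[:k] == x[n - k:]]

def longest_border_length(x):
    """Return the length of the longest border of x (0 if x is empty)."""
    found = borders(x)
    return len(found[-1]) if found else 0

if __name__ == "__main__":
    f = "abaababaabaab"
    print(borders(f))                 # empty border, then the nonempty ones
    print(longest_border_length(f))
```

For the string f = abaababaabaab used as an example in this section, the list contains the empty border followed by ab and abaab.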
The string f introduced above has two nonempty borders: ab and
abaab. The string g = abaabaab has exactly the same two borders,
but observe that in this case the longest one, abaab, overlaps with
itself. Similarly, the string a^n has borders a, a^2, ..., a^(n-1), of
which, for i = ⌈(n + 1)/2⌉, the borders a^i, a^(i+1), ..., a^(n-1)
overlap. As we shall discover presently, overlapping borders are in
fact characteristic of strings that contain repetitive substrings: in
the above example, observe that g can be written in the form
(aba)^2 ab = (aba)(aba)ab,
thus representing it as a string in
which two occurrences of aba are followed by a prefix of aba.

We
now apply the idea of a border to generalize this observation and to
derive what we call a "normal form" for a given nonempty string x of
length n. Suppose that a border b and its length β have been
computed. (We shall see in Section 1.3 how to compute every
border of x in Θ(n) time.) By Definition 1.2.2 it must be true that

    x[1..β] = x[n - β + 1..n],                                (1.1)

from which we see that the quantity p = n - β ≥ 1 measures the
displacement between positions of x that are required to be equal.
(Observe that the larger the value of β, the smaller the value of p.)
Thus, for every integer i ∈ 1..β, it must be true that

    x[i] = x[i + p].                                          (1.2)

In particular, if β = 0 (p = n), we see
that (1.1) and (1.2) are trivially true; while if 2β ≥ n (2p ≤ n), then x
must contain at least two equal adjacent substrings x[1..p] and
x[p + 1..2p]. More precisely, we see that x consists of ⌊n/p⌋ identical
substrings, each of length p, followed by a possibly empty suffix of
length n mod p. Setting r = n/p and letting u = x[1..p], we see that
the values p and r which we have derived from β permit us to
express any string x in the form

    x = u^⌊r⌋ u',                                             (1.3)

where u' = x[1..n - ⌊r⌋p] is a proper prefix (possibly empty) of u.
Alternatively, we can
separate r into its integral and fractional parts by writing
r = ⌊r⌋ + k/p for some integer k ∈ 0..p - 1. Then, interpreting
u^(k/p) = u[1..p]^(k/p) to mean simply u[1..k], we find that (1.3) can
be rewritten in the compact form

    x = u^r.                                                  (1.4)

We call p a period and r an exponent of x.
The prefix u = x[1..p] we call a generator of x. Note that since
every string x has an empty border b = ε, it therefore has a trivial
period p = n, a trivial exponent r = 1, and a trivial generator x.
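The derivation above can be traced in code for any one border. The following Python sketch takes a border length β as input and returns the period, exponent, generator and trailing prefix u' that it induces, checking identity (1.2) and form (1.3) along the way. The function name, the use of `Fraction` for the exponent, and the 0-based indexing are choices made for this illustration.

```python
from fractions import Fraction

def form_from_border(x, beta):
    """Derive (p, r, u, u') from a border of length beta, following (1.1)-(1.3)."""
    n = len(x)
    # beta must really be a border length: a proper prefix equal to a suffix.
    if not (0 <= beta < n and x[:beta] == x[n - beta:]):
        raise ValueError("beta is not the length of a border of x")
    p = n - beta                      # displacement between equal positions
    # Identity (1.2), written with 0-based indices.
    assert all(x[i] == x[i + p] for i in range(beta))
    r = Fraction(n, p)                # exponent n/p, kept exact
    u = x[:p]                         # generator x[1..p]
    whole = n // p                    # floor(r): number of full copies of u
    u_tail = x[:n - whole * p]        # the proper prefix u' of u
    assert u * whole + u_tail == x    # form (1.3)
    return p, r, u, u_tail

if __name__ == "__main__":
    g = "abaabaab"
    for beta in (0, 2, 5):            # the three border lengths of g
        print(beta, form_from_border(g, beta))
```

For g = abaabaab, the empty border gives the trivial period 8, while the borders ab and abaab give periods 6 and 3 respectively.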
Looking over the previous paragraph, we see that what has
essentially been done is to compute a period p = p(β) and a
corresponding exponent r = r(β) as functions of β. It is clear that p
is monotone decreasing and r monotone increasing in β. Therefore
with the choice β = β*, p achieves its minimum value p* and r its
maximum r*, the minimum period and the maximum exponent
respectively. Generally, the values p* and r* will be the ones we are
most interested in, and so, when there is no ambiguity, we will
simply refer to p* as the period and r* as the exponent. Similarly,
we refer to u = x[1..p*] as the generator.
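The monotone relationship between β and p can be seen by listing every border length next to the period it induces; the longest border then yields p*, r* and the generator. This Python sketch uses a naive quadratic scan for the borders, and all function names are assumptions made for the illustration.

```python
from fractions import Fraction

def border_lengths(x):
    """All border lengths of x in increasing order (naive quadratic scan)."""
    n = len(x)
    return [k for k in range(n) if x[:k] == x[n - k:]]

def periods(x):
    """All periods p = n - beta, listed in the same order as the borders."""
    return [len(x) - beta for beta in border_lengths(x)]

def period_exponent_generator(x):
    """Return (p*, r*, u) for a nonempty string x."""
    if not x:
        raise ValueError("defined here only for nonempty strings")
    p_star = len(x) - border_lengths(x)[-1]   # longest border, smallest period
    return p_star, Fraction(len(x), p_star), x[:p_star]

if __name__ == "__main__":
    g = "abaabaab"
    print(border_lengths(g), periods(g))      # periods shrink as borders grow
    print(period_exponent_generator(g))
```

Because the border lengths are produced in increasing order, the corresponding periods come out in decreasing order, and the last one is the minimum period.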
Definition 1.2.3 Let p* be the minimum period of x
= x[1..n], and let r* = n/p*, u = x[1..p*]. Then the decomposition

    x = u^(r*)                                                (1.5)

is called the normal form of x.

The normal form (1.5) leads to a useful and important
taxonomy, or classification system, of strings:

• if r* = 1, we say that x is primitive; otherwise, x is periodic;
• if r* ≥ 2, we say that x is strongly periodic;
• if 1 < r* < 2, x is said to be weakly periodic;
• if r* ≥ 2 is an integer, we say that x is a repetition (or,
  equivalently, that x is repetitive); in the special cases that
  r* = 2 or 3, x is called a square or a cube, respectively.

Thus we see that x must be either primitive or
periodic, and, if it is periodic, then one of strongly periodic or weakly
periodic; further, if x is repetitive, then it must also be strongly
periodic. Observe that r* ≥ 2 if and only if x has a border of length
β ≥ n/2.
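The taxonomy depends only on the maximum exponent r*, so it can be expressed as a small decision procedure. The Python sketch below computes r* from the longest border by a naive search and returns every label that applies; the function names and the choice to return a list of labels are assumptions made for the illustration.

```python
from fractions import Fraction

def max_exponent(x):
    """Return r* = n / p*, using a naive search for the longest border."""
    n = len(x)
    if n == 0:
        raise ValueError("the taxonomy is applied here to nonempty strings")
    # Longest k < n with prefix of length k equal to suffix of length k.
    beta = next((k for k in range(n - 1, 0, -1) if x[:k] == x[n - k:]), 0)
    return Fraction(n, n - beta)

def classify(x):
    """Return every label of the taxonomy that applies to x."""
    r = max_exponent(x)
    if r == 1:
        return ["primitive"]
    labels = ["periodic"]
    labels.append("strongly periodic" if r >= 2 else "weakly periodic")
    if r >= 2 and r.denominator == 1:     # integer exponent
        labels.append("repetition")
        if r == 2:
            labels.append("square")
        elif r == 3:
            labels.append("cube")
    return labels

if __name__ == "__main__":
    for s in ["aaabaabab", "abaababaabaab", "abaabaab", "abababab"]:
        print(s, max_exponent(s), classify(s))
```

A repetition is reported as strongly periodic as well, in line with the remark that every repetitive string is strongly periodic.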
Here are some examples of these definitions:

• x = aaabaabab is primitive (p* = n);
• f = abaababaabaab = (abaababa)(abaab) is weakly periodic with
  period p* = 8, exponent r* = 13/8 and generator abaababa;
• g = abaabaab = (aba)^2 ab is strongly periodic with period
  p* = 3, exponent r* = 8/3 and generator aba;
• x = (ab)^4 is repetitive with period p* = 2, exponent r* = 4 and
  generator ab;
• x = (abcabd)^2 is a square with period p* = 6 and generator abcabd.

We remark that the normal form (1.5) is actually a kind of "intrinsic"
pattern, in the sense of Part II of this book: every string x has the
pattern called a "normal form" that can be used to assign it its place
in a taxonomy of strings. We remark further that the simple ideas
introduced in this section (e.g. border, primitive, period) will recur
again and again in discussions of various string algorithms.

Exercises 1.2

1. Try to
reconcile the definitions of linear string given in Sections 1.1 and 1.2;
that is, to prove that a linear string according to one definition is
necessarily a linear string according to the other.
2. Suppose that an alphabet A
contains exactly α letters. Given some nonnegative integer n, how
many elements of A* have length n? How many have length at most
n?

3. It was remarked above that for A = {0,1}, A+ may be thought
of as including all the nonnegative integers. How many times is each
integer included?

4. Based on the definition of equality in strings, is
it true that ε = ε^2? Justify your answer.

5. Using the definitions of
equality and lexicographic order, prove that for arbitrary strings x
and y on an ordered alphabet, x = y if and only if x ≮ y and y ≮ x. In
particular, demonstrate that this result holds in the case x = y = ε,
and thus show that ε is the unique lexicographically least element of
A*.

6. Based on the usual ordering of the English lower-case letters,
arrange the following strings in increasing lexicographical order:
abbaε, abbba, abb, abc, a, ε^2, ab, aba, εb.

7. Prove that the
operator < as we have defined it satisfies transitivity; that is, that x
< y and y < z ⇒ x < z.

8. Give an independent definition of the
order relation >, then use it together with the definition of < given
in the text to show that x > y if and only if y < x.

9. Given a string x
of length n, find the length of the following substrings of x, and state
the conditions on i and k for which your answer is valid:
(a) x[i..i + k - 1]; (b) x[i - k + 1..i]; (c) x[i + 1..k - 1]; (d) εx[1..k].

10. What is the
maximum number of distinct substrings that there can possibly be in
a string x of length n? Give an example of a string that attains this
maximum. Then characterize the set of strings of length n that
attains it.

11. In the preceding exercise there is an unstated
assumption that the alphabet size should be regarded as
unbounded. Vaiyi Sandor suggests that the question becomes more
interesting (and much more difficult) if the size α of the alphabet is
finite and fixed. What progress can you make with this apparently
unsolved research problem? Hint: Perhaps a good place to start is
with the following question: for given positive integers α = |A| and
k, what is the longest string on alphabet A that contains no substring
of length k more than once?

12. Describe an algorithm that
computes all the distinct substrings in x. Establish your algorithm's
correctness and asymptotic complexity (try to achieve Θ(n^2)).
13. If y is a nonempty string of length m and k is a
positive integer, determine |x| as a function of m and k in each of
the following cases: (a) x = y^k; (b) x = yM; (c) x = yM\

14. What is
the length of the string x = {ab)nu[ab)n-ln ¦ ¦ • (abJa[abW?

15. Are
the ideas of prefix, suffix and border defined for the empty string ε?

16. Determine the longest border, the period, and the exponent of
each of the following strings, and hence classify each one as
primitive, strongly periodic, weakly periodic, or repetitive:
(a) abaababaabaababaababa; (b) abcabacabcbacbcacb;
(c) abcabdabcabdabcabd; (d)

17. A string x of length n is said to be a
palindrome if it reads backwards the same as forwards; more
precisely, if for every i = 1, 2, ..., ⌊n/2⌋, x[i] = x[n - i + 1]. Jamie
Simpson believes every border of a palindrome is itself a palindrome.
Prove him right or wrong. Remark: To inspect some nontrivial
palindromes, you could consult the following URLs:
www.growndodo.com/wordplay/palindrcmes/dogseesada.html
complex.gmu.edu/neural/personnel/ernie/witty/palindromes.html

18. Show that period and exponent are "well-defined"; in other words,
whenever x = u^k u' for some string u, some positive integer k, and a
proper prefix u' of u, that there exists a unique corresponding
border.

19. Show that if u^(r*) is the normal form of x, then u is not a
repetition. (This fact becomes important in Section 2.3 when we
consider the encoding of the repetitions in a string.)

20. Consider (ab)^(3/2) = aba. What is ((ab)^(3/2))^2? Is it (aba)^2?
Or is it (ab)^((3/2)·2) = (ab)^3?

21. Show that no string has two distinct primitive generators.
22. Can you find a way to compute the number of strings on {a, b}
that are of length n and primitive?
Hint: Observe that for odd n, an
arbitrary string x[1..n] can be formed from strings x[1..(n - 1)/2] and
x[(n + 1)/2..n], and for even n from strings x[1..n/2] and
x[n/2 + 1..n]. If on due reflection you still have trouble with this
exercise, consult [GO81a, GO81b].

23. The taxonomy of strings given above is
expressed in terms of values r*. Re-express it in terms of values of
the longest border β*.

24. Observe that what we have called the
"normal form" of x should really more properly be called the "left"
When the train started again he tried to sleep, but his brain was
too excited. He had not slept for three nights. Yet the feelings of
prostration that had come upon him just before Lizzie’s death had
passed away, giving place to one of intense vitality. Every fibre in his
body was alive. Sleep was scarcely necessary. Only a shooting pain
now and then in his head made him start and pass his hand
impatiently across his forehead. The train thundered on through the
darkness, and Goddard remained awake, possessed by the
passionate intensity of his fixed idea. He watched the day dawn,
bright and glorious. At Avignon the world was bathed in sunshine. It
was an omen of happiness. At Marseilles it was hot. All along that
beautiful coast Goddard’s heart glowed within him. The deep-
coloured sea, the flowers, the light, the joyousness of the south
filled his senses with the wonder of a new world. His silent
companions got out at Toulon, and three swarthy Gascons took their
place, and talked with rich deep voices and extravagant gestures
until they reached Camoules, their destination. Goddard missed their
whole-hearted laughter when they had gone.
The day wore on. Cannes at four o’clock. In a few moments he
would be in Nice. He drew once more the letter from his pocket,
rested his eyes on the few words a long, long time. “Whatsoever
your heart desireth— Rhodanthe.” He looked out at the deep blue
water meeting the violet sky. Rhodanthe! The name was strangely in
harmony with this exotic beauty. Before the night was over he would
call her by it. She would be his. Together they would conquer the
world.
He stepped on to the platform at Nice like a king coming to take
possession of a new realm. He looked around, as if he should see
Lady Phayre awaiting him, and then smiled at the fancy. The hotel
porter took his luggage to the Hôtel Terminus, the nearest. He was
feverishly anxious to set out on his quest of her without loss of time.
A quarter of an hour sufficed him to wash and make himself
presentable, and then he went out into the Avenue de la Gare. At
another time he would have loved to walk down the beautiful
boulevard, bright with shops and cafés and gaily coloured kiosques;
but now the supreme hour of his life had come, and the great
thoroughfare became blurred as in a dream. He hailed a cab, gave
the address “Hôtel des Anglais” to the driver, and sat bolt upright all
the way, in an agony of impatience. He had no eyes now for the sea
as he emerged on to the Promenade des Anglais; but he scanned
the long line of palace-hotels, wondering which was Lady Phayre’s.
The cab stopped by the public gardens. Goddard looked up. It was
the Hôtel des Anglais. He threw a piece of money to the cabman,
and entered.
The frock-coated, brass-buttoned porter approached him in polite
inquiry.
“I want to see Lady Phayre,” said Goddard.
“I am afraid, sir,” replied the man, “that Lady Phayre has gone
away this very morning.”
“Gone away?” asked Goddard, looking at him blankly. “Where to?”
“Ah, that I cannot say,” said the porter.
And then he added, with the benevolent smile of his class—
“Perhaps you have not heard, sir, that there is no longer such a
person as Lady Phayre.”
“What?” cried Goddard. “What do you mean?”
“Only that Madame was married this morning. It was to a
Monsieur Gleam. I believe he is a member of Parliament. He has
been staying in the hotel.”
Goddard stared at him with a ghastly face. He turned slowly and
went down the hotel steps. He staggered a few yards. Then the sea,
and the trees, and the great white palaces mingled together in a
whirling circle, and disappeared in the blackness of night. Something
in his brain seemed to snap, and he fell an inert mass on the
pavement.
For weeks he lay ill. He recovered to wish that he had died.
Despair overwhelmed him. His crime haunted him waking and
sleeping. In his bodily prostration he seemed to hear the mocking
laughter of the fiend that had prompted it. With the torture of
remorse was paradoxically mingled impotent anger at the cynicism
of fate. His soul sickened at the futility of things. He shrank with
shuddering dismay from the ordeal that lay before him. There were
times when death beckoned to him with tempting hands.
But men of Goddard’s stamp survive the shipwreck of their
happiness. They live on, and go about the world’s work doggedly,
stubbornly, blindly obeying the fighting instinct within them. The
great tragedies of the soul culminate not in death, but in dragging
years of life, when the grasshopper is a burden and desire fails. And
such is the end of Daniel Goddard’s tragedy. He lives to-day. His
name is a household word. He is the coming man, not of a party-
clique, but of a nation. He has sat upon the Treasury Bench. In the
next Liberal Administration he will hold Cabinet rank. He is envied,
courted, flattered. The wildest ambitions of his boyhood are in
course of certain fulfilment. But he has lost for ever the joy of
victory; the springs of happiness are for ever closed by the one
overwhelming defeat of his life.
He is on the best of terms with Aloysius Gleam, and attends his
wife’s dinner-parties. Between them the past has only once been
referred to, and that silently. It was the first time he found himself
alone with her, one evening after dinner, Gleam having been
summoned from the drawing-room. Their eyes met for an
embarrassing moment. Then Goddard drew the familiar letter from
his pocket-book, held it out for a few seconds so as to catch her eye,
and threw it into the fire. She watched it blaze, and gave two or
three little nods of acknowledgment. Then, being in a comfortable
chair, a bewitching costume, and a considerably relieved frame of
mind, she allowed the moisture to gather in her eyes. But neither
spoke until Gleam returned with a sprightly saying on his lips. He
threw himself into a chair.
“An old servant has just been to return me a sovereign she once
stole. It weighed on her conscience. I asked her about a certain
diamond pin. She looked haggard, and fled incontinently. Verily, all is
for the funniest in this funniest of all possible worlds.”
Rhodanthe broke into her silvery laugh. Goddard joined in grimly
and looked at her. For desire of her he had committed murder. He
was laughing and jesting with her husband and herself. Gleam was
right. It was the most humorous of worlds.
Then his mind went back to the terrible moment of his life, and his
heart gave a great heave, and his lips moved noiselessly.
“God, forgive me!”
THE END
*** END OF THE PROJECT GUTENBERG EBOOK THE DEMAGOGUE
AND LADY PHAYRE ***
Updated editions will replace the previous one—the old editions will
be renamed.
Creating the works from print editions not protected by U.S.
copyright law means that no one owns a United States copyright in
these works, so the Foundation (and you!) can copy and distribute it
in the United States without permission and without paying
copyright royalties. Special rules, set forth in the General Terms of
Use part of this license, apply to copying and distributing Project
Gutenberg™ electronic works to protect the PROJECT GUTENBERG™
concept and trademark. Project Gutenberg is a registered trademark,
and may not be used if you charge for an eBook, except by following
the terms of the trademark license, including paying royalties for use
of the Project Gutenberg trademark. If you do not charge anything
for copies of this eBook, complying with the trademark license is
very easy. You may use this eBook for nearly any purpose such as
creation of derivative works, reports, performances and research.
Project Gutenberg eBooks may be modified and printed and given
away—you may do practically ANYTHING in the United States with
eBooks not protected by U.S. copyright law. Redistribution is subject
to the trademark license, especially commercial redistribution.
START: FULL LICENSE
THE FULL PROJECT GUTENBERG LICENSE
PLEASE READ THIS BEFORE YOU DISTRIBUTE OR USE THIS WORK
To protect the Project Gutenberg™ mission of promoting the free
distribution of electronic works, by using or distributing this work (or
any other work associated in any way with the phrase “Project
Gutenberg”), you agree to comply with all the terms of the Full
Project Gutenberg™ License available with this file or online at
www.gutenberg.org/license.
Section 1. General Terms of Use and
Redistributing Project Gutenberg™
electronic works
1.A. By reading or using any part of this Project Gutenberg™
electronic work, you indicate that you have read, understand, agree
to and accept all the terms of this license and intellectual property
(trademark/copyright) agreement. If you do not agree to abide by all
the terms of this agreement, you must cease using and return or
destroy all copies of Project Gutenberg™ electronic works in your
possession. If you paid a fee for obtaining a copy of or access to a
Project Gutenberg™ electronic work and you do not agree to be
bound by the terms of this agreement, you may obtain a refund
from the person or entity to whom you paid the fee as set forth in
paragraph 1.E.8.
1.B. “Project Gutenberg” is a registered trademark. It may only be
used on or associated in any way with an electronic work by people
who agree to be bound by the terms of this agreement. There are a
few things that you can do with most Project Gutenberg™ electronic
works even without complying with the full terms of this agreement.
See paragraph 1.C below. There are a lot of things you can do with
Project Gutenberg™ electronic works if you follow the terms of this
agreement and help preserve free future access to Project
Gutenberg™ electronic works. See paragraph 1.E below.
1.C. The Project Gutenberg Literary Archive Foundation (“the
Foundation” or PGLAF), owns a compilation copyright in the
collection of Project Gutenberg™ electronic works. Nearly all the
individual works in the collection are in the public domain in the
United States. If an individual work is unprotected by copyright law
in the United States and you are located in the United States, we do
not claim a right to prevent you from copying, distributing,
performing, displaying or creating derivative works based on the
work as long as all references to Project Gutenberg are removed. Of
course, we hope that you will support the Project Gutenberg™
mission of promoting free access to electronic works by freely
sharing Project Gutenberg™ works in compliance with the terms of
this agreement for keeping the Project Gutenberg™ name associated
with the work. You can easily comply with the terms of this
agreement by keeping this work in the same format with its attached
full Project Gutenberg™ License when you share it without charge
with others.
1.D. The copyright laws of the place where you are located also
govern what you can do with this work. Copyright laws in most
countries are in a constant state of change. If you are outside the
United States, check the laws of your country in addition to the
terms of this agreement before downloading, copying, displaying,
performing, distributing or creating derivative works based on this
work or any other Project Gutenberg™ work. The Foundation makes
no representations concerning the copyright status of any work in
any country other than the United States.
1.E. Unless you have removed all references to Project Gutenberg:
1.E.1. The following sentence, with active links to, or other
immediate access to, the full Project Gutenberg™ License must
appear prominently whenever any copy of a Project Gutenberg™
work (any work on which the phrase “Project Gutenberg” appears,
or with which the phrase “Project Gutenberg” is associated) is
accessed, displayed, performed, viewed, copied or distributed:
This eBook is for the use of anyone anywhere in the
United States and most other parts of the world at no
cost and with almost no restrictions whatsoever. You
may copy it, give it away or re-use it under the terms
of the Project Gutenberg License included with this
eBook or online at www.gutenberg.org. If you are not
located in the United States, you will have to check the
laws of the country where you are located before using
this eBook.
1.E.2. If an individual Project Gutenberg™ electronic work is derived
from texts not protected by U.S. copyright law (does not contain a
notice indicating that it is posted with permission of the copyright
holder), the work can be copied and distributed to anyone in the
United States without paying any fees or charges. If you are
redistributing or providing access to a work with the phrase “Project
Gutenberg” associated with or appearing on the work, you must
comply either with the requirements of paragraphs 1.E.1 through
1.E.7 or obtain permission for the use of the work and the Project
Gutenberg™ trademark as set forth in paragraphs 1.E.8 or 1.E.9.
1.E.3. If an individual Project Gutenberg™ electronic work is posted
with the permission of the copyright holder, your use and distribution
must comply with both paragraphs 1.E.1 through 1.E.7 and any
additional terms imposed by the copyright holder. Additional terms
will be linked to the Project Gutenberg™ License for all works posted
with the permission of the copyright holder found at the beginning
of this work.
1.E.4. Do not unlink or detach or remove the full Project
Gutenberg™ License terms from this work, or any files containing a
part of this work or any other work associated with Project
Gutenberg™.
1.E.5. Do not copy, display, perform, distribute or redistribute this
electronic work, or any part of this electronic work, without
prominently displaying the sentence set forth in paragraph 1.E.1
with active links or immediate access to the full terms of the Project
Gutenberg™ License.
1.E.6. You may convert to and distribute this work in any binary,
compressed, marked up, nonproprietary or proprietary form,
including any word processing or hypertext form. However, if you
provide access to or distribute copies of a Project Gutenberg™ work
in a format other than “Plain Vanilla ASCII” or other format used in
the official version posted on the official Project Gutenberg™ website
(www.gutenberg.org), you must, at no additional cost, fee or
expense to the user, provide a copy, a means of exporting a copy, or
a means of obtaining a copy upon request, of the work in its original
“Plain Vanilla ASCII” or other form. Any alternate format must
include the full Project Gutenberg™ License as specified in
paragraph 1.E.1.
1.E.7. Do not charge a fee for access to, viewing, displaying,
performing, copying or distributing any Project Gutenberg™ works
unless you comply with paragraph 1.E.8 or 1.E.9.
1.E.8. You may charge a reasonable fee for copies of or providing
access to or distributing Project Gutenberg™ electronic works
provided that:
• You pay a royalty fee of 20% of the gross profits you derive
from the use of Project Gutenberg™ works calculated using the
method you already use to calculate your applicable taxes. The
fee is owed to the owner of the Project Gutenberg™ trademark,
but he has agreed to donate royalties under this paragraph to
the Project Gutenberg Literary Archive Foundation. Royalty
payments must be paid within 60 days following each date on
which you prepare (or are legally required to prepare) your
periodic tax returns. Royalty payments should be clearly marked
as such and sent to the Project Gutenberg Literary Archive
Foundation at the address specified in Section 4, “Information
about donations to the Project Gutenberg Literary Archive
Foundation.”
• You provide a full refund of any money paid by a user who
notifies you in writing (or by e-mail) within 30 days of receipt
that s/he does not agree to the terms of the full Project
Gutenberg™ License. You must require such a user to return or
destroy all copies of the works possessed in a physical medium
and discontinue all use of and all access to other copies of
Project Gutenberg™ works.
• You provide, in accordance with paragraph 1.F.3, a full refund of
any money paid for a work or a replacement copy, if a defect in
the electronic work is discovered and reported to you within 90
days of receipt of the work.
• You comply with all other terms of this agreement for free
distribution of Project Gutenberg™ works.
1.E.9. If you wish to charge a fee or distribute a Project Gutenberg™
electronic work or group of works on different terms than are set
forth in this agreement, you must obtain permission in writing from
the Project Gutenberg Literary Archive Foundation, the manager of
the Project Gutenberg™ trademark. Contact the Foundation as set
forth in Section 3 below.
1.F.
1.F.1. Project Gutenberg volunteers and employees expend
considerable effort to identify, do copyright research on, transcribe
and proofread works not protected by U.S. copyright law in creating
the Project Gutenberg™ collection. Despite these efforts, Project
Gutenberg™ electronic works, and the medium on which they may
be stored, may contain “Defects,” such as, but not limited to,
incomplete, inaccurate or corrupt data, transcription errors, a
copyright or other intellectual property infringement, a defective or
damaged disk or other medium, a computer virus, or computer
codes that damage or cannot be read by your equipment.
1.F.2. LIMITED WARRANTY, DISCLAIMER OF DAMAGES - Except for
the “Right of Replacement or Refund” described in paragraph 1.F.3,
the Project Gutenberg Literary Archive Foundation, the owner of the
Project Gutenberg™ trademark, and any other party distributing a
Project Gutenberg™ electronic work under this agreement, disclaim
all liability to you for damages, costs and expenses, including legal
fees. YOU AGREE THAT YOU HAVE NO REMEDIES FOR
NEGLIGENCE, STRICT LIABILITY, BREACH OF WARRANTY OR
BREACH OF CONTRACT EXCEPT THOSE PROVIDED IN PARAGRAPH
1.F.3. YOU AGREE THAT THE FOUNDATION, THE TRADEMARK
OWNER, AND ANY DISTRIBUTOR UNDER THIS AGREEMENT WILL
NOT BE LIABLE TO YOU FOR ACTUAL, DIRECT, INDIRECT,
CONSEQUENTIAL, PUNITIVE OR INCIDENTAL DAMAGES EVEN IF
YOU GIVE NOTICE OF THE POSSIBILITY OF SUCH DAMAGE.
1.F.3. LIMITED RIGHT OF REPLACEMENT OR REFUND - If you
discover a defect in this electronic work within 90 days of receiving
it, you can receive a refund of the money (if any) you paid for it by
sending a written explanation to the person you received the work
from. If you received the work on a physical medium, you must
return the medium with your written explanation. The person or
entity that provided you with the defective work may elect to provide
a replacement copy in lieu of a refund. If you received the work
electronically, the person or entity providing it to you may choose to
give you a second opportunity to receive the work electronically in
lieu of a refund. If the second copy is also defective, you may
demand a refund in writing without further opportunities to fix the
problem.
1.F.4. Except for the limited right of replacement or refund set forth
in paragraph 1.F.3, this work is provided to you ‘AS-IS’, WITH NO
OTHER WARRANTIES OF ANY KIND, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR ANY PURPOSE.
1.F.5. Some states do not allow disclaimers of certain implied
warranties or the exclusion or limitation of certain types of damages.
If any disclaimer or limitation set forth in this agreement violates the
law of the state applicable to this agreement, the agreement shall be
interpreted to make the maximum disclaimer or limitation permitted
by the applicable state law. The invalidity or unenforceability of any
provision of this agreement shall not void the remaining provisions.
1.F.6. INDEMNITY - You agree to indemnify and hold the Foundation,
the trademark owner, any agent or employee of the Foundation,
anyone providing copies of Project Gutenberg™ electronic works in
accordance with this agreement, and any volunteers associated with
the production, promotion and distribution of Project Gutenberg™
electronic works, harmless from all liability, costs and expenses,
including legal fees, that arise directly or indirectly from any of the
following which you do or cause to occur: (a) distribution of this or
any Project Gutenberg™ work, (b) alteration, modification, or
additions or deletions to any Project Gutenberg™ work, and (c) any
Defect you cause.
Section 2. Information about the Mission
of Project Gutenberg™
Project Gutenberg™ is synonymous with the free distribution of
electronic works in formats readable by the widest variety of
computers including obsolete, old, middle-aged and new computers.
It exists because of the efforts of hundreds of volunteers and
donations from people in all walks of life.
Volunteers and financial support to provide volunteers with the
assistance they need are critical to reaching Project Gutenberg™’s
goals and ensuring that the Project Gutenberg™ collection will
remain freely available for generations to come. In 2001, the Project
Gutenberg Literary Archive Foundation was created to provide a
secure and permanent future for Project Gutenberg™ and future
generations. To learn more about the Project Gutenberg Literary
Archive Foundation and how your efforts and donations can help,
see Sections 3 and 4 and the Foundation information page at
www.gutenberg.org.
Section 3. Information about the Project
Gutenberg Literary Archive Foundation
The Project Gutenberg Literary Archive Foundation is a non-profit
501(c)(3) educational corporation organized under the laws of the
state of Mississippi and granted tax exempt status by the Internal
Revenue Service. The Foundation’s EIN or federal tax identification
number is 64-6221541. Contributions to the Project Gutenberg
Literary Archive Foundation are tax deductible to the full extent
permitted by U.S. federal laws and your state’s laws.
The Foundation’s business office is located at 809 North 1500 West,
Salt Lake City, UT 84116, (801) 596-1887. Email contact links and
up-to-date contact information can be found at the Foundation’s website
and official page at www.gutenberg.org/contact.
Section 4. Information about Donations to
the Project Gutenberg Literary Archive
Foundation
Project Gutenberg™ depends upon and cannot survive without
widespread public support and donations to carry out its mission of
increasing the number of public domain and licensed works that can
be freely distributed in machine-readable form accessible by the
widest array of equipment including outdated equipment. Many
small donations ($1 to $5,000) are particularly important to
maintaining tax exempt status with the IRS.
The Foundation is committed to complying with the laws regulating
charities and charitable donations in all 50 states of the United
States. Compliance requirements are not uniform and it takes a
considerable effort, much paperwork and many fees to meet and
keep up with these requirements. We do not solicit donations in
locations where we have not received written confirmation of
compliance. To SEND DONATIONS or determine the status of
compliance for any particular state visit www.gutenberg.org/donate.
While we cannot and do not solicit contributions from states where
we have not met the solicitation requirements, we know of no
prohibition against accepting unsolicited donations from donors in
such states who approach us with offers to donate.
International donations are gratefully accepted, but we cannot make
any statements concerning tax treatment of donations received from
outside the United States. U.S. laws alone swamp our small staff.
Please check the Project Gutenberg web pages for current donation
methods and addresses. Donations are accepted in a number of
other ways including checks, online payments and credit card
donations. To donate, please visit: www.gutenberg.org/donate.
Section 5. General Information About
Project Gutenberg™ electronic works
Professor Michael S. Hart was the originator of the Project
Gutenberg™ concept of a library of electronic works that could be
freely shared with anyone. For forty years, he produced and
distributed Project Gutenberg™ eBooks with only a loose network of
volunteer support.
Project Gutenberg™ eBooks are often created from several printed
editions, all of which are confirmed as not protected by copyright in
the U.S. unless a copyright notice is included. Thus, we do not
necessarily keep eBooks in compliance with any particular paper
edition.
Most people start at our website, which has the main PG search
facility: www.gutenberg.org.
This website includes information about Project Gutenberg™,
including how to make donations to the Project Gutenberg Literary
Archive Foundation, how to help produce our new eBooks, and how
to subscribe to our email newsletter to hear about new eBooks.