Intelligent Systems Reference Library 117
Diego Oliva
Erik Cuevas
Advances and Applications of Optimised Algorithms in Image Processing
Intelligent Systems Reference Library
Volume 117
Series editors
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
e-mail: kacprzyk@ibspan.waw.pl
Lakhmi C. Jain, University of Canberra, Canberra, Australia;
Bournemouth University, UK;
KES International, UK
e-mails: jainlc2002@yahoo.co.uk; Lakhmi.Jain@canberra.edu.au
URL: http://www.kesinternational.org/organisation.php
About this Series
The aim of this series is to publish a Reference Library, including novel advances and developments in all aspects of Intelligent Systems, in an easily accessible and well-structured form. The series includes reference works, handbooks, compendia, textbooks, well-structured monographs, dictionaries, and encyclopedias. It contains well-integrated knowledge and current information in the field of Intelligent Systems. The series covers the theory, applications, and design methods of Intelligent Systems. Virtually all disciplines, such as engineering, computer science, avionics, business, e-commerce, environment, healthcare, physics, and the life sciences, are included.
Diego Oliva
Departamento de Electrónica, CUCEI
Universidad de Guadalajara
Guadalajara, Jalisco, Mexico

Erik Cuevas
Departamento de Electrónica, CUCEI
Universidad de Guadalajara
Guadalajara, Jalisco, Mexico
Foreword

This book brings together and explores possibilities for combining image processing and artificial intelligence, focusing on machine learning and optimization, two highly relevant fields in computer science. Many books treat these major topics separately, but few address them in conjunction, which gives this work a special interest. The problems addressed and described in the different chapters were selected to demonstrate the capabilities of optimization and machine learning for solving diverse issues in image processing. These problems were chosen for their relevance to the field, providing important cues on particular application domains. The topics include different methods for image segmentation, detection of geometrical shapes, and object recognition, whose applications in medical image processing, based on the modification of optimization algorithms with machine learning techniques, provide a new point of view. In short, the book aims to show that optimization and machine learning are attractive alternatives for image processing techniques, with advantages over other existing strategies. Complex tasks can be addressed under these approaches, providing new solutions or improving existing ones, thanks to the required foundation for solving problems in specific areas and applications.
Unlike other existing books in similar areas, this book introduces the reader to new trends in the use of optimization and machine learning techniques applied to image processing. Moreover, each chapter includes comparisons and updated references that support the results obtained by the proposed approaches, while also providing the reader a practical guide to the reference sources.
The book was designed for graduate and postgraduate education, where students can find support for reinforcing or consolidating their knowledge, and for researchers. Teachers can also find support for teaching in areas involving machine vision, or examples related to the main techniques addressed. Additionally, professionals who want to explore advances in the concepts and implementation of optimization and learning-based algorithms applied to image processing will find in this book an excellent guide.
The content of this book has been organized as an introduction to machine learning and optimization, after which each chapter addresses and solves selected problems in image processing. In this regard, Chaps. 1 and 2 provide introductions to machine learning and optimization, respectively, where the basic concepts relevant to image processing are addressed. Chapter 3 describes the electromagnetism-like optimization (EMO) algorithm, including the modifications required for it to work properly in image processing; its advantages and shortcomings are also explored. Chapter 4 addresses digital image segmentation as an optimization problem, explaining how segmentation is treated as an optimization problem using different objective functions. Template matching using a physics-inspired algorithm is addressed in Chap. 5, where template matching is again considered as an optimization problem, based on a modification of EMO that uses a memory to reduce the number of function calls. Chapter 6 addresses the detection of circular shapes in digital images, again treated as an optimization problem. A practical medical application is proposed in Chap. 7, where blood cell segmentation by circle detection is the problem to be solved; this chapter introduces a new objective function to measure the match between candidate solutions and the blood cells contained in the images. Finally, Chap. 8 proposes an improved EMO applying the concept of opposition-based electromagnetism-like optimization, analyzing a modification of EMO that uses machine learning to improve its performance. An important advantage of this structure is that each chapter can be read separately; although all chapters are interconnected, Chap. 3 serves as the basis for some of them.
This concise yet comprehensive treatment of the topics addressed makes this work an important reference in image processing, an area where a significant number of technologies are continuously emerging and often remain scattered across the literature. Congratulations to the authors for their diligence, oversight, and dedication in assembling the topics addressed in the book. The computer vision community will be grateful for this well-done work.
Preface

The use of cameras to obtain images or videos from the environment has expanded in recent years. These sensors are now present throughout our lives, from cell phones to industrial, surveillance, and medical applications. The tendency is toward automatic applications that can analyze the images obtained with the cameras. Such applications involve the use of image processing algorithms.
Image processing is a field in which the environment is analyzed using samples taken with a camera. The idea is to extract features that permit the identification of the objects contained in the image. To achieve this goal, it is necessary to apply different operators that allow a correct analysis of a scene. Most of these operations are computationally expensive. On the other hand, optimization approaches are extensively used in different areas of engineering. They are used to explore complex search spaces and obtain the most appropriate solutions according to an objective function. This book presents a study of the use of optimization algorithms in complex problems of image processing. The selected problems range from the theory of image segmentation to the detection of complex objects in medical images. The concepts of machine learning and optimization are analyzed to provide an overview of the application of these tools in image processing.
The aim of this book is to present a study of the use of new tendencies to solve image processing problems. When we started working on these topics almost ten years ago, the related information was sparse. Now we realize that the researchers were divided and closed in their fields. On the other hand, the use of cameras was not popular then. This book presents in a practical way the task of adapting the traditional methods of a specific field so that they can be solved using modern optimization algorithms. Moreover, in our study we noticed that optimization algorithms can also be modified and hybridized with machine learning techniques. Such modifications are also included in some chapters. The reader will see that our goal is to show that a natural link exists between image processing and optimization. To achieve this objective, the first three chapters introduce the concepts of machine learning, optimization, and the optimization technique used to solve the problems. The structure of the remaining chapters is to first present an introduction to the problem to be solved and then explain the basic ideas and concepts behind the implementations.
The book was planned considering that readers could be students, researchers expert in the field, or practitioners not completely involved with the topics.
This book has been structured so that each chapter can be read independently from the others. Chapter 1 describes machine learning (ML), concentrating on its elementary concepts. Chapter 2 explains the theory related to global optimization (GO). Readers who are familiar with these topics may wish to skip these chapters.
In Chap. 3 the electromagnetism-like optimization (EMO) algorithm is intro-
duced as a tool to solve complex optimization problems. The theory of physics
behind the EMO operators is explained. Moreover, their pros and cons are widely
analyzed, including some of the most significant modifications.
Chapter 4 presents three alternative methodologies for image segmentation
considering different objective functions. The EMO algorithm is used to find the
best thresholds that can segment the histogram of a digital image.
Chapter 5 introduces the template matching problem, which consists of detecting objects in an image using a template. Here the EMO algorithm optimizes an objective function. Moreover, improvements that reduce the number of evaluations and increase the convergence velocity are also explained.
Continuing with object detection, Chap. 6 shows how the EMO algorithm can be applied to detect circular shapes embedded in digital images. Meanwhile, in Chap. 7 a modified objective function is used to identify white blood cells in medical images using EMO.
Chapter 8 shows how a machine learning technique could improve the perfor-
mance of an optimization algorithm without affecting its main features such as
accuracy or convergence.
Writing this book was a very rewarding experience where many people were
involved. We acknowledge Dr. Gonzalo Pajares for always being available to help
us. We express our gratitude to Prof. Lakhmi Jain, who so warmly sustained this
project. Acknowledgements also go to Dr. Thomas Ditzinger, who so kindly agreed
to its appearance.
Finally, it is necessary to mention that this book is a small piece in the puzzle of image processing and optimization. We would like to encourage the reader to explore and expand this knowledge in order to create their own implementations according to their own necessities.
Chapter 1
An Introduction to Machine Learning

1.1 Introduction
We are already in the era of big data. The overall amount of data is steadily growing. There are about one trillion web pages; one hour of video is uploaded to YouTube every second, amounting to 10 years of content every day. Banks handle more than 1 M transactions per hour and maintain databases containing more than 2.5 petabytes ($2.5 \times 10^{15}$ bytes) of information; and so on [1].
In general, we define machine learning as a set of methods that can automatically
detect patterns in data, and then use the uncovered patterns to predict future data, or
to perform other kinds of decision making under uncertainty. Learning means that
novel knowledge is generated from observations and that this knowledge is used to
achieve defined objectives. Data itself is already knowledge. But for certain
applications and for human understanding, large data sets cannot directly be applied
in their raw form. Learning from data means that new condensed knowledge is
extracted from the large amount of information [2].
Some typical machine learning problems include, for example in bioinformatics,
the analysis of large genome data sets to detect illnesses and for the development of
drugs. In economics, the study of large data sets of market data can improve the
behavior of decision makers. Prediction and inference can help to improve planning
strategies for efficient market behavior. The analysis of share markets and stock
time series can be used to learn models that allow the prediction of future devel-
opments. There are thousands of further examples that require the development of
efficient data mining and machine learning techniques. Machine learning tasks vary in many ways, e.g., the type of learning task, the number of patterns, and their size [2].
Machine learning methods are usually divided into three main types: supervised, unsupervised, and reinforcement learning [3]. In the predictive or supervised learning approach, the goal is to learn a mapping from inputs x to outputs y, given a labeled set of input-output pairs $D = \{(x_i, y_i)\}_{i=1}^{N}$, $x_i = (x_i^1, \ldots, x_i^d)$. Here D is called the training data set, and N represents the number of training examples. In the simplest formulation, each training vector $x_i$ is a d-dimensional vector, where each dimension represents a feature or attribute of $x_i$. Similarly, $y_i$ symbolizes the category assigned to $x_i$; such categories form a set defined as $y_i \in \{1, \ldots, C\}$. When $y_i$ is categorical, the problem is known as classification, and when $y_i$ is real-valued, the problem is known as regression. Figure 1.1 shows a schematic representation of supervised learning.
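The supervised setting above can be sketched with a few lines of code. The following is a minimal illustration, not a method from this book: a labeled training set $D = \{(x_i, y_i)\}$ of d-dimensional feature vectors, and a simple learned mapping from inputs to labels (a nearest-centroid rule). All data values are hypothetical, chosen only for illustration.

```python
# Minimal sketch of supervised learning: a labeled training set D
# and a crude learned mapping from feature vectors x to labels y.
from math import dist  # Euclidean distance (Python 3.8+)

# Training data D: each x_i is a d = 2 feature vector, y_i ∈ {0, 1}.
D = [((1.0, 1.2), 0), ((0.8, 1.0), 0), ((3.0, 3.1), 1), ((3.2, 2.9), 1)]

def class_means(data):
    """'Learning' step: compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in data:
        counts[y] = counts.get(y, 0) + 1
        sums[y] = [s + v for s, v in zip(sums.get(y, [0.0] * len(x)), x)]
    return {y: tuple(s / counts[y] for s in sums[y]) for y in sums}

def predict(x, means):
    """Prediction step: assign x to the class whose mean is closest."""
    return min(means, key=lambda y: dist(x, means[y]))

means = class_means(D)
print(predict((0.9, 1.1), means))  # lies near the class-0 examples
print(predict((3.1, 3.0), means))  # lies near the class-1 examples
```

If the labels $y_i$ were real-valued instead of categorical, the same data layout would describe a regression problem.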
The second main method of machine learning is unsupervised learning. In unsupervised learning, it is only necessary to provide the data $D = \{x_i\}_{i=1}^{N}$. Therefore, the objective of an unsupervised algorithm is to automatically find patterns in the data which are not initially apparent. This process is sometimes called knowledge discovery. Under such conditions, it is a much less well-defined problem, since we are not told what kinds of patterns to look for, and there is no obvious error metric to use (unlike supervised learning, where we can compare our prediction of $y_i$ for a given $x_i$ to the observed value). Figure 1.2 illustrates the process of unsupervised learning: the data are automatically grouped into two categories according to their distances, as clustering algorithms do.
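The distance-based grouping just described can be sketched with a bare-bones k-means loop. This is only an illustrative example of the idea behind Fig. 1.2, not an algorithm from this book; the one-dimensional data values and the initial centers are invented for the demonstration.

```python
# Bare-bones k-means on unlabeled 1-D data: alternate between
# assigning points to the nearest center and moving each center
# to the mean of its assigned points.

def kmeans_1d(xs, centers, iters=10):
    """Run a fixed number of assignment/update iterations of k-means."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for x in xs:
            # assign x to the nearest current center
            i = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[i].append(x)
        # move each center to the mean of its cluster (keep it if empty)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

data = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8]
centers, clusters = kmeans_1d(data, centers=[0.0, 6.0])
print(sorted(round(c, 2) for c in centers))  # two cluster means emerge
```

Note that no labels were supplied: the two groups emerge purely from the distances between the points.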
Reinforcement learning is the third method of machine learning. It is less popular than the supervised and unsupervised methods. Under reinforcement learning, an agent learns to behave in an unknown scenario through reward and punishment signals provided by a critic. Unlike supervised learning, the reward and punishment signals carry less information, in most cases only failure or success. The final objective of the agent is to maximize the total reward obtained in a complete learning episode. Figure 1.3 illustrates the process of reinforcement learning.
[Figures 1.1-1.3 appear here: Fig. 1.1 sketches supervised learning, with the desired output $y_i$ compared against the actual output to produce an error signal; Fig. 1.3 sketches reinforcement learning, with an agent and a critic exchanging states, actions, values, and reward/punishment signals.]
1.3 Classification
The Nearest neighbor (NN) method is the most popular method used in machine
learning for classification. Its best characteristic is its simplicity. It is based on the
idea that the closest patterns to a target pattern x0 , for which we seek the label,
deliver useful information of its description. Based on this idea, NN assigns the
class label of the majority of the k-nearest patterns in data space. Figure 1.4 show
the classification process under the NN method, considering a 4-nearest approach.
Analyzing Fig. 1.4, it is clear that the novel pattern x0 will be classified as element
of the class A, since most of the nearest element are of the A category.
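The majority-vote rule described above can be written compactly. The following is a minimal sketch of k-NN classification; the training patterns and query point are hypothetical, mirroring the 4-nearest situation of Fig. 1.4.

```python
# k-nearest-neighbor rule: the label of a query point x0 is the
# majority label among its k closest training patterns.
from collections import Counter
from math import dist

def knn_classify(x0, data, k):
    """Majority vote among the k training patterns nearest to x0."""
    neighbors = sorted(data, key=lambda p: dist(x0, p[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Training patterns: (feature vector, class label)
train = [((1.0, 1.0), "A"), ((1.2, 0.9), "A"), ((0.8, 1.1), "A"),
         ((4.0, 4.2), "B"), ((4.1, 3.9), "B")]

print(knn_classify((1.1, 1.0), train, k=4))  # majority of 4 nearest are "A"
```

With k = 4, three of the four nearest patterns belong to class A, so the query point is assigned to A, exactly as in the figure.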
Fig. 1.5 Graphical representation of the learning process in parametric and non-parametric models
1.5 Overfitting
The objective of learning is to obtain better predictions as outputs, whether class labels or continuous regression values. To know how successfully the algorithm has learned, we compare the actual predictions with known target labels, which in fact is how training is done in supervised learning. If we want to generalize the performance of the learning algorithm to examples that were not seen during the training process, we obviously can't test by using the same data set used in the learning stage. Therefore, a different data set, a test set, is necessary to prove the generalization ability of the learning method. The test set is fed to the learned model, and the outputs it predicts are compared with the known targets; in this test, the parameters obtained in the learning process are not modified.
In fact, during the learning process, there is at least as much danger in
over-training as there is in under-training. The number of degrees of variability in
most machine learning algorithms is huge—for a neural network there are lots of
weights, and each of them can vary. This is undoubtedly more variation than there
is in the function we are learning, so we need to be careful: if we train for too long,
then we will overfit the data, which means that we have learnt about the noise and
inaccuracies in the data as well as the actual function. Therefore, the model that we
learn will be much too complicated, and won’t be able to generalize.
Figure 1.6 illustrates this problem by plotting the predictions of some algorithm
(as the curve) at two different points in the learning process. On the Fig. 1.6a the
curve fits the overall trend of the data well (it has generalized to the underlying
general function), but the training error would still not be that close to zero since it
passes near, but not through, the training data. As the network continues to learn, it
will eventually produce a much more complex model that has a lower training error
(close to zero), meaning that it has memorized the training examples, including any noise component, so that it has overfitted the training data (see Fig. 1.6b).
We want to stop the learning process before the algorithm overfits, which means
that we need to know how well it is generalizing at each iteration. We can’t use the
training data for this, because we wouldn’t detect overfitting, but we can’t use the
testing data either, because we’re saving that for the final tests. So we need a third
set of data to use for this purpose, which is called the validation set because we’re
using it to validate the learning so far. This is known as cross-validation in statistics.
It is part of model selection: choosing the right parameters for the model so that it
generalizes as well as possible.
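The three-way split motivated above (training, validation, and test sets) can be sketched as follows. The split proportions and the toy data are illustrative choices, not prescriptions from this book.

```python
# Partition a data set into train / validation / test subsets:
# training data to fit the model, a validation set to watch for
# overfitting during learning, and a test set held out for the
# final evaluation.
import random

def split_dataset(data, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle (reproducibly) and partition data into three subsets."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

data = list(range(100))
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))  # 60 20 20
```

Fixing the random seed makes the split reproducible, which matters when comparing models: each candidate should see exactly the same training and validation examples.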
The NN classifier is simple and can work quite well when it is given a representative distance metric and enough training data. In fact, it can be shown that the NN classifier can come within a factor of 2 of the best possible performance as $N \to \infty$. However, the main problem with NN classifiers is that they do not work well with high-dimensional data x. The poor performance in high-dimensional settings is due to the curse of dimensionality.
To explain the curse, we give a simple example. Consider applying a NN classifier to data where the inputs are uniformly distributed in the d-dimensional unit cube. Suppose we estimate the density of class labels around a test point $x_0$ by "growing" a hyper-cube around $x_0$ until it contains a desired fraction F of the data points. The expected edge length of this cube will be $e_d(F) = F^{1/d}$. If d = 10 and we want to base our estimate on 10% of the data, we have $e_{10}(0.1) \approx 0.8$, so we need to extend the cube 80% along each dimension around $x_0$. Even if we only use 1% of the data, we find $e_{10}(0.01) \approx 0.63$; see Fig. 1.7. Since the entire range of the data is only 1 along each dimension, we see that the method is no longer very local, despite the name "nearest neighbor". The trouble with looking at neighbors that are so far away is that they may not be good predictors of the behavior of the input-output function at a given point.
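The edge-length formula $e_d(F) = F^{1/d}$ is easy to check numerically; the short sketch below reproduces the two values quoted above.

```python
# Edge length of the hyper-cube that must be grown around a point to
# capture a fraction F of uniformly distributed data in d dimensions.

def edge_length(F, d):
    """Expected edge length e_d(F) = F**(1/d) of the covering cube."""
    return F ** (1.0 / d)

for F in (0.1, 0.01):
    print(f"d=10, F={F}: edge = {edge_length(F, 10):.2f}")
```

Even to capture just 1% of the data in 10 dimensions, the cube must span about 63% of each axis, which is why "nearest" neighbors in high dimensions are not really near.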
Fig. 1.7 Illustration of the curse of dimensionality. a We embed a small cube of side s inside a larger unit cube. b We plot the edge length of a cube needed to cover a given volume of the unit cube as a function of the number of dimensions

1.7 Bias-Variance Trade-Off

[Fig. 1.8 appears here: expected error, bias, and variance plotted against model flexibility.]
Figure 1.8 illustrates the bias-variance trade-off. On the x-axis the model com-
plexity increases from left to right. While a method with low flexibility has a low
variance, it usually suffers from high bias. The variance increases while the bias
decreases with increasing model flexibility. The effect changes in the middle of the
plot, where variance and bias cross. The expected error is minimal in the middle of
the plot, where bias and variance reach a similar level. For practical problems and
data sets, the bias-variance trade-off has to be considered when the decision for a
particular method is made.
Figure 1.9 shows the measurements of a feature x for two different classes, $C_1$ and $C_2$. Members of class $C_2$ tend to have larger values of feature x than members of class $C_1$, but there is some overlap between the classes. Under such conditions, the correct class is easy to predict at the extremes of the range of each class, but what to do in the middle, where the classes overlap, is unclear [3].
[Fig. 1.9 appears here: overlapping histograms p(x) of a feature x for two classes.]
1.8 Data into Probabilities
Assume that we are trying to classify the handwritten letters 'a' and 'b' based on their height (as shown in Fig. 1.10). Most people write their letter 'a' smaller than their 'b', but not everybody. However, in this example, another kind of information can be used to solve the classification problem. We know that in English texts the letter 'a' is much more common than the letter 'b'. If we see a letter that is either an 'a' or a 'b' in normal writing, then there is a 75% chance that it is an 'a'. We are using prior knowledge to estimate the probability that the letter is an 'a': in this example, $p(C_1) = 0.75$, $p(C_2) = 0.25$. If we weren't allowed to see the letter at all and just had to classify it, then by picking 'a' every time we'd be right 75% of the time.
In order to give a prediction, it is necessary to know the value x of the discriminant feature. It would be a mistake to use only the occurrence (a priori) probabilities $p(C_1)$ and $p(C_2)$. Normally, a classification problem is formulated through the definition of a data set which contains a set of values of x and the class of each exemplar. Under such conditions, it is easy to calculate the value of $p(C_1)$ (we just count how many times out of the total the class was $C_1$ and divide by the total number of examples), and also another useful measurement: the conditional probability of $C_1$ given that x has value X, written $p(C_1|X)$. The conditional probability tells us how likely it is that the class is $C_1$ given that the value of x is X. So in Fig. 1.9 the value of $p(C_1|X)$ will be much larger for small values of X than for large values. Clearly, this is exactly what we want to calculate in order to perform classification. The question is how to get to this conditional probability, since we can't read it directly from the histogram. The first thing that we need to do to get these values is to quantize the measurement x, which just means that we put it into one of a discrete set of values $\{X\}$, such as the bins in a histogram. This is exactly what is plotted in Fig. 1.9. Now, if we have lots of examples of the two classes, and the histogram bins that their measurements fall into, we can compute $p(C_i, X_j)$, the joint probability, which tells us how often a measurement of class $C_i$ fell into histogram bin $X_j$. We do this by looking in histogram bin $X_j$, counting the number of elements of $C_i$, and dividing by the total number of examples of any class.
We can also define $p(X_j|C_i)$, a different conditional probability, which tells us how often (in the training set) there is a measurement of $X_j$ given that the example is a member of class $C_i$. Again, we can get this information from the histogram; the joint probability can be written as

$p(C_i, X_j) = p(X_j|C_i)\, p(C_i)$  (1.2)

or equivalently:

$p(C_i, X_j) = p(C_i|X_j)\, p(X_j)$  (1.3)

Clearly, the right-hand sides of these two equations must be equal to each other, since they are both equal to $p(C_i, X_j)$, and so with one division we can write:

$p(C_i|X_j) = \dfrac{p(X_j|C_i)\, p(C_i)}{p(X_j)}$  (1.4)

This is Bayes' rule. If you don't already know it, learn it: it is the most important equation in machine learning. It relates the posterior probability $p(C_i|X_j)$ to the prior probability $p(C_i)$ and the class-conditional probability $p(X_j|C_i)$. The denominator (the term on the bottom of the fraction) acts to normalize everything, so that all the probabilities sum to 1. It might not be clear how to compute this term. However, if we notice that any observation $X_k$ has to belong to some class $C_i$, then we can marginalize over the classes to compute:

$p(X_k) = \sum_i p(X_k|C_i)\, p(C_i)$  (1.5)
The reason why Bayes' rule is so important is that it lets us obtain the posterior probability, which is what we actually want, by calculating things that are much easier to compute. We can estimate the prior probabilities by looking at how often each class appears in our training set, and we can get the class-conditional probabilities from the histogram of the feature values for the training set. We can use the posterior probability to assign each new observation to one of the classes by picking the class $C_i$ where:

$p(C_i|\mathbf{x}) > p(C_j|\mathbf{x}) \quad \forall j \neq i,$  (1.6)

where $\mathbf{x}$ is a vector of feature values instead of just one feature. This is known as the maximum a posteriori or MAP hypothesis, and it gives us a way to choose which class to output. The question is whether this is the right thing to do. There has been quite a lot of research in both the statistics and machine learning literatures into what is the right question to ask about our data to perform classification, but we are going to skate over it very lightly.
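The counting recipe described above can be sketched directly in code: estimate the prior and the class-conditional probabilities from (class, bin) counts, then combine them with Bayes' rule. The counts below are invented purely for illustration.

```python
# Estimate prior p(C_i) and class-conditional p(X_j | C_i) from counts,
# then apply Bayes' rule to obtain the posterior p(C_i | X_j).
from collections import Counter

# Hypothetical training data: (class, histogram bin) pairs.
samples = [("C1", 0)] * 30 + [("C1", 1)] * 10 + \
          [("C2", 1)] * 20 + [("C2", 2)] * 40

N = len(samples)
class_counts = Counter(c for c, _ in samples)
joint_counts = Counter(samples)

def prior(c):
    """p(C_i): fraction of all examples belonging to class c."""
    return class_counts[c] / N

def likelihood(x, c):
    """p(X_j | C_i): fraction of class-c examples falling in bin x."""
    return joint_counts[(c, x)] / class_counts[c]

def posterior(c, x):
    """p(C_i | X_j) via Bayes' rule; marginalize over classes for p(X_j)."""
    evidence = sum(likelihood(x, ci) * prior(ci) for ci in class_counts)
    return likelihood(x, c) * prior(c) / evidence

print(round(posterior("C1", 1), 3))  # bin 1 is shared by both classes
```

Note how the denominator is computed by marginalizing over the classes, exactly as in the normalization discussed above: the posteriors over all classes for a given bin sum to 1.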
The MAP question is: what is the most likely class given the training data? Suppose that there are three possible output classes, and for a particular input the posterior probabilities of the classes are $p(C_1|\mathbf{x}) = 0.35$, $p(C_2|\mathbf{x}) = 0.45$, $p(C_3|\mathbf{x}) = 0.2$. The MAP hypothesis therefore tells us that this input is in class $C_2$, because that is the class with the highest posterior probability. Now suppose that, based on the class that the data is in, we want to do something: if the class is $C_1$ or $C_3$ then we do action 1, and if the class is $C_2$ then we do action 2.
As an example, suppose that the inputs are the results of a blood test, the three classes are different possible diseases, and the output is whether or not to treat with a particular antibiotic. The MAP method has told us that the output is $C_2$, and so we will not treat the disease. But what is the probability that the input does not belong to class $C_2$, and so should have been treated with the antibiotic? It is $1 - p(C_2|\mathbf{x}) = 0.55$. So the MAP prediction seems to be wrong: we should treat with the antibiotic, because overall that outcome is more likely. This method, where we take into account the final outcomes of all of the classes, is called the Bayes' optimal classification. It minimizes the probability of misclassification rather than maximizing the posterior probability.
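The contrast between MAP and the outcome-aware decision can be made concrete with the numbers from the example above. This sketch simply compares the two rules; the class-to-action mapping mirrors the blood-test scenario.

```python
# MAP picks the single most probable class; the outcome-aware rule
# sums the posterior mass behind each *action* and picks the action
# with the larger total, as in the antibiotic example.

posterior = {"C1": 0.35, "C2": 0.45, "C3": 0.20}
action_for = {"C1": "treat", "C2": "no_treat", "C3": "treat"}

# MAP: choose the most probable class, then act on it.
map_class = max(posterior, key=posterior.get)
map_action = action_for[map_class]

# Outcome-aware: aggregate posterior probability per action.
action_prob = {}
for c, p in posterior.items():
    action_prob[action_for[c]] = action_prob.get(action_for[c], 0.0) + p
best_action = max(action_prob, key=action_prob.get)

print(map_class, map_action)  # MAP says C2, so do not treat
print(best_action)            # but treating carries more total probability
```

MAP selects $C_2$ (no treatment), while summing over outcomes gives treatment a total probability of 0.55, so the outcome-aware rule disagrees with MAP, exactly as the text argues.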
References
1. Kramer, O.: Machine Learning for Evolution Strategies. Springer, Switzerland (2016)
2. Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT Press, London (2011)
3. Marsland, S.: Machine Learning: An Algorithmic Perspective. CRC Press, Boca Raton (2015)
4. Smola, A., Vishwanathan, S.V.N.: Introduction to Machine Learning. Cambridge University Press, Cambridge (2008)
Chapter 2
Optimization
The vast majority of image processing and pattern recognition algorithms use some form of optimization, as they intend to find a solution which is "best" according to some criterion. From a general perspective, an optimization problem is a situation that requires deciding on a choice from a set of possible alternatives in order to reach a predefined/required benefit at minimal cost [1].
Consider a public transportation system of a city, for example. Here the system has to find the "best" route to a destination location. In order to rate alternative solutions and eventually find out which solution is "best," a suitable criterion has to be applied. A reasonable criterion could be the distance of the routes. We would then expect the optimization algorithm to select the route of shortest distance as a solution. Observe, however, that other criteria are possible, which might lead to different "optimal" solutions: e.g., the number of transfers, the ticket price, or the time it takes to travel the route, the last of which leads to the fastest route as a solution.
Mathematically speaking, optimization can be described as follows: given a function $f: S \to \mathbb{R}$, which is called the objective function, find the argument $\hat{x}$ which minimizes f:

$\hat{x} = \arg\min_{x \in S} f(x)$  (2.1)

S defines the so-called solution set, which is the set of all possible solutions for the optimization problem. Sometimes, the unknown(s) x are referred to as design variables. The function f describes the optimization criterion, i.e., it enables us to calculate a quantity which indicates the "quality" of a particular x.
In our example, S is composed of the subway trajectories, bus lines, etc., stored in the database of the system; x is the route the system has to find; and the optimization criterion f(x) (which measures the quality of a possible solution) could calculate the ticket price or the distance to the destination (or a combination of both), depending on our preferences.
Sometimes there also exist one or more additional constraints which the solution
x has to satisfy; in that case we talk about constrained optimization (as opposed to
unconstrained optimization if no such constraint exists). In summary, an opti-
mization problem has the following components:
• One or more design variables x for which a solution has to be found
• An objective function f(x) describing the optimization criterion
• A solution set S specifying the set of possible solutions x
• (optional) One or more constraints on x
In order to be of practical use, an optimization algorithm has to find a solution in
a reasonable amount of time and with reasonable accuracy. Apart from the performance
of the algorithm employed, this also depends on the problem at hand. If we
can hope for a numerical solution, we say that the problem is well-posed. To
assess whether an optimization problem is well-posed, the following conditions
must be fulfilled:
1. A solution exists.
2. There is only one solution to the problem, i.e., the solution is unique.
3. The relationship between the solution and the initial conditions is such that
small perturbations of the initial conditions result in only small variations of x*.
Once a task has been transformed into an objective-function minimization problem,
the next step is to choose an appropriate optimizer. Optimization algorithms can be
divided into two groups: derivative-based and derivative-free [2].
In general, f(x) may have a nonlinear form with respect to the adjustable parameter
x. Due to the complexity of f(·), classical methods often employ an iterative
algorithm to explore the input space effectively. In iterative descent methods, the
next point x_{k+1} is determined by a step down from the current point x_k in a
direction vector d:

x_{k+1} = x_k + \alpha d,   (2.2)

where α is a positive step size regulating to what extent to proceed in that direction.
When the direction d in Eq. (2.2) is determined on the basis of the gradient g of the
objective function f(·), such methods are known as gradient-based techniques.
The method of steepest descent is one of the oldest techniques for optimizing a
given function and represents the basis for many derivative-based
methods. Under such a method, Eq. (2.2) becomes the well-known gradient formula:

x_{k+1} = x_k - \alpha \, g(x_k),   (2.3)

where the search direction is the negative gradient of f evaluated at the current point.
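A minimal sketch of this update rule in Python, using a fixed step size α and a hand-coded gradient for a simple quadratic objective (both the objective and the parameter values are illustrative assumptions, not taken from the text):

```python
# Steepest descent: x_{k+1} = x_k - alpha * g(x_k), on f(x) = (x - 3)^2.

def f(x):
    return (x - 3.0) ** 2

def g(x):
    # Gradient of f: d/dx (x - 3)^2 = 2 * (x - 3)
    return 2.0 * (x - 3.0)

def steepest_descent(x0, alpha=0.1, iters=100):
    x = x0
    for _ in range(iters):
        x = x - alpha * g(x)  # step along the negative gradient
    return x

x_star = steepest_descent(x0=0.0)
print(round(x_star, 4))  # converges toward the minimizer x* = 3
```

For this quadratic, any step size 0 < α < 1 contracts the error geometrically; too large a step makes the iteration diverge, which is why α must be chosen (or adapted) with care.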
[Figure: two surface plots of the objective function f(x1, x2) over x1, x2 ∈ [−1, 1].]
\begin{aligned}
\text{minimize } \; & f(x_1, x_2) = \frac{x_1^2 + x_2^2}{4000} - \cos(x_1)\cos\!\left(\frac{x_2}{\sqrt{2}}\right) + 1 \\
\text{subject to } \; & -30 \le x_1 \le 30, \quad -30 \le x_2 \le 30
\end{aligned} \quad (2.6)
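The objective of Eq. (2.6), a Griewank-type function, can be coded directly. As a sketch, a naive random search over the box constraints is shown as a baseline; this minimizer is an illustration only and not a method proposed in the text:

```python
import math
import random

def f(x1, x2):
    """Objective of Eq. (2.6): a 2-D Griewank-type function, minimum 0 at the origin."""
    return (x1**2 + x2**2) / 4000.0 - math.cos(x1) * math.cos(x2 / math.sqrt(2)) + 1.0

def random_search(trials=20000, lo=-30.0, hi=30.0, seed=0):
    """Naive baseline: sample points uniformly inside the box constraints."""
    rng = random.Random(seed)
    best_x, best_val = None, float("inf")
    for _ in range(trials):
        x1, x2 = rng.uniform(lo, hi), rng.uniform(lo, hi)
        val = f(x1, x2)
        if val < best_val:
            best_x, best_val = (x1, x2), val
    return best_x, best_val

print(f(0.0, 0.0))  # value at the global minimum: 0.0
```

The cosine product creates many local minima across the feasible box, which is precisely why gradient-based methods struggle here and why such functions serve as benchmarks for global optimizers.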