Advanced
Engineering
Analysis
The Calculus of Variations
and Functional Analysis with
Applications in Mechanics

Leonid P. Lebedev
Department of Mathematics,
National University of Colombia, Colombia

Michael J. Cloud
Department of Electrical and Computer Engineering,
Lawrence Technological University, USA

Victor A. Eremeyev
Institute of Mechanics, Otto von Guericke University Magdeburg, Germany
South Scientific Center of RASci
and South Federal University, Rostov on Don, Russia

World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

ADVANCED ENGINEERING ANALYSIS


The Calculus of Variations and Functional Analysis with
Applications in Mechanics
Copyright © 2012 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,
electronic or mechanical, including photocopying, recording or any information storage and retrieval
system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to
photocopy is not required from the publisher.

Desk Editor: Tjan Kwang Wei

ISBN-13 978-981-4390-47-7
ISBN-10 981-4390-47-X

Printed in Singapore.

September 30, 2011 8:42 World Scientific Book - 9in x 6in aea

Preface

A little over half a century ago, it was said that even an ingenious per-
son could not be an engineer unless he had nearly perfect skills with the
logarithmic slide rule. The advent of the computer changed this situa-
tion crucially; at present, many young engineers have never heard of the
slide rule. The computer has profoundly changed the mathematical side
of the engineering profession. Symbolic manipulation programs can cal-
culate integrals and solve ordinary differential equations better and faster
than professional mathematicians can. Computers also provide solutions
to differential equations in numerical form. The easy availability of mod-
ern graphics packages means that many engineers prefer such approximate
solutions even when exact analytical solutions are available.
Because engineering courses must provide an understanding of the fun-
damentals, they continue to focus on simple equations and formulas that
are easy to explain and understand. Moreover, it is still true that stu-
dents must develop some analytical abilities. But the practicing engineer,
armed with a powerful computer and sophisticated canned programs, em-
ploys models of processes and objects that are mathematically well beyond
the traditional engineering background. The mathematical methods used
by engineers have become quite sophisticated. With insufficient base knowl-
edge to understand these methods, engineers may come to believe that the
computer is capable of solving any problem. Worse yet, they may decide
to accept nearly any formal result provided by a computer as long as it was
generated by a program of a known trademark.
But mathematical methods are restricted. Certain problems may ap-
pear to fall within the nominal solution capabilities of a computer program
and yet lie well beyond those capabilities. Nowadays, the properties of
sophisticated models and numerical methods are explained using terminology
from functional analysis and the modern theory of differential equations.
Without understanding terms such as “weak solution” and “Sobolev space”,
one cannot grasp a modern convergence proof or follow a rigorous discus-
sion of the restrictions placed on a mathematical model. Unfortunately, the
mathematical portion of the engineering curriculum remains preoccupied
with 19th century topics, even omitting the calculus of variations and other
classical subjects. It is, nevertheless, increasingly more important for the
engineer to understand the theoretical underpinning of his instrumentation
than to have an ability to calculate integrals or generate series solutions of
differential equations.
The present text offers rigorous insight and will enable an engineer to
communicate effectively with the mathematicians who develop models and
methods for machine computation. It should prove useful to those who
wish to employ modern mathematical methods with some depth of under-
standing.
The book constitutes a substantial revision and extension of the earlier
book The Calculus of Variations and Functional Analysis, written by the
first two authors. A new chapter (Chapter 2) provides applications of the
calculus of variations to nonstandard problems in mechanics. Numerous
exercises (most with extensive hints) have been added throughout.
The numbering system is as follows. All definitions, theorems, corol-
laries, lemmas, remarks, conventions, and examples are numbered consecu-
tively by chapter (thus Definition 1.7 is followed by Lemma 1.8). Equations
are numbered independently, again by chapter.
We would like to thank our World Scientific editor, Mr. Yeow-Hwa Quek.

Leonid P. Lebedev
Department of Mathematics, National University of Colombia, Colombia

Michael J. Cloud
Department of Electrical and Computer Engineering, Lawrence Technolog-
ical University, USA

Victor A. Eremeyev
Institute of Mechanics, Otto von Guericke University Magdeburg, Germany
South Scientific Center of RASci and South Federal University, Rostov on
Don, Russia

Contents

Preface

1. Basic Calculus of Variations

1.1 Introduction
1.2 Euler’s Equation for the Simplest Problem
1.3 Properties of Extremals of the Simplest Functional
1.4 Ritz’s Method
1.5 Natural Boundary Conditions
1.6 Extensions to More General Functionals
1.7 Functionals Depending on Functions in Many Variables
1.8 A Functional with Integrand Depending on Partial Derivatives of Higher Order
1.9 The First Variation
1.10 Isoperimetric Problems
1.11 General Form of the First Variation
1.12 Movable Ends of Extremals
1.13 Broken Extremals: Weierstrass–Erdmann Conditions and Related Problems
1.14 Sufficient Conditions for Minimum
1.15 Exercises

2. Applications of the Calculus of Variations in Mechanics

2.1 Elementary Problems for Elastic Structures
2.2 Some Extremal Principles of Mechanics
2.3 Conservation Laws
2.4 Conservation Laws and Noether’s Theorem
2.5 Functionals Depending on Higher Derivatives of y
2.6 Noether’s Theorem, General Case
2.7 Generalizations
2.8 Exercises

3. Elements of Optimal Control Theory

3.1 A Variational Problem as an Optimal Control Problem
3.2 General Problem of Optimal Control
3.3 Simplest Problem of Optimal Control
3.4 Fundamental Solution of a Linear Ordinary Differential Equation
3.5 The Simplest Problem, Continued
3.6 Pontryagin’s Maximum Principle for the Simplest Problem
3.7 Some Mathematical Preliminaries
3.8 General Terminal Control Problem
3.9 Pontryagin’s Maximum Principle for the Terminal Optimal Problem
3.10 Generalization of the Terminal Control Problem
3.11 Small Variations of Control Function for Terminal Control Problem
3.12 A Discrete Version of Small Variations of Control Function for Generalized Terminal Control Problem
3.13 Optimal Time Control Problems
3.14 Final Remarks on Control Problems
3.15 Exercises

4. Functional Analysis

4.1 A Normed Space as a Metric Space
4.2 Dimension of a Linear Space and Separability
4.3 Cauchy Sequences and Banach Spaces
4.4 The Completion Theorem
4.5 $L^p$ Spaces and the Lebesgue Integral
4.6 Sobolev Spaces
4.7 Compactness
4.8 Inner Product Spaces, Hilbert Spaces
4.9 Operators and Functionals
4.10 Contraction Mapping Principle
4.11 Some Approximation Theory
4.12 Orthogonal Decomposition of a Hilbert Space and the Riesz Representation Theorem
4.13 Basis, Gram–Schmidt Procedure, and Fourier Series in Hilbert Space
4.14 Weak Convergence
4.15 Adjoint and Self-Adjoint Operators
4.16 Compact Operators
4.17 Closed Operators
4.18 On the Sobolev Imbedding Theorem
4.19 Some Energy Spaces in Mechanics
4.20 Introduction to Spectral Concepts
4.21 The Fredholm Theory in Hilbert Spaces
4.22 Exercises

5. Applications of Functional Analysis in Mechanics

5.1 Some Mechanics Problems from the Standpoint of the Calculus of Variations; the Virtual Work Principle
5.2 Generalized Solution of the Equilibrium Problem for a Clamped Rod with Springs
5.3 Equilibrium Problem for a Clamped Membrane and its Generalized Solution
5.4 Equilibrium of a Free Membrane
5.5 Some Other Equilibrium Problems of Linear Mechanics
5.6 The Ritz and Bubnov–Galerkin Methods
5.7 The Hamilton–Ostrogradski Principle and Generalized Setup of Dynamical Problems in Classical Mechanics
5.8 Generalized Setup of Dynamic Problem for Membrane
5.9 Other Dynamic Problems of Linear Mechanics
5.10 The Fourier Method
5.11 An Eigenfrequency Boundary Value Problem Arising in Linear Mechanics
5.12 The Spectral Theorem
5.13 The Fourier Method, Continued
5.14 Equilibrium of a von Kármán Plate
5.15 A Unilateral Problem
5.16 Exercises

Appendix A Hints for Selected Exercises

Bibliography

Index

Chapter 1

Basic Calculus of Variations

1.1 Introduction

Optimization is a universal goal. Students would like to learn more, receive
better grades, and have more free time; professors (at least some of them)
would like to give better lectures, see students learn more, receive higher
pay, and have more free time. These are the optimization problems of real
life. In mathematics, optimization makes sense only when formulated in
terms of a function f (x) or other expression. One then seeks the mini-
mum value of the expression. (It suffices to discuss minimization because
maximizing f is equivalent to minimizing −f .)
This book treats the minimization of functionals. The notion of func-
tional generalizes that of function. Although the process of generalization
does yield results of greater generality, as a rule the results are not sharper
in particular cases. So to understand what can be expected from the calcu-
lus of variations, we should review the minimization of ordinary functions.
All quantities will be assumed sufficiently differentiable for the purpose at
hand. Let us recall some terminology for the one-variable case y = f (x).

Definition 1.1. The function $f(x)$ has a local minimum at a point $x_0$ if
there is a neighborhood $(x_0 - d, x_0 + d)$ in which $f(x) \ge f(x_0)$. We call $x_0$
the global minimum of $f(x)$ on $[a, b]$ if $f(x) \ge f(x_0)$ holds for all $x \in [a, b]$.

The necessary condition for a differentiable function $f(x)$ to have a local
minimum at $x_0$ is
$$f'(x_0) = 0. \tag{1.1}$$
A simple and convenient sufficient condition, at a point satisfying (1.1), is
$$f''(x_0) > 0. \tag{1.2}$$


Unfortunately, no available criterion for a local minimum is both sufficient
and necessary. So the approach is to solve (1.1) for possible points of local
minimum of f (x) and then test these using an available sufficient condition.
The global minimum on [a, b] can be attained at a point of local mini-
mum. But there are two points, a and b, where (1.1) may not hold (because
the corresponding neighborhoods are one-sided) but where the global min-
imum may still occur. Hence given a differentiable function f (x) on [a, b],
we first find all $x_k$ at which $f'(x_k) = 0$. We then calculate $f(a)$, $f(b)$, and
$f(x_k)$ at the $x_k$, and choose the global minimum. Although this method
can be arranged as an algorithm suitable for machine computation, it still
cannot be reduced to the solution of an equation or system of equations.
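
The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `bisect_root` and `global_minimum` are hypothetical, the zeros of $f'$ are located by scanning a uniform grid for sign changes (so a zero at which $f'$ does not change sign, or two zeros inside one grid cell, can be missed), and the test function $f(x) = x^3 - 3x$ is chosen only for the example.

```python
def bisect_root(g, lo, hi, tol=1e-12):
    """Locate a zero of g in [lo, hi], assuming g(lo) and g(hi) differ in sign."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_lo * g_mid <= 0.0:
            hi = mid                 # the sign change lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid    # the sign change lies in [mid, hi]
    return 0.5 * (lo + hi)


def global_minimum(f, df, a, b, n=1000):
    """Return (x_min, f(x_min)) for a differentiable f on [a, b].

    Candidates are the endpoints a, b and the zeros of df found on a grid.
    """
    candidates = [a, b]
    grid = [a + (b - a) * i / n for i in range(n + 1)]
    for lo, hi in zip(grid, grid[1:]):
        if df(lo) == 0.0:
            candidates.append(lo)                    # grid point is a zero of f'
        elif df(lo) * df(hi) < 0.0:
            candidates.append(bisect_root(df, lo, hi))
    x_min = min(candidates, key=f)
    return x_min, f(x_min)


# Illustrative function: f(x) = x^3 - 3x, with f'(x) = 3x^2 - 3 vanishing at x = -1, 1.
f = lambda x: x**3 - 3.0 * x
df = lambda x: 3.0 * x**2 - 3.0

x1, f1 = global_minimum(f, df, -3.0, 2.0)   # minimum attained at the endpoint a
x2, f2 = global_minimum(f, df, 0.0, 2.0)    # minimum attained at an interior zero of f'
```

On $[-3, 2]$ the smallest candidate value occurs at the endpoint $a = -3$, where (1.1) does not hold; on $[0, 2]$ it occurs at the interior point $x = 1$. The comparison of candidate values is what keeps the method from reducing to the solution of an equation.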
These tools are extended to multivariable functions and to more com-
plex objects called functionals. A simple example of a functional is an
integral whose integrand depends on an unknown function and its deriva-
tive. Since the extension of ordinary minimization methods to functionals
is not straightforward, we continue to examine some notions from calculus.
A continuously differentiable function $f(x)$ obeys Lagrange’s formula
$$f(x + h) - f(x) = f'(x + \theta h)\,h \qquad (0 \le \theta \le 1). \tag{1.3}$$
Continuity of $f'$ means that
$$f'(x + \theta h) - f'(x) = r_1(x, \theta, h) \to 0 \quad \text{as } h \to 0,$$
hence
$$f(x + h) = f(x) + f'(x)h + r_1(x, \theta, h)\,h$$
where $r_1(x, \theta, h) \to 0$ as $h \to 0$. The term $r_1(x, \theta, h)\,h$ is Lagrange’s form
of the remainder. There is also Peano’s form
$$f(x + h) = f(x) + f'(x)h + o(h), \tag{1.4}$$
which means that
$$\lim_{h \to 0} \frac{f(x + h) - f(x) - f'(x)h}{h} = 0.$$
The principal (linear in $h$) part of the increment of $f$ is the first
differential of $f$ at $x$. Writing $dx = h$ we have
$$df = f'(x)\,dx. \tag{1.5}$$
“Infinitely small” quantities are not implied by this notation; here $dx$ is a
finite increment of $x$ (taken sufficiently small when used for approximation).

The first differential is invariant under the change of variable $x = \varphi(s)$:
$$df = f'(x)\,dx = \frac{df(\varphi(s))}{ds}\,ds, \qquad \text{where } dx = \varphi'(s)\,ds.$$
Lagrange’s formula extends to functions having $m$ continuous derivatives
in some neighborhood of $x$. The extension for $x + h$ lying in the
neighborhood is Taylor’s formula:
$$f(x + h) = f(x) + f'(x)h + \frac{1}{2!} f''(x)h^2 + \cdots + \frac{1}{(m-1)!} f^{(m-1)}(x)h^{m-1} + \frac{1}{m!} f^{(m)}(x + \theta h)h^m \qquad (0 \le \theta \le 1). \tag{1.6}$$
Continuity of $f^{(m)}$ at $x$ yields
$$f^{(m)}(x + \theta h) - f^{(m)}(x) = r_m(x, \theta, h) \to 0 \quad \text{as } h \to 0,$$
hence Taylor’s formula becomes
$$f(x + h) = f(x) + f'(x)h + \frac{1}{2!} f''(x)h^2 + \cdots + \frac{1}{m!} f^{(m)}(x)h^m + \frac{1}{m!} r_m(x, \theta, h)h^m$$
with remainder in Lagrange form. The dependence of the remainder on the
parameters is suppressed in Peano’s form
$$f(x + h) = f(x) + f'(x)h + \frac{1}{2!} f''(x)h^2 + \cdots + \frac{1}{m!} f^{(m)}(x)h^m + o(h^m). \tag{1.7}$$
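
The meaning of the $o(h^m)$ term in Peano’s form can be checked numerically. The following sketch is an illustration only: the helper name `taylor_poly` is hypothetical, and the choices $f = \exp$, $x = 0$, $m = 3$ are assumptions made for the example. It computes the remainder divided by $h^m$ for decreasing $h$; Peano’s form says this ratio tends to zero.

```python
import math


def taylor_poly(derivs, h):
    """Evaluate sum_{k=0}^{m} derivs[k] * h**k / k!, where derivs[k] = f^(k)(x)."""
    return sum(d * h**k / math.factorial(k) for k, d in enumerate(derivs))


m = 3
x = 0.0
derivs = [math.exp(x)] * (m + 1)    # every derivative of exp equals exp(x)

ratios = []
for h in (1e-1, 1e-2, 1e-3):
    remainder = math.exp(x + h) - taylor_poly(derivs, h)
    ratios.append(abs(remainder) / h**m)   # should shrink as h -> 0
```

For much smaller $h$ the computed remainder is dominated by rounding error, so the experiment is meaningful only for moderate step sizes.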
The conditions of minimum (1.1)–(1.2) can be derived via Taylor’s formula
for a twice continuously differentiable function having
$$f(x + h) - f(x) = f'(x)h + \frac{1}{2} f''(x)h^2 + o(h^2). \tag{1.8}$$
Indeed $f(x + h) - f(x) \ge 0$ if $x$ is a local minimum. The right side has the
form $ah + bh^2 + o(h^2)$. If $a = f'(x) \ne 0$, for example when $a < 0$, then for
$|h| < h_0$ with sufficiently small $h_0$ the sign of $f(x + h) - f(x)$ is determined
by that of $ah$; hence for $0 < h < h_0$ we have $f(x + h) - f(x) < 0$, which
contradicts the assertion that $x$ minimizes $f$. The case $a > 0$ is similar,
resulting in the necessary condition (1.1). With $f'(x) = 0$, the increment formula gives
$$f(x + h) - f(x) = \frac{1}{2} f''(x)h^2 + o(h^2).$$
The term $f''(x)h^2$ determines the sign of the right side when $h$ is sufficiently
close to 0, hence when $f''(x) > 0$ we see that for sufficiently small $|h| \ne 0$
$$f(x + h) - f(x) > 0.$$
So (1.2) is sufficient for $x$ to be a minimum point of $f$.
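
As a small illustration of how conditions (1.1) and (1.2) are used together, the sketch below classifies a point from the values of the first and second derivatives. The function name `classify_critical_point`, the tolerance `tol`, and the two sample functions are assumptions made for the example; the "local maximum" branch is the same argument applied to $-f$.

```python
def classify_critical_point(d1, d2, x, tol=1e-9):
    """Apply conditions (1.1)-(1.2) at x, given callables d1 = f' and d2 = f''."""
    if abs(d1(x)) > tol:
        return "not critical"      # necessary condition (1.1) fails
    if d2(x) > tol:
        return "local minimum"     # (1.1) and (1.2) both hold
    if d2(x) < -tol:
        return "local maximum"     # (1.1)-(1.2) hold for -f
    return "inconclusive"          # f''(x) = 0: higher-order terms decide


# f(x) = x^3 - 3x: f'(x) = 3x^2 - 3, f''(x) = 6x
d1 = lambda x: 3.0 * x**2 - 3.0
d2 = lambda x: 6.0 * x
labels = [classify_critical_point(d1, d2, x) for x in (1.0, -1.0, 0.0)]

# f(x) = x^4 has a minimum at 0, but f''(0) = 0, so (1.2) cannot detect it.
quartic = classify_critical_point(lambda x: 4.0 * x**3, lambda x: 12.0 * x**2, 0.0)
```

The quartic case shows that (1.2) is sufficient but not necessary: the test returns "inconclusive" at a point that is in fact a minimum.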

A function in n variables
Consider the minimization of a function $y = f(\mathbf{x})$ with $\mathbf{x} = (x_1, \dots, x_n)$.
More cannot be expected from this theory than from the theory of functions
in a single variable.
Definition 1.2. A function $f(\mathbf{x})$ has a global minimum at the point $\mathbf{x}^*$ if
the inequality
$$f(\mathbf{x}^*) \le f(\mathbf{x}^* + \mathbf{h}) \tag{1.9}$$
holds for all nonzero $\mathbf{h} = (h_1, \dots, h_n) \in \mathbb{R}^n$. The point $\mathbf{x}^*$ is a local
minimum if there exists $\rho > 0$ such that (1.9) holds whenever
$\|\mathbf{h}\| = (h_1^2 + \cdots + h_n^2)^{1/2} < \rho$.
Let $\mathbf{x}^*$ be a minimum point of a continuously differentiable function
$f(\mathbf{x})$. Then $f(x_1, x_2^*, \dots, x_n^*)$ is a function in one variable $x_1$ and takes its
minimum at $x_1^*$. It follows that $\partial f/\partial x_1 = 0$ at $x_1 = x_1^*$. Similarly, the rest
of the partial derivatives of $f$ are zero at $\mathbf{x}^*$:
$$\left.\frac{\partial f}{\partial x_i}\right|_{\mathbf{x} = \mathbf{x}^*} = 0, \qquad i = 1, \dots, n. \tag{1.10}$$
This is a necessary condition of minimum for a continuously differentiable
function in $n$ variables at the point $\mathbf{x}^*$.
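To get sufficient conditions we must extend Taylor’s formula. Let $f(\mathbf{x})$
possess all continuous derivatives up to order $m \ge 2$ in a ball centered at
point $\mathbf{x}$, and suppose $\mathbf{x} + \mathbf{h}$ lies in this ball. Fixing these, we apply (1.7) to
$f(\mathbf{x} + t\mathbf{h})$ and get Taylor’s formula in the variable $t$:
$$f(\mathbf{x} + t\mathbf{h}) = f(\mathbf{x}) + \left.\frac{df(\mathbf{x} + t\mathbf{h})}{dt}\right|_{t=0} t + \frac{1}{2!} \left.\frac{d^2 f(\mathbf{x} + t\mathbf{h})}{dt^2}\right|_{t=0} t^2 + \cdots + \frac{1}{m!} \left.\frac{d^m f(\mathbf{x} + t\mathbf{h})}{dt^m}\right|_{t=0} t^m + o(t^m).$$
The remainder term is for the case when $t \to 0$. From this equality for
sufficiently small $t$, the general Taylor formula can be derived.
The minimization problem for $f(\mathbf{x})$ is studied using only the first- and
second-order terms of this formula:
$$f(\mathbf{x} + t\mathbf{h}) = f(\mathbf{x}) + \left.\frac{df(\mathbf{x} + t\mathbf{h})}{dt}\right|_{t=0} t + \frac{1}{2!} \left.\frac{d^2 f(\mathbf{x} + t\mathbf{h})}{dt^2}\right|_{t=0} t^2 + o(t^2). \tag{1.11}$$
We calculate $df(\mathbf{x} + t\mathbf{h})/dt$ as a derivative of a composite function:
$$\left.\frac{df(\mathbf{x} + t\mathbf{h})}{dt}\right|_{t=0} = \frac{\partial f(\mathbf{x})}{\partial x_1} h_1 + \frac{\partial f(\mathbf{x})}{\partial x_2} h_2 + \cdots + \frac{\partial f(\mathbf{x})}{\partial x_n} h_n.$$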

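The first differential is defined as
$$df = \frac{\partial f(\mathbf{x})}{\partial x_1}\,dx_1 + \frac{\partial f(\mathbf{x})}{\partial x_2}\,dx_2 + \cdots + \frac{\partial f(\mathbf{x})}{\partial x_n}\,dx_n. \tag{1.12}$$
The next term,
$$\left.\frac{d^2 f(\mathbf{x} + t\mathbf{h})}{dt^2}\right|_{t=0} = \sum_{i,j=1}^{n} \frac{\partial^2 f(\mathbf{x})}{\partial x_i \partial x_j} h_i h_j,$$
defines the second differential of $f$:
$$d^2 f = \sum_{i,j=1}^{n} \frac{\partial^2 f(\mathbf{x})}{\partial x_i \partial x_j}\,dx_i\,dx_j. \tag{1.13}$$
Taylor’s formula of the second order becomes
$$f(\mathbf{x} + \mathbf{h}) = f(\mathbf{x}) + \sum_{i=1}^{n} \frac{\partial f(\mathbf{x})}{\partial x_i} h_i + \frac{1}{2!} \sum_{i,j=1}^{n} \frac{\partial^2 f(\mathbf{x})}{\partial x_i \partial x_j} h_i h_j + o(\|\mathbf{h}\|^2). \tag{1.14}$$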

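The necessary condition for a minimum, $df = 0$, follows from (1.11) or
(1.10). By (1.11), the condition
$$\left.\frac{d^2 f(\mathbf{x} + t\mathbf{h})}{dt^2}\right|_{t=0} > 0 \quad \text{for any sufficiently small } \|\mathbf{h}\|,$$
together with $df = 0$, suffices for $\mathbf{x}$ to minimize $f$. The corresponding quadratic form in the
variables $h_i$ is
$$\frac{1}{2!} \sum_{i,j=1}^{n} \frac{\partial^2 f(\mathbf{x})}{\partial x_i \partial x_j} h_i h_j = \frac{1}{2} \begin{pmatrix} h_1 & \cdots & h_n \end{pmatrix} \begin{pmatrix} \dfrac{\partial^2 f(\mathbf{x})}{\partial x_1^2} & \cdots & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f(\mathbf{x})}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_n^2} \end{pmatrix} \begin{pmatrix} h_1 \\ \vdots \\ h_n \end{pmatrix}.$$
The $n \times n$ Hessian matrix is symmetric under our smoothness assumptions
on $f$. Positive definiteness of the quadratic form can be verified via
Sylvester’s criterion.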
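
Sylvester’s criterion states that a symmetric matrix is positive definite exactly when all of its leading principal minors are positive. A minimal sketch of that check follows; the helper names `determinant` and `is_positive_definite` are hypothetical, the matrix is assumed symmetric and stored as a list of rows, and the two sample Hessians belong to quadratic functions picked for the example.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]       # work on a copy
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0                   # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                   # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def is_positive_definite(hessian):
    """Sylvester's criterion: every leading principal minor is positive."""
    n = len(hessian)
    return all(
        determinant([row[:k] for row in hessian[:k]]) > 0.0
        for k in range(1, n + 1)
    )


# Hessian of f(x, y) = x^2 + x*y + y^2 (constant in x, y)
hessian_bowl = [[2.0, 1.0], [1.0, 2.0]]
# Hessian of f(x, y) = x^2 + 4*x*y + y^2, which has a saddle at the origin
hessian_saddle = [[2.0, 4.0], [4.0, 2.0]]

bowl_is_pd = is_positive_definite(hessian_bowl)
saddle_is_pd = is_positive_definite(hessian_saddle)
```

Both functions satisfy the necessary condition (1.10) at the origin, but only the first has a positive definite quadratic form there.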
The problem of global minimum for a function in many variables on a
closed domain Ω is more complicated than the corresponding problem for
a function in one variable. Indeed, the set of points satisfying (1.10) can
be infinite for a multivariable function. Trouble also arises concerning the
domain boundary ∂Ω: since it is no longer a finite set (unlike {a, b}) we must
also solve the problem of minimum on ∂Ω, and the structure of such a set
can be complicated. The algorithm for finding a point of global minimum
of a function f (x) cannot be described in several phrases; it depends on the
structure of both the function and the domain.
Issues connected with the boundary can be avoided by considering the
problem of global minimum of a function on an open domain. We will take
this approach when treating the calculus of variations. Although analogous
problems with closed domains arise in applications, the difficulties are so
great that no general results are applicable to many problems. One must
investigate each such problem separately.
Constraints of the form
$$g_i(\mathbf{x}) = 0, \qquad i = 1, \dots, m, \tag{1.15}$$
permit reduction of constrained minimization to an unconstrained problem
provided we can solve (1.15) and get
$$x_k = \psi_k(x_1, \dots, x_{n-m}), \qquad k = n - m + 1, \dots, n.$$
Substitution into $f(\mathbf{x})$ would yield an ordinary unconstrained minimization
problem for a function in $n - m$ variables
$$f(x_1, \dots, x_{n-m}, \dots, \psi_n(x_1, \dots, x_{n-m})).$$
The resulting system of equations is nonlinear in general. This situation can
be circumvented by the use of Lagrange multipliers. The method proceeds
with formation of the Lagrangian function
$$L(x_1, \dots, x_n, \lambda_1, \dots, \lambda_m) = f(\mathbf{x}) + \sum_{j=1}^{m} \lambda_j g_j(\mathbf{x}), \tag{1.16}$$
by which the constraints $g_j$ are adjoined to $f$. Then the $x_i$ and $\lambda_j$ are all
treated as independent, unconstrained variables. The resulting necessary
conditions form a system of $n + m$ equations in the $n + m$ unknowns $x_i$, $\lambda_j$:
$$\frac{\partial f(\mathbf{x})}{\partial x_i} + \sum_{j=1}^{m} \lambda_j \frac{\partial g_j(\mathbf{x})}{\partial x_i} = 0, \qquad i = 1, \dots, n,$$
$$g_j(\mathbf{x}) = 0, \qquad j = 1, \dots, m. \tag{1.17}$$
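
To see system (1.17) at work, consider an example that is not from the text: minimize $f = x_1^2 + x_2^2$ subject to $g = x_1 + x_2 - 1 = 0$. Here (1.17) reads $2x_1 + \lambda = 0$, $2x_2 + \lambda = 0$, $x_1 + x_2 - 1 = 0$, which happens to be linear. The sketch below solves it with a small Gaussian elimination routine; the name `solve_linear` is hypothetical and the routine assumes a nonsingular matrix. For a general nonlinear system an iterative solver would be needed instead.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (a nonsingular)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]     # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                       # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


# Unknowns ordered as (x1, x2, lambda); rows are the three equations of (1.17).
coefficients = [
    [2.0, 0.0, 1.0],   # dL/dx1 = 2*x1 + lambda = 0
    [0.0, 2.0, 1.0],   # dL/dx2 = 2*x2 + lambda = 0
    [1.0, 1.0, 0.0],   # g = x1 + x2 - 1 = 0
]
right_side = [0.0, 0.0, 1.0]

x1, x2, lam = solve_linear(coefficients, right_side)
```

The solution is the point of the line $x_1 + x_2 = 1$ closest to the origin, $x_1 = x_2 = 1/2$, with multiplier $\lambda = -1$.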

Functionals
The kind of dependence in which a real number corresponds to another
(or to a finite set) is not enough to describe many natural processes. Ar-
eas such as physics and biology spawn formulations not amenable to such
description. Consider the deformations of an airplane in flight. At some
point near an engine, the deformation is not merely a function of the force
produced by the engine — it also depends on the other engines, air resis-
tance, and passenger positions and movements (hence the admonition that
everyone remain seated during potentially dangerous parts of the flight).
In general, many real processes in a body are described by the dependence
of the displacement field (e.g., the field of strains, stresses, heat, voltage)
on other fields (e.g., loads, heat radiation) in the same body. Each field is
described by one or more functions, so the dependence is that of a func-
tion uniquely defined by a set of other functions acting as whole objects
(arguments). A dependence of this type, provided we specify the classes to
which all functions belong, is called an operator (or map, or sometimes just
a “function” again). Problems of finding such dependences are often formu-
lated as boundary or initial-boundary value problems for partial differential
equations. These and their analysis form the main content of any course
in a particular science. Since a full description of any process is complex,
we usually work with simplified models that retain only essential features.
However, even these can be quite challenging when we seek solutions.
Humans often try to optimize their actions through an intuitive — not
mathematical — approach to fuzzily-posed problems on minimization or
maximization. This is because our nature reflects the laws of nature in
total. In physics there are quantities, like energy and enthalpy, whose val-
ues in the state of equilibrium or real motion are minimal or maximal in
comparison with other “nearby admissible” states. Younger sciences like
mathematical biology attempt to follow suit: when possible they seek to
describe system behavior through the states of certain fields of parameters,
on which functions of energy type attain maxima or minima. The energy
of a system (e.g., body or set of interacting bodies) is characterized by a
number which depends on the fields of parameters inside the system. Thus
the dependence described by quantities of energy type is such that a numer-
ical value E is uniquely defined by the distribution of fields of parameters
characterizing the system. We call this sort of dependence a functional. Of
course, in mathematics we must also specify the classes to which the above
fields may belong. The notion of functional generalizes that of function so
that the minimization problem remains sensible. Hence we come to the
object of investigation of our main subject: the calculus of variations. In
actuality we shall consider a somewhat restricted class of functionals. (Op-
timization of general functionals belongs to mathematical programming, a
younger science that contains the calculus of variations, a subject some
300 years old, as a special case.) In the calculus of variations we minimize
functionals of integral type. A typical problem involves the total
energy functional for an elastic membrane under load $F = F(x, y)$:
$$E(u) = \frac{1}{2}\, a \iint_S \left[ \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 \right] dx\,dy - \iint_S F u \,dx\,dy.$$

Here $u = u(x, y)$ is the deflection of a point $(x, y)$ of the membrane, which
occupies a domain $S$ and has tension described by parameter $a$ (we can
put $a = 1$ without loss of generality). For a membrane with fixed edge, in
equilibrium $E(u)$ takes its minimal value relative to all other admissible (or
virtual) states. (An “admissible” function takes appointed boundary values
and is sufficiently smooth, in this case having first and second continuous
derivatives in $S$.) The equilibrium state is described by Poisson’s equation
$$\Delta u = -F. \tag{1.18}$$
Let us also supply the boundary condition
$$u\big|_{\partial S} = \phi. \tag{1.19}$$
The problem of minimizing E(u) over the set of smooth functions satisfying
(1.19) is equivalent to the boundary value problem (1.18)–(1.19). Analogous
situations arise in many other sciences. Eigenfrequency problems can also
be formulated within the calculus of variations.
Other interesting problems come from geometry. Consider the following
isoperimetric problem:
Of all possible smooth closed curves of unit length in the
plane, find the equation of that curve L which encloses the
greatest area.
With r = r(φ) the polar equation of a curve, we seek to have
\[
\int_0^{2\pi} \sqrt{r^2 + \left(\frac{dr}{d\phi}\right)^2}\, d\phi = 1,
\qquad
\frac{1}{2} \int_0^{2\pi} r^2 \, d\phi \to \max.
\]
Notice how we denoted the problem of maximization. Every high school
student knows the answer, but certainly not the method of solution.
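
As a numerical sanity check of this formulation, the following sketch evaluates both integrals with a midpoint rule and compares the ratio area/length² for a circle and for a perturbed curve. The ratio does not change when a curve is rescaled, so after rescaling to unit length it equals the enclosed area. The helper names and the test curve r = 1 + 0.3 cos 3φ are illustrative assumptions, not part of the text.

```python
import math

def length_and_area(r, dr, m=20000):
    """Midpoint-rule values of the length and area integrals for r = r(phi)."""
    h = 2.0 * math.pi / m
    length = 0.0
    area = 0.0
    for j in range(m):
        phi = (j + 0.5) * h
        length += math.sqrt(r(phi) ** 2 + dr(phi) ** 2) * h
        area += 0.5 * r(phi) ** 2 * h
    return length, area

def isoperimetric_ratio(r, dr):
    """Area / length^2; invariant under rescaling of the curve."""
    length, area = length_and_area(r, dr)
    return area / length ** 2

# Circle r = 1, and a hypothetical perturbed curve r = 1 + 0.3 cos(3 phi).
circle = isoperimetric_ratio(lambda p: 1.0, lambda p: 0.0)
wavy = isoperimetric_ratio(lambda p: 1.0 + 0.3 * math.cos(3 * p),
                           lambda p: -0.9 * math.sin(3 * p))
print(circle, wavy, 1.0 / (4.0 * math.pi))
```

For the circle the ratio reproduces 1/(4π), the area enclosed by a circle of unit length; the perturbed curve gives a smaller value, as the isoperimetric property predicts.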
We cannot list all problems solvable by the calculus of variations. It is
safe to say only that the relevant functionals possess an integral form, and
that the integrands depend upon unknown functions and their derivatives.
Again, we can suppose that the theory for minimizing a functional
should represent an extension of the theory for minimizing a multivari-
able function. As in the latter theory, we must appoint a domain on which
the functional is determined. Even for a multivariable function, this is not
always an easy task. For the functional it is much harder, as the arguments
now belong to certain classes of functions, and the answer can depend on
the class as well as the detailed calculations we perform. The study of
function spaces falls under the heading of functional analysis, considered in
Chapter 4. General description of the domains of functionals can be under-
taken via normed spaces of functions. The classical calculus of variations
arose long before functional analysis, and dealt with the classes of continu-
ously differentiable (or n-times continuously differentiable) functions under
certain conditions on the boundary of the integration domain.
We expect the notions of local minimum and global minimum to appear
in the study of functionals. A definition of local minimum will require a
precise notion of a neighborhood of the minimizing function. In this case
functional analytic ideas are quite helpful. As we said, however, the calculus
of variations predated functional analysis. The notion of a neighborhood
of a function was developed in the calculus of variations and later inherited
by functional analysis.
The necessary conditions (1.10) can be suitably extended to the problem
of minimum for a functional. We will see this explicitly when we approx-
imate the functional with a function in n variables. But for the complete
treatment of a functional, the conditions should be given at any point along
the minimizing function. These conditions are known as Euler equations
or Euler–Lagrange equations. They are obtained when the minimizer lies
inside the domain of the functional (i.e., the minimizer should lie some
distance away from the boundary of the domain, and this will be assumed
even if not stated).
Finally, the Euler equation for a functional represents only a necessary
condition for a minimum. Sufficient conditions are more subtle and require
separate investigation. However, in certain physical problems (such as those
associated with linear models in continuum mechanics) where a point of
minimum total potential energy is sought, we obtain a unique extremum
that automatically turns out to be a minimum.
In the next section, we show how the problem of minimum for one
special functional is related to the problem of minimum for a multivariable
function.

Minimization of a simple functional using calculus


Consider a general functional of the form
\[ F(y) = \int_a^b f(x, y, y') \, dx, \tag{1.20} \]
where y = y(x) is smooth. (At this stage we do not stop to formulate
strict conditions on the functions involved; we simply assume they have
as many continuous derivatives as needed. Nor do we clearly specify the
neighborhood of a function for which it is a local minimizer of a functional.)
From the time of Newton’s Principia, mathematical physics has formu-
lated and considered each problem so that it has a solution which, at least
under certain conditions, is unique. Although the idea of determinism in
nature was buried by quantum mechanics, it remained an important part
of the older subject of the calculus of variations. We know that for the
equilibrium problem for a membrane to have a unique solution, we must
impose boundary conditions. So let us first understand whether the prob-
lem of minimum for (1.20) is well-posed; i.e., whether (at least for simple
particular cases) a solution exists and is unique.
The particular form
\[ \int_a^b \sqrt{1 + (y')^2} \, dx \]
yields the length of the plane curve y = y(x) from (a, y(a)) to (b, y(b)).
The obvious minimizer is a straight line y = kx + d. Without boundary
conditions (i.e., with y(a) or y(b) unspecified), k and d are arbitrary and
the solution is not unique. We can impose no more than two restrictions
on y(x) at the ends a and b, because y = kx + d has only two indefinite
constants. However, the problem without boundary conditions also makes
sense; its solution is the set of horizontal segments y = d starting at the
vertical line x = a and ending at x = b.
Problem setup is a tough yet important issue in mathematics. We shall
eventually face the question of how to pose the main problems of the cal-
culus of variations in a sensible fashion.
Let us consider the problem of minimum of (1.20) without additional
restrictions, and attempt to solve it using calculus. Discretization, in this
case the approximation of the integral by a Riemann sum, will reduce the
functional to a multivariable function. In the calculus of variations other
methods of investigation are customary; however, the current approach
is instructive because it leads to some central results of the calculus of
variations and shows that certain important ideas are extensions of ordinary
calculus.
We begin by subdividing [a, b] into n subintervals, each of length
\[ h = \frac{b - a}{n}. \]
Denote $x_i = a + ih$ and $y_i = y(x_i)$, so $y_0 = y(a)$ and $y_n = y(b)$. Take an
approximate value of $y'(x_i)$ as
\[ y'(x_i) \approx \frac{y_{i+1} - y_i}{h}. \]
Approximation of (1.20) by the Riemann sum
\[
\int_a^b f(x, y, y') \, dx \approx h \sum_{k=0}^{n-1} f(x_k, y_k, y'(x_k)) \tag{1.21}
\]
gives
\[
\int_a^b f(x, y, y') \, dx \approx h \sum_{k=0}^{n-1}
  f\!\left(x_k, y_k, \frac{y_{k+1} - y_k}{h}\right)
  = \Phi(y_0, \ldots, y_n). \tag{1.22}
\]

Since $\Phi(y_0, \ldots, y_n)$ is an ordinary function in n + 1 independent variables,
we set
\[
\frac{\partial \Phi(y_0, y_1, \ldots, y_n)}{\partial y_i} = 0,
\qquad i = 0, \ldots, n. \tag{1.23}
\]
Again, any function f encountered is assumed to possess all needed
derivatives. Henceforth we denote partial derivatives using
\[
f_y = \frac{\partial f}{\partial y}, \qquad
f_{y'} = \frac{\partial f}{\partial y'}, \qquad
f_x = \frac{\partial f}{\partial x}, \tag{1.24}
\]
and the total derivative using
\[
\frac{d f(x, y(x), y'(x))}{dx}
  = f_x(x, y(x), y'(x)) + f_y(x, y(x), y'(x))\, y'(x)
  + f_{y'}(x, y(x), y'(x))\, y''(x). \tag{1.25}
\]

Observe that in the notation $f_{y'}$ we regard $y'$ as the name of a simple
variable; we temporarily ignore its relation to y and even its status as a
function in its own right.

Consider the structure of (1.23). The variable $y_i$ appears in the sum
(1.22) only once when i = 0 or i = n, twice otherwise. In the latter case
(1.23) gives, using the chain rule and omitting the factor h,
\[
\frac{f_{y'}\!\left(x_{i-1}, y_{i-1}, \dfrac{y_i - y_{i-1}}{h}\right)}{h}
- \frac{f_{y'}\!\left(x_i, y_i, \dfrac{y_{i+1} - y_i}{h}\right)}{h}
+ f_y\!\left(x_i, y_i, \frac{y_{i+1} - y_i}{h}\right) = 0. \tag{1.26}
\]
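For i = 0 the result is
\[
f_y\!\left(x_0, y_0, \frac{y_1 - y_0}{h}\right)
- \frac{f_{y'}\!\left(x_0, y_0, \dfrac{y_1 - y_0}{h}\right)}{h} = 0
\]
or
\[
f_{y'}\!\left(x_0, y_0, \frac{y_1 - y_0}{h}\right)
- h\, f_y\!\left(x_0, y_0, \frac{y_1 - y_0}{h}\right) = 0. \tag{1.27}
\]
For i = n we obtain
\[
f_{y'}\!\left(x_{n-1}, y_{n-1}, \frac{y_n - y_{n-1}}{h}\right) = 0. \tag{1.28}
\]
In the limit as h → 0, (1.27) and (1.28) give, respectively,
\[
f_{y'}(x, y(x), y'(x))\big|_{x=a} = 0, \qquad
f_{y'}(x, y(x), y'(x))\big|_{x=b} = 0.
\]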

Finally, considering the first two terms in (1.26) for 0 < i < n,
\[
- \frac{ f_{y'}\!\left(x_i, y_i, \dfrac{y_{i+1} - y_i}{h}\right)
       - f_{y'}\!\left(x_{i-1}, y_{i-1}, \dfrac{y_i - y_{i-1}}{h}\right) }{h},
\]
we recognize an approximation for the total derivative $-df_{y'}/dx$ at $x_{i-1}$.
Hence (1.26), after h → 0 in such a way that $x_{i-1}$ remains a fixed value c,
reduces to
\[ f_y - \frac{d}{dx} f_{y'} = 0 \tag{1.29} \]
at x = c. A nonuniform partitioning will yield this equation similarly for
any x = c ∈ (a, b). In expanded form (1.29) is
\[
f_y - f_{y'x} - f_{y'y}\, y' - f_{y'y'}\, y'' = 0, \qquad x \in (a, b). \tag{1.30}
\]


The limit passage has given us this second-order ordinary differential
equation and two boundary conditions
\[ f_{y'}\big|_{x=a} = 0, \qquad f_{y'}\big|_{x=b} = 0. \tag{1.31} \]

Equations (1.29) and (1.31) play the same role for the functional (1.20) as
equations (1.10) play for a function in many variables. In the absence of
boundary conditions on y(x), we get necessarily two boundary conditions
for a function on which (1.20) attains a minimum.
Since the resulting equation is of second order, no more than two bound-
ary conditions can be imposed on its solution (see, however, Remark 1.20).
We could, say, fix the ends of the curve y = y(x) by putting

y(a) = c0 , y(b) = c1 . (1.32)

If we repeat the above process under this restriction we get (1.26) and cor-
respondingly (1.29), whereas (1.31) is replaced by (1.32). We can consider
the problem of minimum of this functional on the set of functions satisfying
(1.32). Then the necessary condition which a minimizer should satisfy is
the boundary value problem consisting of (1.29) and (1.32).
Conditions such as y(a) = 0 and $y'(a) = 0$ are normally posed for
a Cauchy problem involving a second-order differential equation. In the
present case, however, a repetition of the above steps implies the additional
restriction $f_{y'}|_{x=b} = 0$. A problem for (1.29) with three boundary
conditions is, in general, inconsistent.
We have obtained some possible ways to set up the problem of minimum
of the functional (1.20).
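
The discretization above can be tried out directly. The sketch below is a minimal illustration under stated assumptions: it takes the sample integrand f = y² + (y′)² − 2y on [0, 1] with fixed ends y(0) = 1, y(1) = 0, for which the interior equations (1.26) reduce to the tridiagonal system −y_{i−1} + (2 + h²) y_i − y_{i+1} = h². The function names are hypothetical. The result is compared with 1 − sinh x / sinh 1, the solution of the corresponding Euler equation y″ − y + 1 = 0 under the same boundary values.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    d = diag[:]
    r = rhs[:]
    for i in range(1, n):
        w = lower[i] / d[i - 1]
        d[i] -= w * upper[i - 1]
        r[i] -= w * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x

def discrete_minimizer(n, y_left=1.0, y_right=0.0):
    """Solve dPhi/dy_i = 0, i = 1..n-1, for f = y^2 + (y')^2 - 2y on [0, 1]."""
    h = 1.0 / n
    m = n - 1                      # number of interior unknowns
    lower = [-1.0] * m
    diag = [2.0 + h * h] * m
    upper = [-1.0] * m
    rhs = [h * h] * m
    rhs[0] += y_left               # known boundary values move to the right side
    rhs[-1] += y_right
    return [y_left] + solve_tridiagonal(lower, diag, upper, rhs) + [y_right]

n = 100
ys = discrete_minimizer(n)
max_error = max(abs(ys[i] - (1.0 - math.sinh(i / n) / math.sinh(1.0)))
                for i in range(n + 1))
print(max_error)
```

Because the end values are prescribed here, the equations for i = 0 and i = n are dropped, in line with the replacement of (1.31) by (1.32).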

Notation for various types of derivatives


It will be necessary to take derivatives of composite functions. When such
functions are integrated by parts, we encounter “total derivatives” that
must be distinguished from the usual partial derivatives. We denote total
derivatives in the same way as ordinary derivatives, using the differential
symbol d: therefore d/dx will denote a total derivative with respect to x.
We often denote partial derivatives by subscripts so that ∂(·)/∂x will be
denoted by $(\cdot)_x$ or sometimes $(\cdot)_1$. Let us consider two common cases.

1. Suppose
\[ f = f(x, y(x), y'(x)) \]
so that f depends on x through (1) an independent variable x, and (2) the
variables p = y(x) and $q = y'(x)$ that are each functions of x as well. We
will denote the partial derivative with respect to x as
\[
f_x = \frac{\partial}{\partial x} f(x, p, q) \Big|_{p = y(x),\; q = y'(x)}
\]
where, during differentiation, we regard p and q as independent variables.
Other partial derivatives are
\[
f_y = \frac{\partial}{\partial p} f(x, p, q) \Big|_{p = y(x),\; q = y'(x)},
\qquad
f_{y'} = \frac{\partial}{\partial q} f(x, p, q) \Big|_{p = y(x),\; q = y'(x)}.
\]

The total derivative with respect to x, denoted d/dx, arises when we
differentiate while considering y(x) and $y'(x)$ to be functions of x. The total
derivative of the partial derivative $f_{y'}$ is, by the chain rule,
\[
\frac{d}{dx} f_{y'} \equiv \frac{d}{dx} f_{y'}(x, y(x), y'(x))
  = f_{y'x} + f_{y'y}\, y' + f_{y'y'}\, y'',
\]
where, for example,
\[
f_{y'y} = \frac{\partial}{\partial p} \frac{\partial}{\partial q} f(x, p, q)
  \Big|_{p = y(x),\; q = y'(x)}.
\]

2. Consider the composite function
\[ f = f(x, y, u(x, y), u_x(x, y), u_y(x, y)) \]
depending on independent variables x, y and on a function u and its
derivatives, which depend on x, y as well. Now we denote
\[ p = u(x, y), \qquad q = u_x(x, y), \qquad r = u_y(x, y), \]
where $u_x$ and $u_y$ are partial derivatives with respect to x and y,
respectively. Introducing variables p, q, r, we get a function f = f(x, y, p, q, r)
in five independent variables. The following notations are used for partial
derivatives, each evaluated at $p = u(x, y)$, $q = u_x(x, y)$, $r = u_y(x, y)$:
\[
f_x = \frac{\partial}{\partial x} f(x, y, p, q, r), \qquad
f_y = \frac{\partial}{\partial y} f(x, y, p, q, r), \qquad
f_u = \frac{\partial}{\partial p} f(x, y, p, q, r),
\]
\[
f_{u_x} = \frac{\partial}{\partial q} f(x, y, p, q, r), \qquad
f_{u_y} = \frac{\partial}{\partial r} f(x, y, p, q, r).
\]
Finally, let us display the notation for the total derivative d/dx of $f_{u_x}$,
where f denotes f = f(x, y, p, q, r):
\[
\frac{d}{dx} f_{u_x} = \left( f_{qx} + f_{qp}\, u_x + f_{qq}\, u_{xx} + f_{qr}\, u_{yx} \right)
  \Big|_{p = u(x,y),\; q = u_x(x,y),\; r = u_y(x,y)},
\]
and a similar formula for the total derivative with respect to y:
\[
\frac{d}{dy} f_{u_x} = \left( f_{qy} + f_{qp}\, u_y + f_{qq}\, u_{xy} + f_{qr}\, u_{yy} \right)
  \Big|_{p = u(x,y),\; q = u_x(x,y),\; r = u_y(x,y)}.
\]
The formulas for higher derivatives are denoted similarly.
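
As a small worked check of these conventions in the one-variable case, consider the hypothetical integrand f(x, p, q) = x q² + p² (an illustrative choice, not taken from the text), so that f = x (y′)² + y² along a curve y = y(x). Treating p and q as independent variables gives
\[
f_x = (y')^2, \qquad f_y = 2y, \qquad f_{y'} = 2x\,y', \qquad
f_{y'x} = 2y', \qquad f_{y'y} = 0, \qquad f_{y'y'} = 2x,
\]
and the chain rule yields
\[
\frac{d}{dx} f_{y'} = f_{y'x} + f_{y'y}\, y' + f_{y'y'}\, y'' = 2y' + 2x\,y'',
\]
which agrees with differentiating the product 2x y′(x) directly. The partial derivative $f_{y'x} = 2y'$ alone would miss the term 2x y″; this is the distinction between the partial and total derivatives.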

Brief summary of important terms


A functional is a correspondence assigning a real number to each function
in some class of functions. The calculus of variations is concerned with
variational problems: i.e., those in which we seek the extrema (maxima or
minima) of functionals.
An admissible function for a given variational problem is a function that
satisfies all the constraints of that problem.
A function is sufficiently smooth for a particular development if all re-
quired actions (e.g., differentiation, integration by parts) are possible and
yield results having the properties needed for that development.

1.2 Euler’s Equation for the Simplest Problem

We begin with the problem of local minimum of the functional
\[ F(y) = \int_a^b f(x, y, y') \, dx \tag{1.33} \]
on the set of functions y = y(x) that satisfy the boundary conditions
\[ y(a) = c_0, \qquad y(b) = c_1. \tag{1.34} \]


The existence of a solution can depend on the properties of this set. We
must compare the values of F (y) on all functions y satisfying (1.34). In
view of (1.29) it is reasonable to seek minimizers that have continuous first
and second derivatives on [a, b]. How should we specify a neighborhood of
a function y(x)? Since all admissible functions must satisfy (1.34), we can
consider the set of functions of the form y(x) + ϕ(x) where
\[ \varphi(a) = \varphi(b) = 0. \tag{1.35} \]

With the intention of using tools close to those of classical calculus,
we first introduce the idea of continuity of a functional with respect to an
argument which, in turn, is a function on [a, b]. A suitably modified version
of the classical definition of function continuity is as follows: given any small
ε > 0, there exists a δ-neighborhood of y(x) such that when y(x) + ϕ(x)
belongs to this neighborhood we have

|F (y + ϕ) − F (y)| < ε.

If the neighborhood of the zero function is specified by the inequality
\[
\max_{x \in [a,b]} |\varphi(x)| + \max_{x \in [a,b]} |\varphi'(x)| < \delta, \tag{1.36}
\]
the definition can become workable when $f(x, y, y')$ is continuous in the
three independent variables x, y, $y'$. This is not the only possible definition
of a neighborhood; later we shall discuss other possibilities. But one benefit
is that the left side of (1.36) contains the expression usually used to define
the norm on the set of all functions continuously differentiable on [a, b]:
\[
\|\varphi(x)\|_{C^{(1)}(a,b)} = \max_{x \in [a,b]} |\varphi(x)|
  + \max_{x \in [a,b]} |\varphi'(x)|. \tag{1.37}
\]

Definition 1.3. The space $C^{(1)}(a, b)$ is the normed space consisting of
the set of all functions ϕ(x) that are continuously differentiable on [a, b],
supplied with the norm (1.37). Its subspace of functions satisfying (1.35) is
denoted $C_0^{(1)}(a, b)$. The set of all functions having k continuous derivatives
on [a, b] is denoted $C^{(k)}(a, b)$.

In many books these spaces are denoted by $C^{(k)}([a, b])$ to emphasize
that [a, b] is closed. To keep our notation reasonable throughout the book,
we introduce

Convention 1.4. In cases where no ambiguity should arise, we typically
abbreviate the space designation subscript on a norm symbol.

For example, the notation $\|\cdot\|_{C^{(1)}(a,b)}$ (where the dot stands for the
argument of the norm operation) is shortened to $\|\cdot\|$ in the present section.
At times, only some aspect of the full label can be suppressed. For example,
we may use the notation $\|\cdot\|_{C^{(1)}}$ if only the domain [a, b] is understood. With
this convention in mind let us proceed to

Definition 1.5. A δ-neighborhood of y(x) of admissible functions is the set
of all functions of the form y(x) + ϕ(x) where ϕ(x) is such that
$\varphi(x) \in C_0^{(1)}(a, b)$ and $\|\varphi(x)\| < \delta$.

When no boundary conditions are imposed on y, then the definition of
δ-neighborhood does not require ϕ to vanish at the endpoints.

Definition 1.6. A function y(x) is a point of local minimum of F(y) on
the set satisfying (1.34) if there is a δ-neighborhood of y(x), i.e., a set of
functions z(x) such that $z(x) - y(x) \in C_0^{(1)}(a, b)$ and $\|z(x) - y(x)\| < \delta$, in
which F(z) − F(y) ≥ 0. If in a δ-neighborhood the relation F(z) − F(y) > 0
holds for all z(x) ≠ y(x), then y(x) is a point of strict local minimum.

We may speak of more than one type of local minimum. According to
Definition 1.6, a function y is a minimum if there is a δ such that
\[
F(y + \varphi) - F(y) \ge 0 \quad \text{whenever} \quad
\|\varphi\|_{C_0^{(1)}(a,b)} < \delta.
\]
Historically this type of minimum is called “weak” and we shall use only this
type and simply call it a minimum. Those who pioneered the calculus of
variations also considered “strong” local minima, defining these as values of
y for which there is a δ such that F(y + ϕ) ≥ F(y) whenever ϕ(a) = ϕ(b) = 0
and max |ϕ| < δ on [a, b]. Here the modified condition on ϕ permits “strong
variations” into consideration: i.e., functions ϕ for which $\varphi'$ may be large
even though ϕ itself is small. Note that when we “weaken” the condition
on ϕ by changing the norm from the norm of $C_0^{(1)}(a, b)$ to the norm of
$C_0(a, b)$ which contains only ϕ and not $\varphi'$, we simultaneously strengthen the
statement made regarding y when we assert the inequality F(y + ϕ) ≥ F(y).
Let us turn to a rigorous justification of (1.29). We restrict the class
of possible integrands f(x, y, z) of (1.33) to the set of functions that are
continuous in (x, y, z) when x ∈ [a, b] and $|y - y(x)| + |z - y'(x)| < \delta$. Suppose
the existence of a minimizer y(x) for F(y) (see, however, Remark 1.13).
Consider F(y + tϕ) for an arbitrary but fixed $\varphi(x) \in C_0^{(1)}(a, b)$.
It is a function in the single variable t, taking its minimum at t = 0. If it
is differentiable then
\[ \frac{dF(y + t\varphi)}{dt} \Big|_{t=0} = 0. \tag{1.38} \]

To justify differentiation under the integral sign, let $f(x, y, y')$ be
continuously differentiable in the variables y and $y'$. But, since (1.30) shows
that we shall need the existence of other derivatives of f as well, let us
assume $f(x, y, y')$ is twice continuously differentiable, in any combination
of its arguments, in the domain of interest. By the chain rule, (1.38) yields
\[
0 = \frac{d}{dt} \int_a^b f(x, y + t\varphi, y' + t\varphi') \, dx \,\Big|_{t=0}
  = \int_a^b \left[ f_y(x, y, y')\, \varphi + f_{y'}(x, y, y')\, \varphi' \right] dx. \tag{1.39}
\]

Definition 1.7. The right member of (1.39) is denoted δF (y, ϕ) and called
the first variation of the functional (1.33).
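
The identity between (1.38) and the right member of (1.39) can be checked numerically. The following is a minimal sketch under illustrative assumptions (none of these choices come from the text): the integrand f = y² + (y′)² − 2y on [0, 1], the trial function y(x) = 1 − x², and the variation ϕ(x) = sin πx, which satisfies (1.35). It compares a central-difference estimate of dF(y + tϕ)/dt at t = 0 with the first variation computed from f_y and f_{y′}.

```python
import math

def integrate(g, a, b, m=2000):
    """Composite midpoint rule on m subintervals."""
    h = (b - a) / m
    return h * sum(g(a + (j + 0.5) * h) for j in range(m))

# Assumed sample integrand and its partial derivatives.
def f(x, y, yp):
    return y * y + yp * yp - 2.0 * y

def f_y(x, y, yp):
    return 2.0 * y - 2.0

def f_yp(x, y, yp):
    return 2.0 * yp

# Assumed trial function y and variation phi, with their derivatives.
y = lambda x: 1.0 - x * x
yp = lambda x: -2.0 * x
phi = lambda x: math.sin(math.pi * x)
phip = lambda x: math.pi * math.cos(math.pi * x)

def F(t):
    """F(y + t*phi) regarded as a function of the single variable t."""
    return integrate(
        lambda x: f(x, y(x) + t * phi(x), yp(x) + t * phip(x)), 0.0, 1.0)

t = 1e-4
derivative_fd = (F(t) - F(-t)) / (2.0 * t)      # estimate of (1.38)
first_variation = integrate(                     # right member of (1.39)
    lambda x: f_y(x, y(x), yp(x)) * phi(x) + f_yp(x, y(x), yp(x)) * phip(x),
    0.0, 1.0)
print(derivative_fd, first_variation)
```

The trial function here is not an extremal, so the common value is nonzero; the point is only that the two expressions agree.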

Integration by parts in the second term on the right in (1.39) gives
\[
\int_a^b f_{y'}(x, y, y')\, \varphi' \, dx
  = - \int_a^b \varphi \, \frac{d}{dx} f_{y'}(x, y, y') \, dx
\]
where the boundary terms vanish by (1.35). It follows that
\[
\int_a^b \left[ f_y(x, y, y') - \frac{d}{dx} f_{y'}(x, y, y') \right] \varphi \, dx = 0. \tag{1.40}
\]

In the integrand we see the left side of (1.29). To deduce (1.29) from (1.40)
we need the fundamental lemma of the calculus of variations.

Lemma 1.8. Let g(x) be continuous on [a, b], and let
\[ \int_a^b g(x)\, \varphi(x) \, dx = 0 \tag{1.41} \]
hold for every function ϕ(x) that is differentiable on [a, b] and vanishes in
some neighborhoods of a and b. Then g(x) ≡ 0.

Proof. Suppose to the contrary that (1.41) holds while $g(x_0) \neq 0$ for
some $x_0 \in (a, b)$. Without loss of generality we may assume $g(x_0) > 0$. By
continuity we have g(x) > 0 in a neighborhood
$[x_0 - \varepsilon, x_0 + \varepsilon] \subset (a, b)$. It is
easy to construct a nonnegative bell-shaped function $\varphi_0(x)$ such that $\varphi_0(x)$
is differentiable, $\varphi_0(x_0) > 0$, and $\varphi_0(x) = 0$ outside
$(x_0 - \varepsilon, x_0 + \varepsilon)$:
\[
\varphi_0(x) =
\begin{cases}
\exp\left( \dfrac{\varepsilon^2}{(x - x_0)^2 - \varepsilon^2} \right), & |x - x_0| < \varepsilon, \\[1ex]
0, & |x - x_0| \ge \varepsilon.
\end{cases}
\]
See Fig. 1.1. The product $g(x)\varphi_0(x)$ is nonnegative everywhere and positive
near $x_0$. Hence $\int_a^b g(x)\varphi_0(x)\,dx > 0$, a contradiction. □

[Fig. 1.1 Bell-shaped function for the proof of Lemma 1.8; horizontal axis x
with the points x0 − ε, x0, x0 + ε marked.]

It is possible to further restrict the class of functions ϕ(x) in Lemma 1.8.

Lemma 1.9. Let g(x) be continuous on [a, b], and let (1.41) hold for any
function ϕ(x) that is infinitely differentiable on [a, b] and vanishes in some
neighborhoods of a and b. Then g(x) ≡ 0.

The proof is the same as that for Lemma 1.8: it is necessary to con-
struct the same bell-shaped function ϕ(x) that is infinitely differentiable.
This form of the fundamental lemma provides a basis for the theory of gen-
eralized functions or distributions. These are linear functionals on the sets
of infinitely differentiable functions, and arise as elements of the Sobolev
spaces to be discussed later.
Now we can formulate the main result of this section.

Theorem 1.10. Suppose $y = y(x) \in C^{(2)}(a, b)$ locally minimizes the
functional (1.33) on the subset of $C^{(1)}(a, b)$ consisting of those functions
satisfying (1.34). Then y(x) is a solution of the equation
\[ f_y - \frac{d}{dx} f_{y'} = 0. \tag{1.42} \]

Proof. Under the assumptions of this section (including that $f(x, y, y')$
is twice continuously differentiable in its arguments), the bracketed term in
(1.40) is continuous on [a, b]. Since (1.40) holds for any
$\varphi(x) \in C_0^{(1)}(a, b)$, Lemma 1.8 applies. □

Definition 1.11. Equation (1.42) is known as the Euler equation, and a
solution y = y(x) is called an extremal of (1.33). A functional is stationary
if its first variation vanishes.

Taken together, (1.42) and (1.34) constitute a boundary value problem
for the unknown y(x).

Example 1.12. Find a function ȳ = ȳ(x) that minimizes the functional
\[ F(y) = \int_0^1 \left[ y^2 + (y')^2 - 2y \right] dx \]
subject to the conditions y(0) = 1 and y(1) = 0.

Solution. Here $f(x, y, y') = y^2 + (y')^2 - 2y$, so we obtain
\[ f_y = 2y - 2, \qquad f_{y'} = 2y', \]
and the Euler equation is
\[ y'' - y + 1 = 0. \]
Subject to the given boundary conditions, the solution is
\[ \bar{y}(x) = 1 - \frac{e^x - e^{-x}}{e - e^{-1}}. \]
We stress that this is an extremal: only supplementary investigation can
determine whether it is an actual minimizer of F(y). Consider the difference
F(ȳ + ϕ) − F(ȳ) where ϕ(x) vanishes at x = 0, 1. It is easily shown that
\[
F(\bar{y} + \varphi) - F(\bar{y}) = \int_0^1 \left[ \varphi^2 + (\varphi')^2 \right] dx \ge 0,
\]
so the global minimum of F (y) really does occur at ȳ(x). Although such
direct verification is not always straightforward, a large class of important
problems in mechanics (e.g., problems of equilibrium for linearly elastic
structures under conservative loads) yield single extremals that minimize
their corresponding total energy functionals. This happens because of the
quadratic structure of the functional, as in the present example. 
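
The step described as easily shown can be written out as follows; this is a short sketch of the computation, using only the integrand of Example 1.12 and integration by parts. Expanding the integrand,
\[
F(\bar{y} + \varphi) - F(\bar{y})
  = \int_0^1 \left[ 2\bar{y}\varphi + 2\bar{y}'\varphi' - 2\varphi \right] dx
  + \int_0^1 \left[ \varphi^2 + (\varphi')^2 \right] dx.
\]
Since ϕ(0) = ϕ(1) = 0, integration by parts in the term with ϕ′ gives
\[
\int_0^1 \left[ 2\bar{y}\varphi + 2\bar{y}'\varphi' - 2\varphi \right] dx
  = -2 \int_0^1 \left( \bar{y}'' - \bar{y} + 1 \right) \varphi \, dx = 0,
\]
because ȳ satisfies the Euler equation y″ − y + 1 = 0. Only the nonnegative quadratic term remains.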

Certain forms of f lead to simplification of the Euler equation:

(1) If f does not depend explicitly on y, then $f_{y'}$ = constant.
(2) If f does not depend explicitly on x, then $f - f_{y'}\, y'$ = constant.
(3) If f depends explicitly on $y'$ only and $f_{y'y'} \neq 0$, then $y(x) = c_1 x + c_2$.
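
Case (2) is the least obvious of the three, so here is a brief sketch of why it holds, assuming y is twice continuously differentiable. By the chain rule,
\[
\frac{d}{dx} \left( f - f_{y'}\, y' \right)
  = f_x + f_y\, y' + f_{y'}\, y'' - y''\, f_{y'} - y' \frac{d}{dx} f_{y'}
  = f_x + y' \left( f_y - \frac{d}{dx} f_{y'} \right).
\]
If f does not depend explicitly on x then $f_x = 0$, and the bracket vanishes along an extremal by (1.42), so $f - f_{y'}\, y'$ is constant. For case (3), the arc length integrand $f = \sqrt{1 + (y')^2}$ is an instance: $f_{y'y'} = (1 + (y')^2)^{-3/2} \neq 0$, and the extremals are the straight lines $y = c_1 x + c_2$.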

Remark 1.13. In the derivation leading to (1.38) we assumed the existence
of a minimizer. This can lead to incorrect conclusions, and it is normally
necessary to prove the existence of an object having needed properties.
Perron’s paradox illustrates the trouble we may encounter by supposing the
existence of a nonexistent object. Suppose there exists a greatest positive
integer N. Since $N^2$ is also a positive integer we must have $N^2 \le N$, from
which it follows that N = 1. If we knew nothing about the integers we might
believe this result and attempt to base an entire theory on it. □

1.3 Properties of Extremals of the Simplest Functional

While attempting to seek a minimizer on a subset of $C^{(1)}(a, b)$, we imposed
the illogical restriction that it must belong to $C^{(2)}(a, b)$ (note that f does
not depend on $y''$). Let us consider how to circumvent this requirement.

Lemma 1.14. Let g(x) be a continuous function on [a, b] for which the
following equality holds for every $\varphi(x) \in C_0^{(1)}(a, b)$:
\[ \int_a^b g(x)\, \varphi'(x) \, dx = 0. \tag{1.43} \]
Then g(x) is constant.

Proof. For a constant c it is clear that $\int_a^b c\,\varphi'(x)\,dx = 0$ whenever
$\varphi(x) \in C_0^{(1)}(a, b)$. So g(x) can be an arbitrary constant. We show that
there are no other forms for g. From (1.43) it follows that
\[ \int_a^b [g(x) - c]\, \varphi'(x) \, dx = 0. \tag{1.44} \]
Take $c = c_0 = (b - a)^{-1} \int_a^b g(x)\,dx$. The function
$\varphi(x) = \int_a^x [g(s) - c_0]\,ds$
is continuously differentiable and satisfies ϕ(a) = ϕ(b) = 0. Hence we can
put it into (1.44) and obtain
\[ \int_a^b [g(x) - c_0]^2 \, dx = 0, \]
from which $g(x) \equiv c_0$. □


Lemma 1.14 provides a necessary condition for a relative minimum.

Theorem 1.15. Suppose $y = y(x) \in C^{(1)}(a, b)$ locally minimizes (1.33)
on the subset of functions in $C^{(1)}(a, b)$ satisfying (1.34). Then y(x) is a
solution of the following equation, where c is a constant:
\[
\int_a^x f_y(s, y(s), y'(s)) \, ds - f_{y'}(x, y(x), y'(x)) = c. \tag{1.45}
\]

Proof. Let us return to the equality (1.39),
\[
\int_a^b \left[ f_y(x, y, y')\, \varphi + f_{y'}(x, y, y')\, \varphi' \right] dx = 0,
\]
which is valid here as well. Integration by parts gives
\[
\int_a^b f_y(x, y(x), y'(x))\, \varphi(x) \, dx
  = - \int_a^b \left( \int_a^x f_y(s, y(s), y'(s)) \, ds \right) \varphi'(x) \, dx.
\]
The boundary terms were zero by (1.35). It follows that
\[
\int_a^b \left[ - \int_a^x f_y(s, y(s), y'(s)) \, ds + f_{y'}(x, y(x), y'(x)) \right]
  \varphi'(x) \, dx = 0.
\]
This holds for all $\varphi(x) \in C_0^{(1)}(a, b)$. So by Lemma 1.14 we have (1.45). □

The integro-differential equation (1.45) has been called the Euler equa-
tion in integrated form.

Corollary 1.16. If
\[ f_{y'y'}(x, y(x), y'(x)) \neq 0 \]
along a minimizer $y = y(x) \in C^{(1)}(a, b)$ of (1.33), then $y(x) \in C^{(2)}(a, b)$.

Proof. Rewrite (1.45) as
\[
f_{y'}(x, y(x), y'(x)) = \int_a^x f_y(s, y(s), y'(s)) \, ds - c.
\]
The function on the right is continuously differentiable for any
$y = y(x) \in C^{(1)}(a, b)$. Thus we can differentiate both sides of the last
identity with respect to x and obtain
\[ f_{y'x} + f_{y'y}\, y' + f_{y'y'}\, y'' = \text{a continuous function}. \]
Considering the term with $y''(x)$ on the left, we prove the claim. □
It follows that under the condition of the corollary equations (1.42) and
(1.45) are equivalent; however, this is not the case when $f_{y'y'}(x, y(x), y'(x))$
can be equal to zero on a minimizer y = y(x). Since $y''(x)$ does not appear
in (1.45), it can be considered as defining a generalized solution of (1.42).
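
A standard illustration of such a generalized solution, offered here as an outside example rather than one taken from the text, is the functional with integrand $f = y^2 (2x - y')^2$ on [−1, 1] and boundary values y(−1) = 0, y(1) = 1. Since f ≥ 0, the function
\[
y(x) =
\begin{cases}
0, & -1 \le x \le 0, \\
x^2, & 0 < x \le 1,
\end{cases}
\]
gives F(y) = 0 and is therefore a global minimizer. Its first derivative is continuous, but y″ jumps from 0 to 2 at x = 0. Along this curve
\[
f_y = 2y\,(2x - y')^2 = 0, \qquad
f_{y'} = -2y^2 (2x - y') = 0, \qquad
f_{y'y'} = 2y^2,
\]
so the integrated form (1.45) holds with c = 0, while $f_{y'y'}$ vanishes for x ≤ 0 and the corollary does not apply there.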
At times it becomes clear that we should change variables and consider a
problem in another coordinate frame. For example, if we consider geodesic
lines on a surface of revolution, then cylindrical coordinates may seem more
appropriate than Cartesian coordinates. For the problem of minimum of a
functional we have two objects: the functional itself, and the Euler equation
for this functional. Let y = y(x) satisfy the Euler equation in the original
frame. Let us change variables, for example from (x, y) to (u, v):
x = x(u, v), y = y(u, v). (1.46)
The forms of the functional and its Euler equation both change. Next we
change variables for the extremal y = y(x) and get a curve v = v(u) in the
new variables. Is v = v(u) an extremal for the transformed functional? It
is, provided the transformation does not degenerate in some neighborhood
of the curve y = y(x): that is, if the Jacobian
\[
J = \begin{vmatrix} x_u & x_v \\ y_u & y_v \end{vmatrix} \neq 0
\]
there. This property is called the invariance of the Euler equation. Roughly
speaking, we can change all the variables of the problem at any stage of
the solution and get the same solutions in the original coordinates. This
invariance is frequently used in practice. We shall not stop to consider the
issue of invariance for each type of functional we treat, but the results are
roughly the same.
We have derived a necessary condition for a function to be a point
of minimum or maximum of (1.33). Other functionals will be treated in
the sequel. An Euler equation is the starting point for any variational
investigation of a physical problem, and in practice its solution is often
approached numerically. Let us consider some methods relevant to (1.33).

1.4 Ritz’s Method

We now consider a numerical approach to minimizing the functional (1.33)
with boundary conditions (1.34). Corresponding techniques for other
problems will be presented later; we shall benefit from a consideration of this
simple problem, however, since the main ideas will be the same.
In § 1.1 we obtained the Euler equation for (1.33). The intermediate
equations (1.26) with boundary conditions (1.27)–(1.28), which for this
case must be replaced by the Dirichlet conditions
\[ y(a) = y_0 = d_0, \qquad y(b) = y_n = d_1, \]
present us with a finite difference variational method for solving the problem
(1.42), (1.34), belonging to a class of numerical methods based on repre-
senting the derivatives of y(x) in finite-difference form and the functional
as a finite sum. These methods differ in how the functions and integrals
are discretized. Despite widespread application of the finite element and
boundary element methods, the finite-difference variational methods remain
useful because of certain advantages they possess.
Other methods for minimizing a functional, and hence of solving certain
boundary value problems, fall under the heading of Ritz’s method. Included
are modifications of the finite element method. Ritz’s method was popular
before the advent of the computer, and remains so, because it can yield
accurate results for complex problems that are difficult to solve analytically.
The idea of Ritz’s method is to reduce the problem of minimizing (1.33)
on the space of all continuously differentiable functions satisfying (1.34)
to the problem of minimizing the same functional on a finite dimensional
subspace of functions that can approximate the solution. Formerly, the
necessity of doing manual calculations forced engineers to choose such sub-
spaces quite carefully, since it was important to get accurate results in as
few calculations as possible. The choice of subspace remains an important
issue because a bad choice can lead to computational instability.
In Ritz’s method we seek a solution to the problem of minimization of
the functional (1.33), with boundary conditions (1.34), in the form
\[ y_n(x) = \varphi_0(x) + \sum_{k=1}^{n} c_k \varphi_k(x). \tag{1.47} \]
Here $\varphi_0(x)$ satisfies (1.34); a common choice is the linear function
$\varphi_0(x) = \alpha x + \beta$ with
\[
\alpha = \frac{d_1 - d_0}{b - a}, \qquad \beta = \frac{b d_0 - a d_1}{b - a}. \tag{1.48}
\]
The remaining functions, called basis functions, satisfy the homogeneous
conditions
\[ \varphi_k(a) = \varphi_k(b) = 0, \qquad k = 1, \ldots, n. \]
The $c_k$ are constants.

Definition 1.17. The function $y_n^*(x)$ that minimizes (1.33) on the set of
all functions of the form (1.47) is called the nth Ritz approximation.

The Ritz approximations satisfy the boundary conditions (1.34)
automatically. The above mentioned subspace is the space of functions of the
form $\sum_{k=0}^{n} c_k \varphi_k(x)$. For a numerical solution it is necessary that the
$\varphi_k(x)$ be linearly independent, which means that
\[
\sum_{k=1}^{n} c_k \varphi_k(x) = 0 \quad \text{only if } c_k = 0 \text{ for } k = 1, \ldots, n.
\]
For manual calculation this was supplemented by the requirement that a
small value of n (say 1, 2, or 3 at most) would suffice. The requirement
could be met since the corresponding boundary value problems described
real objects, such as bent beams, whose shapes under load were understood.
Now, to provide a theoretical justification of the method, we require that
the system $\{\varphi_k(x)\}_{k=1}^{\infty}$ be complete. This means that given any
$y = g(x) \in C_0^{(1)}(a, b)$ and ε > 0 we can find a finite sum
$\sum_{k=1}^{n} c_k \varphi_k(x)$ such that
\[ \left\| g(x) - \sum_{k=1}^{n} c_k \varphi_k(x) \right\| < \varepsilon. \]
Here the norm is defined by (1.37). It is sometimes required that
$\{\varphi_k(x)\}_{k=1}^{\infty}$ be a basis of the corresponding space, but this is not needed
for either the justification of the method or its numerical realization.
We therefore arrive at the problem of minimizing the functional

\[
\int_a^b f(x, y_n, y_n')\,dx
\]

where $y_n(x)$ is given by (1.47). The unknowns are the $c_k$, so the functional
becomes a function in n real variables:
\[
\Phi(c_1, \dots, c_n) = \int_a^b f(x, y_n, y_n')\,dx.
\]

To minimize this we solve the system

\[
\frac{\partial \Phi(c_1, \dots, c_n)}{\partial c_k} = 0, \qquad k = 1, \dots, n. \tag{1.49}
\]

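Denoting $c_0 = 1$, we have
\[
\begin{aligned}
\frac{\partial \Phi(c_1, \dots, c_n)}{\partial c_k}
&= \frac{\partial}{\partial c_k} \int_a^b f(x, y_n, y_n')\,dx \\
&= \frac{\partial}{\partial c_k} \int_a^b
   f\Bigl(x, \sum_{i=0}^{n} c_i \varphi_i(x), \sum_{i=0}^{n} c_i \varphi_i'(x)\Bigr)\,dx \\
&= \int_a^b f_y\Bigl(x, \sum_{i=0}^{n} c_i \varphi_i(x),
   \sum_{i=0}^{n} c_i \varphi_i'(x)\Bigr)\,\varphi_k(x)\,dx \\
&\quad + \int_a^b f_{y'}\Bigl(x, \sum_{i=0}^{n} c_i \varphi_i(x),
   \sum_{i=0}^{n} c_i \varphi_i'(x)\Bigr)\,\varphi_k'(x)\,dx,
\end{aligned}
\]

hence (1.49) becomes

\[
\begin{aligned}
&\int_a^b f_y\Bigl(x, \sum_{i=0}^{n} c_i \varphi_i(x),
   \sum_{i=0}^{n} c_i \varphi_i'(x)\Bigr)\,\varphi_k(x)\,dx \\
&\quad + \int_a^b f_{y'}\Bigl(x, \sum_{i=0}^{n} c_i \varphi_i(x),
   \sum_{i=0}^{n} c_i \varphi_i'(x)\Bigr)\,\varphi_k'(x)\,dx = 0
\end{aligned}
\tag{1.50}
\]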

for $k = 1, \dots, n$. This is a system of n simultaneous equations in the n
variables $c_1, \dots, c_n$. It is linear only if f is a quadratic form in $c_k$; i.e., only
if the Euler equation is linear in y(x). For methods of solving simultaneous
equations, the reader is referred to books on numerical analysis.
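To see the linear case concretely, consider the integrand $f = y'^2 + q(x)\,y^2 - 2g(x)\,y$ with given continuous functions q and g. This particular integrand is an assumption chosen for illustration; it is of the same type as the functional treated in Example 1.18. Here $f_y = 2qy - 2g$ and $f_{y'} = 2y'$, so after division by 2 the system (1.50) with $y_n = \varphi_0 + \sum_i c_i \varphi_i$ becomes linear in the unknowns:

```latex
\[
\sum_{i=1}^{n} c_i \int_a^b \bigl( \varphi_i' \varphi_k' + q\,\varphi_i \varphi_k \bigr)\,dx
  = \int_a^b g\,\varphi_k\,dx
  - \int_a^b \bigl( \varphi_0' \varphi_k' + q\,\varphi_0 \varphi_k \bigr)\,dx,
\qquad k = 1, \dots, n.
\]
```

The matrix of this system is symmetric, and it depends only on the basis functions and on q, not on g or on the boundary data carried by $\varphi_0$.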
Note that (1.50) can be obtained in other ways. We could put y = yn
and ϕ = ϕk in (1.39), since while deriving (1.50) we used the same steps
we used in deriving (1.39). Alternatively, we could put $y_n$ into the left side
of the Euler equation,
\[
f_y(x, y_n, y_n') - \frac{d}{dx} f_{y'}(x, y_n, y_n'), \tag{1.51}
\]
and then require it to be “orthogonal” to each ϕk . That is, we could multi-
ply (1.51) by ϕk , integrate the result over [a, b], use integration by parts on
the term with the total derivative d/dx, and equate the result to zero. This
is opposite the way we derived (1.50). This method of approximating the
solution of the boundary value problem (1.42), (1.47) is Galerkin’s method.
In the Russian literature it is called the Bubnov–Galerkin method, because
in 1915 I.G. Bubnov, who was reviewing a paper by S.P. Timoshenko on
applications of Ritz’s method to the solution of a problem for a bending
beam, offered a brief remark on another method of obtaining the equations
of Ritz’s method. The journal in which Timoshenko’s paper appeared happened
to publish the comments of reviewers together with the papers (a
nice way to hold reviewers responsible for their comments). Hence Bubnov
became an originator of the method. Galerkin was Bubnov’s successor,
and his real achievement was the development of various forms and appli-
cations of the method. In particular, there is a modification wherein (1.51)
is multiplied not by ϕk , the functions from the representation of yn , but
by other functions ψ1 , . . . , ψn . This is sometimes a better way to minimize
the residual (1.51).
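The equivalence between the orthogonality requirement on (1.51) and the Ritz equations (1.50) can be spelled out in one line. This is a sketch that assumes f and $y_n$ are smooth enough for the integration by parts and uses $\varphi_k(a) = \varphi_k(b) = 0$; all derivatives of f are evaluated at $(x, y_n, y_n')$.

```latex
\[
\int_a^b \Bigl[ f_y - \frac{d}{dx} f_{y'} \Bigr] \varphi_k \,dx
 = \int_a^b \bigl( f_y\,\varphi_k + f_{y'}\,\varphi_k' \bigr)\,dx
 - \Bigl[ f_{y'}\,\varphi_k \Bigr]_{x=a}^{x=b}
 = \int_a^b \bigl( f_y\,\varphi_k + f_{y'}\,\varphi_k' \bigr)\,dx .
\]
```

Setting the left side to zero for $k = 1, \dots, n$ therefore reproduces (1.50). When the residual is multiplied by other functions $\psi_k$ instead, the same computation applies with $\psi_k$ in place of $\varphi_k$, provided the $\psi_k$ also vanish at the endpoints.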
Popular basis functions ϕk for one-dimensional problems include the
trigonometric polynomials and functions of the form (x − a)(x − b)Pk (x)
where the Pk (x) are polynomials. Here the factors (x − a) and (x − b)
enforce the required homogeneous boundary conditions at x = a and x = b.
When deriving the equations of the Ritz (or Bubnov–Galerkin) method,
we imposed no special conditions on $\{\varphi_k\}$ other than linear independence
and some smoothness: $\varphi_k(x) \in C_0^{(1)}(a, b)$. In general each of the
equations (1.50) contains all of the $c_k$. By the integral nature of (1.50), if we
select basis functions so that each $\varphi_k(x)$ is nonzero only on some small
part of [a, b], we get a system in which each equation involves only a
subset of $\{\varphi_i\}$. This is the background for the finite element method based
on Galerkin’s method: depending on the problem each equation involves
just a few of the $c_k$ (typically three to five). Moreover, the derivation of
Galerkin’s equations suggests that it is not necessary to have basis functions
with continuous derivatives; it suffices to take functions with piecewise
continuous derivatives of higher order (first order for the problem under
consideration) when it is possible to calculate the terms of (1.50).
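The effect of locally supported basis functions on the structure of (1.50) can be seen in a small experiment. The sketch below assumes piecewise linear "hat" functions on a uniform mesh and uses, as a model problem, the functional of Example 1.18 with ϕ0(x) = 10x; the function name `hat_function_ritz` and the choice of element-wise Gauss quadrature are illustrative assumptions, not part of the text.

```python
import numpy as np

def hat_function_ritz(n_interior, gauss_pts=4):
    """Assemble and solve Ritz's equations for piecewise linear 'hat' functions.

    Model problem (the functional of Example 1.18):
        Psi(y) = int_0^1 (y'^2 + (1 + 0.1 sin x) y^2 - 2 x y) dx,
        y(0) = 0, y(1) = 10, with phi_0(x) = 10 x.
    """
    n = n_interior
    nodes = np.linspace(0.0, 1.0, n + 2)
    h = nodes[1] - nodes[0]
    t, w = np.polynomial.legendre.leggauss(gauss_pts)
    A = np.zeros((n, n))
    rhs = np.zeros(n)
    for e in range(n + 1):                      # element [nodes[e], nodes[e+1]]
        xl, xr = nodes[e], nodes[e + 1]
        x = 0.5 * (xr - xl) * t + 0.5 * (xr + xl)
        wx = 0.5 * (xr - xl) * w
        q = 1.0 + 0.1 * np.sin(x)
        shape = [(xr - x) / h, (x - xl) / h]    # the two hats alive on this element
        slope = [-1.0 / h, 1.0 / h]
        index = [e - 1, e]                      # their positions among c_1..c_n
        for a in range(2):
            i = index[a]
            if not 0 <= i < n:
                continue                        # hat belongs to a boundary node
            rhs[i] += np.sum(wx * (x * shape[a] - 10.0 * slope[a]
                                   - q * 10.0 * x * shape[a]))
            for b in range(2):
                j = index[b]
                if 0 <= j < n:
                    A[i, j] += np.sum(wx * (slope[a] * slope[b]
                                            + q * shape[a] * shape[b]))
    c = np.linalg.solve(A, rhs)
    return nodes, A, c

nodes, A, c = hat_function_ritz(9)
print(np.count_nonzero(A, axis=1))   # number of c_k entering each equation
```

Each hat function overlaps only its two neighbors, so every row of the matrix has at most three nonzero entries. The hat functions have only piecewise continuous first derivatives, which is enough to evaluate every integral in (1.50) for this first-order problem.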
Ritz’s method can yield good results using low-order approximations. A
disadvantage is that the calculations at a given step are almost independent
from those of the previous step. The ck do not change continuously from
step to step; hence, although the next step gives a better approximation,
the coefficients can change substantially. Accumulation of errors imposes
limits on the number of basis functions in practical calculations.

Example 1.18. Consider the problem

\[
\Psi(y) = \int_0^1 \bigl\{ y'^2(x) + [1 + 0.1 \sin(x)]\,y^2(x) - 2x\,y(x) \bigr\}\,dx \to \min
\]

subject to y(0) = 0 and y(1) = 10. Find the Ritz approximations for
n = 1, 3, 5 using $\varphi_0(x) = 10x$ and the following basis sets:

(a) $\varphi_k(x) = (1 - x)x^k$, $k \ge 1$,

(b) $\varphi_k(x) = \sin k\pi x$, $k \ge 1$.

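Solution. Note that $\varphi_0(x)$ was chosen to satisfy the given boundary
conditions. We find the expansion coefficients $c_k$ by solving the system
\[
\frac{\partial}{\partial c_k} \Psi\Bigl( \varphi_0(x) + \sum_{i=1}^{n} c_i \varphi_i(x) \Bigr) = 0,
\qquad k = 1, \dots, n.
\]
For brevity let us denote
\[
\langle y, z \rangle = \int_0^1 \bigl\{ y'(x) z'(x) + [1 + 0.1 \sin(x)]\,y(x) z(x) \bigr\}\,dx
\]
so that
\[
\Psi(y) = \langle y, y \rangle - 2 \int_0^1 x\,y(x)\,dx.
\]
Using the symmetry of the form $\langle y, z \rangle$ we write out Ritz’s equations:
\[
\begin{aligned}
c_1 \langle \varphi_1, \varphi_1 \rangle + c_2 \langle \varphi_2, \varphi_1 \rangle + \cdots + c_n \langle \varphi_n, \varphi_1 \rangle
  &= -\langle \varphi_0, \varphi_1 \rangle + \int_0^1 x \varphi_1(x)\,dx, \\
c_1 \langle \varphi_1, \varphi_2 \rangle + c_2 \langle \varphi_2, \varphi_2 \rangle + \cdots + c_n \langle \varphi_n, \varphi_2 \rangle
  &= -\langle \varphi_0, \varphi_2 \rangle + \int_0^1 x \varphi_2(x)\,dx, \\
  &\;\;\vdots \\
c_1 \langle \varphi_1, \varphi_n \rangle + c_2 \langle \varphi_2, \varphi_n \rangle + \cdots + c_n \langle \varphi_n, \varphi_n \rangle
  &= -\langle \varphi_0, \varphi_n \rangle + \int_0^1 x \varphi_n(x)\,dx.
\end{aligned}
\tag{1.52}
\]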
For small n this system can be solved by hand; otherwise computer solution
is required. In the present case we find that for the first basis set the Ritz
approximations are
\[
\begin{aligned}
y_1(x) &= 10x - 2.162x(1 - x), \\
y_3(x) &= 10x + (-1.409x - 1.356x^2 - 0.246x^3)(1 - x), \\
y_5(x) &= 10x + (-1.404x - 1.404x^2 - 0.140x^3 - 0.063x^4 - 0.007x^5)(1 - x).
\end{aligned}
\]
For the second basis set we obtain the Ritz approximations
\[
\begin{aligned}
z_1(x) &= 10x - 0.289 \sin \pi x, \\
z_3(x) &= 10x - 0.289 \sin \pi x + 0.063 \sin 2\pi x - 0.017 \sin 3\pi x, \\
z_5(x) &= 10x - 0.289 \sin \pi x + 0.063 \sin 2\pi x - 0.017 \sin 3\pi x \\
       &\quad + 0.008 \sin 4\pi x - 0.004 \sin 5\pi x,
\end{aligned}
\]
as required. □
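The system (1.52) is easy to assemble numerically. The following is a minimal sketch for the polynomial basis set (a) only, assuming Gauss–Legendre quadrature for the integrals; the function name `ritz_polynomial` and the quadrature order are illustrative choices rather than anything prescribed by the text.

```python
import numpy as np

def ritz_polynomial(n, quad_order=40):
    """Ritz coefficients for Example 1.18 with phi_k(x) = (1 - x) x**k."""
    # Gauss-Legendre nodes and weights mapped from [-1, 1] to [0, 1].
    t, w = np.polynomial.legendre.leggauss(quad_order)
    x, w = 0.5 * (t + 1.0), 0.5 * w
    q = 1.0 + 0.1 * np.sin(x)                  # coefficient of y**2 in Psi

    phi0, dphi0 = 10.0 * x, np.full_like(x, 10.0)
    k = np.arange(1, n + 1)[:, None]           # column of exponents 1..n
    phi = (1.0 - x) * x**k                     # phi[k-1, j] = phi_k(x_j)
    dphi = k * x**(k - 1) - (k + 1) * x**k     # derivative of x**k - x**(k+1)

    # Matrix <phi_i, phi_k> and right-hand side of (1.52).
    A = (dphi * w) @ dphi.T + (phi * q * w) @ phi.T
    rhs = -(dphi @ (w * dphi0) + phi @ (w * q * phi0)) + phi @ (w * x)
    c = np.linalg.solve(A, rhs)

    # Value of the functional at the Ritz approximation.
    y, dy = phi0 + c @ phi, dphi0 + c @ dphi
    psi = np.sum(w * (dy**2 + q * y**2 - 2.0 * x * y))
    return c, psi

for n in (1, 3, 5):
    c, psi = ritz_polynomial(n)
    print(n, np.round(c, 3), round(psi, 3))
```

The printed coefficients can be compared with those of y1, y3, and y5 above. Because the subspaces are nested, the computed values of Ψ cannot increase as n grows.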

In this example we employed the bilinear form $\langle y, z \rangle$. The symmetry of
this form with respect to its arguments simplified the calculation. In the
static problems of linear elasticity, such a form is naturally induced by the
energy expression for an elastic body. Moreover, the form of the left sides of
(1.52) is the same for all such problems, whether they are three-dimensional
problems of elasticity, or problems describing elastic beams or shells.
In Ritz’s time such approximate solutions were sought for problems de-
scribing elastic beams and plates. The resulting systems of equations were
fairly hard to solve by hand. The method was justified by comparison
with experimental data. A full justification of Ritz’s and similar methods
requires the tools of functional analysis, which forms the subject of Chap-
ter 4. However, we would like to discuss some aspects of the method on an
elementary level using Example 1.18 as a model.

Notes on basis functions


First let us comment on the approximations. The normal working viewpoint
is that one compares each pair of successive approximations and terminates
the calculation process upon reaching a pair whose difference is less than
some predetermined tolerance ε.
For each type of approximation, if we set ε = 0.01 then we can stop
at n = 5. Calculation out to n = 10 shows that the n = 5 approximations
are both very good. However, they do differ from each other by a maximum
of about 0.25. So which is “more” correct? We can answer this by substitu-
tion into the functional, which gives Ψ(y5 ) ≈ 127.046 and Ψ(z5 ) ≈ 127.449.
This is evidence that polynomial approximation is preferable. It is not hard
to see why: the true solution is not oscillatory, so the oscillatory behavior
of the trigonometric polynomials is not helpful in this case. So the “practi-
cal” approach to terminating the numerical process may not work well for
trigonometric approximation. In this particular example it can be shown
that the trigonometric approximations do converge, but slowly.
We have selected the polynomial-type Ritz approximations. But our ob-
servation regarding trigonometric approximations is cause for concern since
the situation with ordinary polynomials should not differ in principle from
that with trigonometric polynomials. Let us further discuss the problem of
basis functions.
In formulating Ritz’s method we required completeness of the set of
basis functions. Weierstrass’s theorem of calculus states that any function
f(x) continuous on [0, 1] can be approximated uniformly by a polynomial
to within any accuracy. In other words, given ε > 0 there exists an nth
order polynomial $P_n(x)$ such that
\[
\max_{x \in [0,1]} |f(x) - P_n(x)| < \varepsilon.
\]

It follows that to within any accuracy we may use a polynomial to uniformly
approximate a function f(x) together with its continuous derivative. Indeed,
given ε > 0, we begin with approximation of the derivative $f'(x)$ by
a polynomial $Q_n(x)$:
\[
\max_{x \in [0,1]} |f'(x) - Q_n(x)| < \varepsilon/2.
\]

The polynomial
\[
P_n(x) = f(0) + \int_0^x Q_n(t)\,dt
\]

approximates f(x):
\[
\begin{aligned}
|f(x) - P_n(x)| &= \Bigl| f(0) + \int_0^x f'(t)\,dt - f(0) - \int_0^x Q_n(t)\,dt \Bigr| \\
&\le \int_0^x |f'(t) - Q_n(t)|\,dt \\
&\le \varepsilon/2 \quad \text{for } x \in [0, 1].
\end{aligned}
\]
In the same way it can be shown that a function n-times continuously
differentiable on [0, 1] can be approximated to within any prescribed accuracy
by a polynomial together with all n of its derivatives on [0, 1]. The set of
monomials $\{x^k\}$ constitutes a complete system of functions in $C^{(n)}[0, 1]$ for
any n.
Note that Weierstrass’s theorem guarantees nothing more than the existence
of an approximating polynomial. When we decrease ε we get a new
polynomial where the coefficient standing at each term $x^k$ may differ
significantly from the corresponding coefficient of the previous approximating
polynomial. This is because the set $\{x^k\}$ does not have the uniqueness
property required of a true basis. Moreover, in mathematical analysis it
is shown that we can arbitrarily remove infinitely many members of the
family $\{x^k\}$ and still have a complete system $\{x^{k_r}\}$. It is necessary only
to retain such members of the family that the series $\sum_{r=1}^{\infty} 1/k_r$ diverges.
So the system $\{x^k\}$ contains more members than we need. Although any
finite set of monomials $x^k$ is linearly independent, as we take more and
more elements the set gets closer to becoming linearly dependent; that is,
given any ε > 0 we can find infinitely many polynomials approximating the
zero function to within ε-accuracy on [0, 1]. This leads to numerical insta-
bility. The difficulty can be avoided by using other families of polynomials
for approximation: namely, orthogonal polynomials for which numerical
instability shows itself only in higher degrees of approximation.
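The near linear dependence of the monomials can be quantified by the condition number of a Gram matrix. The sketch below is an illustration under stated assumptions: it uses the plain L2 inner product on [0, 1] rather than the energy form of a particular problem, and it takes shifted Legendre polynomials as the orthogonal family; neither choice comes from the text.

```python
import numpy as np

def gram_matrix(basis_values, weights):
    """Gram matrix G[i, j] = int_0^1 b_i(x) b_j(x) dx by Gauss quadrature."""
    return (basis_values * weights) @ basis_values.T

n = 8                                             # number of basis functions
t, w = np.polynomial.legendre.leggauss(2 * n)     # exact for these products
x, w = 0.5 * (t + 1.0), 0.5 * w                   # map [-1, 1] -> [0, 1]

# Rows hold the basis functions sampled at the quadrature nodes.
monomials = np.array([x**k for k in range(n)])
legendre = np.array([np.polynomial.Legendre.basis(k)(t) for k in range(n)])

cond_monomial = np.linalg.cond(gram_matrix(monomials, w))
cond_legendre = np.linalg.cond(gram_matrix(legendre, w))
print(f"monomials: {cond_monomial:.2e}, shifted Legendre: {cond_legendre:.2e}")
```

For the monomials the Gram matrix is the Hilbert matrix, whose condition number grows very rapidly with n, while for the orthogonal family the matrix is diagonal and the condition number grows only linearly.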
As we know from the theory of Fourier expansion, the second system
of functions $\{\sin k\pi x\}$ is orthogonal. It is, moreover, a basis (but not
of $C_0^{(1)}(0, \pi)$) as we shall discuss later. This provides greater stability in
calculations to within higher accuracy. However, in low-order Ritz
approximations it can be worse than a polynomial approximation of the same
problem, at least for many problems whose solutions do not oscillate.
One more aspect of the approximation is seen in the above results. For
Ritz’s approximations we compared their values. Comparing the values of
their derivatives, we find that much better agreement is obtained for the
values of the approximating functions than for the derivatives. It is obvious
that the same holds for the difference between an exact solution and the ap-
proximating functions. This property is common to all projection methods.
So, for example, in solving problems of elasticity we get comparatively good
results in low-order approximations for the field of displacements, whereas
the fields of stresses, which are expressed through the derivatives of the
displacement fields, are approximated significantly worse.
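The difference between agreement in values and agreement in derivatives can be checked directly on the polynomial approximations of Example 1.18. The sketch below compares y1 with y5, using the rounded coefficients printed in the example; it measures the gap between two approximations, not the error against the exact solution, and the grid size is an arbitrary choice.

```python
import numpy as np
from numpy.polynomial import Polynomial as P

x_poly = P([0.0, 1.0])            # the polynomial x
one_minus_x = P([1.0, -1.0])      # the polynomial 1 - x

# Ritz approximations y1 and y5 of Example 1.18, coefficients as printed there.
y1 = 10.0 * x_poly - 2.162 * x_poly * one_minus_x
y5 = 10.0 * x_poly + P([0.0, -1.404, -1.404, -0.140, -0.063, -0.007]) * one_minus_x

grid = np.linspace(0.0, 1.0, 1001)
value_gap = np.max(np.abs((y5 - y1)(grid)))
slope_gap = np.max(np.abs((y5 - y1).deriv()(grid)))
print(f"max |y5 - y1| = {value_gap:.3f}, max |y5' - y1'| = {slope_gap:.3f}")
```

The largest derivative gap occurs at the endpoints, where the values themselves are pinned by the boundary conditions.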

1.5 Natural Boundary Conditions

In § 1.1 we found that by using discretization on the problem of minimum of
the functional (1.33) without boundary conditions (“with free boundary”)
we obtain the Euler equation and some boundary conditions. We shall
demonstrate that the same boundary conditions appear by the method of
§ 1.2. They are known as natural boundary conditions.
Consider the minimization of (1.33) when there are no restrictions on
the boundary for y = y(x).

Theorem 1.19. Let $y = y(x) \in C^{(2)}(a, b)$ be a minimizer of the functional
$\int_a^b f(x, y, y')\,dx$ over the space $C^{(1)}(a, b)$. Then for y = y(x) the Euler
equation

\[
f_y - \frac{d}{dx} f_{y'} = 0 \quad \text{for all } x \in (a, b) \tag{1.53}
\]

holds along with the natural boundary conditions

\[
f_{y'}\big|_{x=a} = 0, \qquad f_{y'}\big|_{x=b} = 0. \tag{1.54}
\]

Proof. We can repeat the initial steps of § 1.2. Namely, consider the
values of the functional on the bundle of functions $y = y(x) + t\varphi(x)$ where
$\varphi(x) \in C^{(1)}(a, b)$ is arbitrary but fixed. Here, however, there are no
restrictions on $\varphi(x)$ at the endpoints of [a, b].
For fixed y(x) and $\varphi(x)$ the functional $\int_a^b f(x, y + t\varphi, y' + t\varphi')\,dx$
becomes a function of the real variable t, and attains its minimum at t = 0.
Differentiating with respect to t we get
\[
\int_a^b \bigl[ f_y(x, y, y')\,\varphi + f_{y'}(x, y, y')\,\varphi' \bigr]\,dx = 0.
\]
Integration by parts gives
\[
\int_a^b \Bigl[ f_y(x, y, y') - \frac{d}{dx} f_{y'}(x, y, y') \Bigr] \varphi\,dx
 + f_{y'}(x, y(x), y'(x))\,\varphi(x) \Big|_{x=a}^{x=b} = 0. \tag{1.55}
\]
From this we shall derive the Euler equation for y(x) and the natural boundary
conditions. The procedure is as follows. We limit the set of all continuously
differentiable functions $\varphi(x)$ to those satisfying $\varphi(a) = \varphi(b) = 0$. For
these functions we have
\[
\int_a^b \Bigl[ f_y(x, y, y') - \frac{d}{dx} f_{y'}(x, y, y') \Bigr] \varphi\,dx = 0. \tag{1.56}
\]
This equation holds for all functions $\varphi(x)$ that participate in the formulation
of Lemma 1.8. Hence the continuous multiplier of $\varphi(x)$ in the integrand of
(1.56) is zero, and the Euler equation (1.53) holds in (a, b).
Now let us return to (1.55). The equality (1.56), because of the Euler
equation, holds for all $\varphi(x)$. From (1.55) it follows that
\[
f_{y'}(x, y(x), y'(x))\,\varphi(x) \Big|_{x=a}^{x=b} = 0 \tag{1.57}
\]

for any $\varphi(x)$. Taking $\varphi(x) = x - b$ we find that $f_{y'}|_{x=a} = 0$; taking
$\varphi(x) = x - a$ we find that $f_{y'}|_{x=b} = 0$. □
Let us call attention to the way this result was obtained. First we re-
stricted the set of admissible functions to those for which we could get a
certain intermediate result (the Euler equation); using this result, we ob-
tained some simplification in the first variation. We finished the argument
by considering the simplified first variation on all the admissible functions.
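Natural boundary conditions also emerge numerically when Ritz's method is applied with basis functions that are left free at the endpoints. The sketch below uses an illustrative functional not taken from the text, $\int_0^1 (y'^2 + y^2 - 2xy)\,dx$, whose Euler equation is $y'' = y - x$ and whose natural boundary conditions are $y'(0) = y'(1) = 0$; the function name `free_end_ritz` and the monomial basis are assumptions made for the example.

```python
import numpy as np

def free_end_ritz(degree, quad_order=30):
    """Ritz minimizer of int_0^1 (y'^2 + y^2 - 2 x y) dx over polynomials.

    No boundary conditions are imposed on the basis 1, x, ..., x**degree.
    """
    t, w = np.polynomial.legendre.leggauss(quad_order)
    x, w = 0.5 * (t + 1.0), 0.5 * w              # map [-1, 1] -> [0, 1]
    k = np.arange(degree + 1)[:, None]
    phi = x**k
    dphi = k * x**np.maximum(k - 1, 0)           # derivative of x**k (zero for k = 0)
    A = (dphi * w) @ dphi.T + (phi * w) @ phi.T  # stationarity conditions in c
    rhs = phi @ (w * x)
    c = np.linalg.solve(A, rhs)
    return np.polynomial.Polynomial(c)

y = free_end_ritz(6)
dy = y.deriv()
print(dy(0.0), dy(1.0))      # both should be close to zero
```

Although nothing in the basis forces the slope to vanish at the ends, the minimizer over the free subspace approximately satisfies the natural conditions, and the agreement improves as the degree grows.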
Natural boundary conditions are of great importance in mathematical
physics. For some models of real bodies or processes it may be unclear which
(and how many) boundary conditions are necessary for well-posedness of
the problem. The variational approach usually clarifies the situation and
provides natural boundary conditions dictated by the nature of the problem.
The bending of a plate is a famous example. For her pioneering studies of
this problem Sophie Germain received a prize from the French Academy
of Sciences. She derived the biharmonic equation for the deflections of the
midsurface of the plate, but with three boundary conditions as seemed to
be in accordance with mechanical intuition; variational considerations later
demonstrated that only two were independent.
It is worth noting that in mechanical problems, the natural boundary
conditions are dual to kinematic conditions on the boundary. They do
not arise at a boundary point when we “clamp” as fully as allowed by
the model. Incomplete clamping at a point always results in a natural
boundary condition of force type there. If no kinematic constraint prevails
at a point, then the natural boundary conditions express the equilibrium of
forces. A simple example is afforded by the stretched rod treated later on;
application of a force F at the right end of the rod results in the natural
boundary condition $ES(l)u'(l) = F$, which means that the cross section
at point l is in mechanical equilibrium under F and the reaction of the
remainder of the rod.

Remark 1.20. In § 1.1 we discussed the question of which boundary
conditions can be imposed to get a well-posed boundary value problem for
minimizing the functional (1.33). General considerations are nice; however,
consider the minimization of
\[
\int_0^1 (y'^2 + 2y)\,dx \tag{1.58}
\]
on the set of continuously differentiable functions. Its Euler equation is
$y'' = 1$, thus all the extremals take the form
\[
y = \tfrac{1}{2} x^2 + kx + b.
\]
The natural boundary conditions are $y'(0) = 0$, $y'(1) = 0$. These imply
k = 0. So the problem of minimum of (1.58) (with natural boundary
conditions) has a family of solutions $y = \tfrac{1}{2} x^2 + b$ with arbitrary constant b.
Thus we may impose an additional condition, say y(0) = 2. But in general,
such a third condition for an ordinary differential equation of second order
can yield a boundary value problem that has no solution.
Although (1.58) is simple, the situation we just described is not
unimportant. Indeed, the same situation holds for the whole class of functionals
that govern the equilibrium states of linear elastic systems in terms of dis-
placements. If we impose no geometrical restrictions on the position of an
elastic body (it is normally the case of natural boundary conditions) we
can always change the coordinate frame, and all the displacements can be
changed in such a way that the body appears to be shifted as a whole (i.e.,
to move as a “rigid body”). Depending on the model of the body there are
apparently one to six free constants describing such a motion — hence we
can impose additional boundary conditions at some points and still preserve
the well-posedness of the problem. In a one-dimensional problem (where
the dimension is a spatial coordinate) the situation is exactly as it is for
(1.58): it is possible to impose an additional boundary condition when con-
sidering the problem with “free” ends. Caution is often warranted when
applying the outcomes of very general considerations. 

1.6 Extensions to More General Functionals

Let us consider two extensions of the above results.

The functional $\int_a^b f(x, \mathbf{y}, \mathbf{y}')\,dx$

Let us replace y(x) in (1.33) by a vector function

\[
\mathbf{y}(x) = (y_1(x), \dots, y_n(x)).
\]

We denote the integrand of the functional as

\[
f(x, \mathbf{y}(x), \mathbf{y}'(x)) \quad \text{or} \quad
f(x, y_1(x), \dots, y_n(x), y_1'(x), \dots, y_n'(x))
\]

interchangeably. The task is to treat functionals of the form

\[
F(\mathbf{y}) = \int_a^b f(x, \mathbf{y}, \mathbf{y}')\,dx. \tag{1.59}
\]

First consider the problem of minimizing (1.59) when $\mathbf{y}(x)$ takes
boundary values

\[
\mathbf{y}(a) = \mathbf{c}_0, \qquad \mathbf{y}(b) = \mathbf{c}_1, \tag{1.60}
\]

with vector constants $\mathbf{c}_0 = (c_{01}, c_{02}, \dots, c_{0n})$,
$\mathbf{c}_1 = (c_{11}, c_{12}, \dots, c_{1n})$. We
take $\mathbf{y}(x) \in C^{(k)}(a, b)$ to mean that each coordinate function $y_i(x) \in$
on the vacuity of the space between the metallic atoms is
groundless.
1788. Although the space occupied by the hydrated oxide of
potassium comprises 2800 ponderable atoms, while that occupied by
an equal mass of the metal comprises only 430, there may be in the
latter proportionally as much more of the material, though
imponderable, powers of heat and electricity, as there is less of
matter endowed with ponderability.
1789. Thus, while assuming the existence of fewer imponderable
causes than the celebrated author of the speculation has himself
proposed, we explain the conducting power of metals, without being
under the necessity of attributing to void space the property of
electrical conduction. Moreover, I consider it quite consistent to
suppose that the presence of the ethereal basis of electricity is
indispensable to electrical conduction, and that diversities in this
faculty are due to the proportion of that material power present, and
the mode of its association with other matter. The immense
superiority of metals will be explained, by referring it to their being
peculiarly replete with the ethereal basis of heat and electricity.
1790. Hence Farraday’s suggestions respecting the materiality of
what has heretofore been designated as the properties of bodies,
furnish the means of refuting his arguments against the existence of
ponderable impenetrable atoms as the basis of cohesion, chemical
affinity, momentum, and gravitation.
1791. But I will, in the next place, prove that his suggestions not
only furnish an answer to his objections to the views in this respect
heretofore entertained, but are likewise pregnant with consequences
directly inconsistent with the view of the subject which he has
recently presented.
1792. I have said that of all the powers which are, according to
Farraday’s speculations, to be deemed material, gravitation can
alone be ponderable; since, according to his speculations,
gravitation, in common with every power heretofore attributed to
impenetrable particles, must be a matter independently pervading
the space throughout which it is perceived. This being the
consequence, by what tie is gravitation, or, in other words, weight—
indissolubly attached to the rest? It cannot be pretended that either
of the powers is the property of any other. Each of them is an m,
and cannot play the part of an a, not only because an m, an effect,
cannot be an a, its cause, but because, according to the premises,
no a can exist. Nor can it be advanced that they are the same
power, since chemical affinity and cohesion act only at insensible
distances, while gravitation acts at any and every distance, with
forces inversely as their squares; and, moreover, the power of
chemical affinity is not commensurate with that of gravitation. One
part, by weight, of hydrogen has a greater affinity, universally, for
any other element than two hundred parts of gold. By what means
then are cohesion, chemical affinity, and gravitation inseparably
associated in all the ponderable elements of matter? Is it not fatal to
the validity of the highly ingenious and interesting deductions of
Farraday, that they are thus shown to be utterly incompetent to
explain the inseparable association of cohesion, chemical affinity,
and inertia with gravitation, while the existence of a vacuity between
Newtonian atoms, mainly relied upon as the basis of an argument
against their existence, is shown to be inconsistent both with the
ingenious speculation which has called forth these remarks, and
those Herculean “researches” which must perpetuate his fame? (See
Appendix for Farraday’s Speculations on Electric Conduction and the
Nature of Matter.)

On Whewell’s demonstration that all matter is heavy.

1793. While the speculations of Farraday, isolate gravitation, as


the only matter endowed with weight, and treat all other matters as
weightless, those of another eminent philosopher, Whewell, would
tend to prove that all matter is heavy.
1794. This subject may be interesting now, when we are anxious
to understand well the nature of matter, which Comte would
represent as the basis of mind, and when it becomes a point of
departure in forming ideas of spirit and mind, as they must be
contemplated by Spiritualism. I therefore subjoin a critique upon the
allegation that all matter can be heavy, and on the relation between
vis inertiæ and gravitation.
1795. One consideration seems to be usually overlooked in
contemplating these forces. It is forgotten that inertia is the property
of one body, while gravitation requires two for its existence. If there
were only one body in nature, it might move on, in obedience to its
vis inertiæ, for any length of time; but, during an isolated existence,
could neither attract nor be attracted. Whewell’s theorem, in his own
language, is as follows:
1796. “We see,” alleges Whewell, “that the propositions
that all bodies are heavy, and that inertia is proportional to
weight, necessarily follow from those fundamental ideas
which we unavoidably employ in all attempts to reason
concerning the mechanical relations of bodies.” (See
Demonstration that all Matter is heavy, by the Rev. William
Whewell, B.D. Silliman’s Journal, vol. 42, page 265.)
To Professor Whewell:
1797. Dear Sir: I thank you for your kind attention in sending me a
copy of your pamphlet, entitled a “Demonstration that all Matter is
heavy,” comprising a communication made to the Cambridge
Philosophical Society.
1798. I conceive that to demonstrate that all matter is heavy, is, in
other words, to prove that all matter is endowed with attraction of
gravitation, or that general property which, when it causes bodies to
tend toward the centre of the earth, is called weight. Hence to
assert that all matter is heavy, is no more than to say, that attraction
of gravitation exists between all or any masses of matter.
1799. You say, “it may be urged that we have no difficulty in
conceiving of matter which is not heavy.” I have no hesitation in
asserting that there should be no difficulty in entertaining such a
conception; since I cannot understand why any two masses may not
be as readily conceived to repel, as to attract each other, or neither
to attract nor to repel. Is it not easier to imagine two remote masses
indifferent to each other, than that they act upon each other? Is any
thing more difficult to understand than that a body can act where it
is not?
1800. It is also mentioned by you, that it may be urged “that
inertia and weight are two separate properties of matter.” Now I will
not only urge, but also, with all due deference, will undertake to
show, that the existence of inertia may as well be proven, and its
quantity estimated, by means of repulsion as by means of attraction.
1801. Suppose two bodies, A and B, to be endowed with
reciprocal attraction, or, in other words, to gravitate toward each
other. Being placed at a distance, and then allowed to approach, if,
after any given time, it were found that they had moved severally
any ascertained distances, evidently their relative inertias would be
considered as inversely as those distances.
1802. In the next place, let us suppose two bodies, X and Y,
endowed with the opposite force of reciprocal repulsion, to be placed
in proximity, and then allowed to fly apart. The distances run
through by them severally, being, at any given time, determined,
might not their respective inertias be taken to be inversely as those
distances; so that the question would be as well ascertained in this
case as in that above stated, in which gravitation should be resorted
to as the test?
1803. It seems to me that this question is sufficiently answered in
the affirmative, in your second paragraph, page 7, (p. 269,) in which
you allege, that “one body has twice as much inertia as another, if,
when the same force acts upon it for the same time, it acquires but
half the velocity. This is the fundamental conception of inertia.”
1804. In the third paragraph, fourth page, (p. 261,) you say, “that
the quantity of matter is measured by those sensible properties of
matter which undergo quantitative addition, subtraction, and
division, as the matter is added, subtracted, or divided, the quantity
of matter cannot be known in any other way; but this mode of
measuring the quantity of matter, in order to be true at all, must be
true universally.”
1805. Also your fourth paragraph, fifth page, (p. 268,) concludes
with this allegation: “And thus we have proved, that if there be any
kind of matter which is not heavy, the weight can no longer avail us,
in any case, to any extent, as the measure of the quantity of
matter.”
1806. In reply to these allegations, let me inquire, Cannot a
matter exist of which the sensible properties do not admit of being
measured by human means? Because some kinds of matter can be
measured by “those sensible qualities which undergo quantitative
addition, subtraction, and division,” does it follow that there may not
be matter which is incapable of being thus measured? And
wherefore would the method of obtaining philosophical truth be
“futile” in the one case, because inapplicable in the other? Because
the inertias of A and B have been discovered, by means of their
gravitation, does it follow that the inertias of X and Y cannot be
discovered by their self-repellent power? Why should the
inapplicability of gravitation in the one case render its employment
futile in the other?
1807. It is self-evident, that matter without weight cannot be
estimated by weighing, but I deny that on that account such
weightless matter may not be otherwise estimated. The inertias of A
and B cannot be better measured by gravitation than those of X and
Y by repulsion, as already shown.
1808. You seem to infer, in paragraph second, page sixth, (p.
268,) that we should be equally destitute of the means of measuring
matter accurately, “were any kind of matter heavy indeed, but not so
heavy, in proportion to its quantity of matter, as other kinds.”
1809. If, in the case of all matter, weight be admitted to be the
only measure of quantity, it were inconsistent to suppose any given
quantity of matter, of any one kind, to have less weight than an
equal quantity of another kind; but upon what other than a
conventional basis is it to be assumed that there is more matter in a
cubic inch of platinum than in a cubic inch of tin? in a cubic inch of
mercury than in a cubic inch of iron? Judging by the chemical
efficacy of the masses, although the weight of mercury is to that of
iron as 13.6 is to 8, there are more equivalents of the latter than the
former in any given bulk, since by weight twenty-eight parts of iron
are equivalent to two hundred and two parts of mercury.
1810. Weight is one of the properties of certain kinds of matter,
and has been advantageously resorted to, in preference to any other
property, in estimating the quantity of the matter to which it
appertains. Nevertheless, measurement by bulk is found expedient
or necessary in many cases. But may we not appeal to any general
property which admits of being measured or estimated? Farraday
has inferred that the quantity of electricity is as the quantity of gas
which it evolves. Light has been considered as proportional in
quantity to the surface which it illuminates with a given intensity at a
certain distance. The quantity of caloric has been held to be directly
as the weight of water which it will render aeriform; and has also
been estimated by the degree of its expansive or thermometric
influence. What scale-beam is more delicate than the thermoscope
of Melloni?
1811. In the last paragraph but one, seventh page, (p. 270,) you
suggest, that “perhaps some persons might conceive that the
identity of weight and inertia is obvious at once, for both are merely
resistance to motion; inertia, resistance to all motion, or change of
motion; weight, resistance to motion upward.”
1812. I am surprised that you should think the opinion of any
person worthy of attention, who should entertain so narrow a view
of weight, as antagonist of momentum, as that above quoted, “that
it is a resistance to motion upward.” Agreeably to the definition given
at the commencement of the letter, weight, in its usual practical
sense, is only one case of the general force which causes all
ponderable masses of matter to gravitate toward each other, and
which is of course liable to resist any conflicting motion, whatever
may be the direction. When, in the form of solar attraction, it
overcomes that inertia of the planets which would otherwise cause
them to leave their orbits, does gravitation “resist motion upward?”
1813. In the next paragraph you allege, that “there is a difference
in these two kinds of resistance to motion. Inertia is instantaneous,
weight is continuous, resistance.”
1814. It is to this allegation I object, that as you have defined
inertia to be “resistance to motion, or to change of motion,” it
follows that it can be instantaneous only where the impulse which it
resists is instantaneous. It cannot be less continuous than the force
by which it is overcome.
1815. Gravity has been considered as acting upon falling bodies by
an infinity of impulses, each producing an adequate acceleration; but
to every such accelerating impulse, producing of course a “change of
motion,” will there not be a commensurate resistance from inertia?
and the impulses and resistances being both infinite, will not one be
as continuous as the other?
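The picture of gravity acting by a succession of impulses, as described in paragraph 1815, can be illustrated numerically. The following is a minimal sketch, assuming equal impulses at equal intervals, a modern value of 9.81 for the acceleration of gravity, and a fall of 2 seconds; none of these values or names comes from the source.

```python
def fall_by_impulses(g, t, n):
    """Approximate a fall of duration t by n equal, equally spaced impulses.

    Each impulse adds g * t / n to the velocity at the start of its interval;
    the body then moves uniformly until the next impulse arrives.
    """
    dt = t / n
    velocity = 0.0
    distance = 0.0
    for _ in range(n):
        velocity += g * dt         # one accelerating impulse
        distance += velocity * dt  # uniform motion until the next impulse
    return velocity, distance

g, t = 9.81, 2.0  # assumed values, in metres and seconds
for n in (1, 10, 1000):
    v, d = fall_by_impulses(g, t, n)
    print(n, round(v, 3), round(d, 3))

# Distance under continuous acceleration, for comparison.
continuous = 0.5 * g * t * t
```

As the number of impulses grows, the distance fallen approaches that given by continuous acceleration, so that a sufficiently rapid series of impulses is indistinguishable in its effect from a continuous force.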
1816. I have already adverted to inertia as the continuous
antagonist of solar attraction in the case of revolving planets.
1817. Agreeably to Mossotti, the creation consists of two kinds of
matter, of which the homogeneous particles are mutually repellent,
the heterogeneous mutually attractive. Consistently with this
hypothesis, per se, any matter must be imponderable; being
endowed with a property the very opposite of attraction of
gravitation. This last-mentioned property exists between masses
consisting of both kinds of particles, so far as the attraction between
the heterogeneous atoms predominates over the repulsion between
those which are homogeneous. It would follow from these premises,
that all matter is ponderable or otherwise, accordingly as it may be
situated.
1818. Can the ether by which, according to the undulatory theory,
light is transmitted, consist of ponderable matter? Were it so, would
it not be attracted about the planets with forces proportioned to
their weight, respectively? and becoming of unequal density, would
not the diversity in its density, thus arising, affect its undulations, as
the transmission of sound is influenced by any variations in the
density of the aeriform fluid by which it is propagated?
With esteem, I am yours truly,
Robert Hare
(See appendix for Whewell’s Essay.)

Additional Remarks on the Speculations of Farraday and Exley, above noticed.

1819. Is it possible for a mere centre to be endowed with a force?
or reasonable that language should not make a distinction between
something and nothing, between cause and effect, between matter
and the properties of matter? m being the properties, and a the
Newtonian atom, of which they have been considered as the
attributes, I cannot concur in the reasoning which infers that where
we can only perceive phenomena, we are to dispense with the idea
of causation, because that causation is not directly perceptible. It
seems to me, from the meaning of the words, that no cause can
exist without some effect, nor can any effect exist without a cause.
Language founded on the existence of ideas cannot be disused. Can
there be any reason for considering any thing as endowed with
existence which gives no evidence of existence? We distinguish
between the thing which causes and the effect which it produces.
The cause evidently has a centrality; the effect, though it indicates
by the direction in which it arrives, the centre whence it proceeds, is
remote from that centre. The existence of this centrality seems to be
recognised in the suggestion that atoms are centres of forces. This
implies that the source or cause is at the centre in each atom, and,
of course, the phenomenon, being more or less remote from the
centre, cannot be the source or cause, and hence has been treated
as an effect or property.
1820. The suggestion that the office of atoms may be performed
by centres of forces, in fact, assigns to a mere centre the part now
performed by a Newtonian atom. But it must be evident that the
centre is that point within any rotating mass, which does not turn
therewith; and which, where neither of the opposite motions
resulting from rotation takes place, can neither have length nor
breadth. This reduces the idea of a centre to a common definition
with a mathematical point; which is nihility in the extreme. An
absolutely void space may be identified with nihility, and a
mathematical point is a portion of that space, without length,
breadth, or thickness. To endow centres with forces is to disregard
the axiom, “Out of nothing nothing can come.” Moreover, wherefore
should there be a force at certain mathematical points, and yet
others be destitute of the same attribute? Manifestly, if some
mathematical points are deficient of powers with which others are
endowed, there must be something associated with one, which is
not associated with the other. This justifies the Newtonian idea, that
the force, though proceeding from the centre, is, like the terrestrial
attraction of gravitation, the resultant of the complicated attraction
of the whole of a body surrounding the centre. But the centrality of
the force does not seem to accord with the idea of the inferred
diffusion of properties. In the instance of gravitation it does not
account for those attributes by which this globe acts as a solid mass
within its material superficies, and yet, according to the Farradian
definition, reaches beyond the moon!
1821. But the idea of that polarity, of which Farraday has done so
much to establish the existence in all matter, in one form or another,
seems to involve that, to constitute atoms, there must be two
centres of analogous, but opposite, forces in each: whence it ensues
that crystals shoot in prisms or spiculæ, as water is seen to shoot in
freezing; and through which salts, as deposited by the evaporation
of the solvent from a solution of them, are seen to travel over the
sides of the vessel; and upon which property the phenomena of
electricity and magnetism appear to be dependent. How is this to be
reconciled with this notion of each atom existing in a diffusible
penetrable state throughout the space in which its properties
prevail? Since these opposite polarities are energetic in their
reciprocal polar attraction, what keeps them together, yet prevents
them from so uniting as to produce neutralization?
1822. Mr. Exley’s ideas, if admitted, leave no alternative but either
to place a Newtonian atom within each of his concentric spheres, or
to assume that nothing can have properties, or that effects can exist
without causes. What is to cause a force at any mathematical point
more than at any other? How, in case of a moving body, are the
forces to appear successively to proceed from various centres, if
there be nothing in which it is inherent, which moves and carries its
forces or properties wheresoever it goes? Does not this suggestion
that atoms are centres of their forces, by making the cart draw itself,
force the effect to be its own cause? It is quite consistent with the
Newtonian definition, that the resultant of the action of every part of
a mass should comport as if it proceeded from a common centre, as
does terrestrial gravitation; and of course, whether we have the
Newtonian idea or that of Boscovitch, Farraday, or Exley, we have
forces proceeding from centres. The great difference is that
agreeably to the one these forces emanate from nothing; agreeably
to the other, from something. I used to define matter to my pupils as
that which has properties. In the mind, is not force distinguished
from some moving power which gives it rise? Is not this distinction
inevitable? and were the word force employed to designate the
moving power which exercises force, would it not confound ideas,
without altering the actual state of the case? Would it not impoverish
language, without improving science?

Of Mundane, Ethereal, and Ponderable Matter, in their Chemical relations.

1823. The bodies which occupy the attention of a chemist are
found in one of three states—those of solidity, fluidity, and elasticity.
Ice, liquid water, and steam exemplify these different states. The
fact is thus illustrated, that the same chemical compound, consisting
of oxygen and hydrogen, may exist in either state, according to the
temperature to which it may be subjected.
1824. Experience justifies the surmise, that scarcely any body in
nature is utterly insusceptible of these three states, provided it were
heated or refrigerated with an unlimited power.
1825. Beside the property of gravitation, of which the energy is
inversely as the square of the distance, however great, (as when it
enables the two suns, apparently forming but one—the double star,
61 Cygni (1340)—at the distance of six thousand millions of miles, to
attract each other so as to revolve about their common centre of
gravity,) atoms are endowed with a force called attraction of
aggregation, which operates only at insensible distances, so that
when brought into due proximity they unite and form a coherent
mass. Again, they are endowed, as already mentioned, with
chemical affinity, which varies with the kind of particles in which it
exists as a property; being the characteristic by which they are
distinguished one from the other.
1826. According to the doctrine which chemists have heretofore
suggested for the existence of matter in the elastic or gaseous state,
each aerial or gaseous atom was conceived to be enveloped in an
atmosphere of fluid called caloric, resembling the ether in the self-
repellent power of its constituent particles. This atmosphere has
been assumed to impart to atoms which it envelopes its own
inherent power of reciprocal repulsion, like that which those of the
ether have. But Dalton showed that there was no repulsion between
gaseous atoms when heterogeneous. Two or more such gases,
hydrogen and nitrogen, for instance, being comprised in the same
cavity, there would be no repulsion between the atoms of hydrogen
and those of nitrogen, but only between those of the same gas. This
has been held to be equally true, however many gases might be
mingled, or whatever vapours might be superadded.
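Dalton's result is commonly stated as the law of partial pressures: each gas in a mixture presses on the vessel as though it were alone, so the total pressure is the sum of the separate pressures. The sketch below is only an illustration of that rule; the pressures are hypothetical values in arbitrary units, and the names are not from the source.

```python
def total_pressure(partial_pressures):
    """Sum the pressures each gas would exert if it occupied the cavity alone."""
    return sum(partial_pressures.values())

# Hypothetical pressures of each gas when alone in the cavity (arbitrary units).
alone = {"hydrogen": 10.0, "nitrogen": 5.0}

mixed = total_pressure(alone)
print(mixed)
```

Adding a further gas or vapour to the dictionary adds its own pressure to the total without altering the contribution of the others.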
1827. The idea is thus refuted, which ascribes the repulsive power
to the same elastic fluid, since in that case the diversity of the
gaseous atoms could not so affect the repulsive influence as to
nullify it between heterogeneous atoms, while sustaining this
repulsion, where the atoms should be alike.
1828. Moreover, as the rays of light have been found to be mere
undulations in the ether; the rays of heat, being perfectly analogous
in their attributes, must also be due to ethereal undulations. But
vaporization may be effected by radiant heat, and gases owe their
aeriform state to the same cause as vapor or steam; yet transient
undulations evidently cannot form a permanent combination, so as
to confer the durable elasticity of a permanent gas.
1829. It appears, then, that neither the doctrine of caloric, nor the
undulatory doctrine, as it is received, will explain the creation of
permanent gas. Under these circumstances a modification of the
existing opinions is called for. It has, for some years, occurred to me,
that the Newtonian doctrine of radiation might be associated with
that of undulation.
1830. The fact that radiant heat could be collected by a mirror so
as to raise the temperature of bodies placed in the focus, and that
this process could take place in vacuo, as ascertained by Sir
Humphrey Davy, had been adduced as unquestionable evidence of
the materiality of caloric, the supposed fluid cause of heat. But as
the cold proceeding from a snowball or any cold body could be
collected by the same process, it was urged by some chemists that
the evidence of the materiality of the cause of cold must also be
admitted. Prevost met this argument by suggesting that no body in
nature is absolutely cold. Every body, however refrigerated, is not so
cold as to be incapable of greater refrigeration. Hence all bodies
being absolutely above the zero of nature, are throwing off rays to
each other, and where there is equality of temperature, they do not
cause any change in their relative temperatures. The rays thrown off
by A are compensated by those which it receives from B, and vice
versa. But if A throws off to B more than B reciprocates, the
temperature of A must fall until an equilibrium is attained. Thus, A
being the mirror and B the snowball, the mirror is refrigerated, and
causes a greater radiation from any body situated about its focus.
This explanation was generally received, but to me, the following
rationale, which I advanced, appeared preferable:
1831. I assumed caloric to exist throughout the sublunary
creation, as the luminiferous ether is assumed to be diffused
throughout all space by the undulationists; the diffusion arising from
the reciprocal repulsion of its particles being similar to that which
had been supposed to cause the diffusion of caloric. There is the
greatest analogy between this diffusion and that which is known to
exist in the case of gases. The process is the same, whether the gas
be dense like chlorine, or thirty-six times as rare, as in the instance
of hydrogen, and in the luminiferous ether resembles the process by
which hydrogen is rarefied, or might be rendered more rare, were the
pressure of the atmosphere removed.
1832. It is known that in any gas or gaseous mixture like that
which we breathe, if a deficit of pressure be caused in any spot, the
gaseous particles will quickly move toward it, in order to restore the
equilibrium of pressure, and that if, on the other hand, any
augmentation of pressure be produced at any spot, the gas will
move outward to restore the equilibrium.
1833. The particles being symmetrically arranged in lines, a row of
particles may be conceived to lie between every two remote points.
If we suppose any number of points in the focal body, and a
corresponding number in the surface of the mirror, it may be
conceived that the intervening ethereal or calorific particles will
move in rows one way or the other, as the pressure in the focal
space may become greater or less. Thus an effect is brought about,
equivalent to that which the Newtonian idea of radiation involves;
lines of particles proceed from the hotter points to the colder ones.
1834. The arrangement of the particles of caloric, which was
originally, in my view, confined to the sublunary creation, appears of
necessity to belong to the luminiferous ether, required by the theory
ascribing light to undulations, though the last-mentioned medium
must be endowed with ubiquity as above stated, so as to abound in
every part of space through which light reaches the eye.
1835. The undulatory hypothesis supposes that a wave-like
motion being imparted to a row of particles, by a luminous point in
the surface of the luminous body, is transmitted, like the sound
producing waves in the air, to the other end of the row.
1836. This undulatory progression has been roughly illustrated by
the transitory serpentine movements which may be made in a cord,
stretched like a clothes-line between the tops of posts.
1837. In order to make this illustration elucidate the conception
which I advance, we have only to suppose that the cord, instead of
being attached to the post, should be drawn rapidly over pulleys,
and, while thus actuated, be subjected to a cause of undulatory
vibration. It may be conceived that, by this process, the ethereal
particles, while performing all which the undulatory theory requires,
might at the same time perform all required by that of emission and
material calorific radiation. Directed upon a vaporizable liquid, the
undulations might perform the part of sensible heat; the ethereal
particles, successively combining, might furnish the latent heat
requisite to the constitution of vapour.
1838. Agreeably to Newton, the seven colours of the spectrum are
due to as many different kinds of radiant particles of various
refrangibility, or susceptibility of being bent from the rectilinear path
when passed through the same refracting medium.[40]
1839. According to the undulatory theory, the colours are caused
by diversities in the undulations producing them. Retaining this
feature, the last-mentioned hypothesis, as modified by myself,
appears to be competent to explain the phenomena of light as well
as those of vaporization, produced by calorific radiation, since not
only is any vaporizing liquid subjected to the transient effect of the
undulations, but also may combine with the ethereal particles as
they come into contact with it.
1840. Thus modified, the rationale of the rainbow, or prismatic
spectrum, would not be that the colours indicate as many varieties
of original radiant particles, but that they are to be explained
agreeably to the undulatory hypothesis, which ascribes them to as
many varieties in the undulations, just as the notes in music are
ascribed to diversities of vibration.
1841. The ether, under this view, performs the part heretofore
assigned to latent heat, by combining with solids so as to render
them susceptible of expansion, and of electrical conduction by being
liable to the polarization which constitutes electricity.
1842. Sensible heat, according to this aspect, is due to the
vibrations of the ethereal fluid, which is sustained by the sun, by
ignition in the interior of the earth, and by chemical reaction,
including combustion and respiration.
1843. The correctness of the inference, that conductors owe their
conductive power to ethereal matter entering into their composition,
has been insisted upon in my strictures on Farraday’s speculation in
some of the preceding pages. The facts admitted by this
distinguished investigator of nature’s laws, gave to me a basis on
which to rest an argument in favour of the existence of an
imponderable cause of heat and electricity in metals, which seems to
me unanswerable.
1844. Agreeably to the hypothesis respecting which the preceding
preparatory suggestions have been made, gasification is not due to a
repulsive atmosphere of ethereal matter, severally appropriated to
each ponderable constituent atom, but to an attraction for every
such atom exercised by the ethereal fluid, such as water exercises
toward sugar, quick-lime, salt, or any soluble substance. The ether
attracts the particles of certain solids, and is of course reacted upon
by them. The particles thus attracted naturally distribute themselves
throughout it, at symmetrical distances. Hence the law of Petit and
Dulong is verified, which, at least, holds good with all gasifiable
atoms, that their capacity is inversely as their atomic weight.
1845. The atomic weights of hydrogen, nitrogen, and chlorine
being severally 1, 14, 36, when associated with equal volumes of the
imponderable ether, they will have still the same weight. Equal
volumes will weigh the same as the atoms with which they are
associated; and the capacity for heat, being directly as the volumes,
will be inversely as the weights, the calculation being the same,
whether ether or caloric be the imponderable principle to which they
owe their gasification. By concurring with those chemists, who
estimate the atoms of oxygen at 16, instead of 8, this gas will come
into the same calculation.
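The calculation in paragraph 1845 can be set out with the atomic weights given there. This is a minimal sketch, assuming that the capacity for heat per unit weight is taken relative to hydrogen as 1; that normalization, and the names used, are assumptions made for illustration.

```python
# Atomic weights as given in the text, with oxygen taken at 16.
atomic_weights = {"hydrogen": 1, "nitrogen": 14, "oxygen": 16, "chlorine": 36}

def relative_capacity(atomic_weight, reference_weight=1):
    """Capacity for heat per unit weight, inversely as the atomic weight."""
    return reference_weight / atomic_weight

for gas, weight in atomic_weights.items():
    capacity = relative_capacity(weight)
    # The product of capacity and atomic weight is the capacity per atom,
    # and hence per equal volume; it comes out the same for every gas.
    print(gas, round(capacity, 4), round(capacity * weight, 6))
```

The capacity per unit weight falls as the atomic weight rises, while the capacity of equal volumes is the same for all four gases, which is the relation the paragraph describes.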
1846. When heterogeneous gases are confined within the same
cavity, that they should not react with each other is no more
wonderful, than that the same mass of water may at the same time
hold different substances in solution, which may add to its
hydrostatic pressure though they have no reciprocal reaction.
1847. Sensible heat appears to be due to vibrations in the ether,
kept up by the solar rays or central ignition within this globe. By the
heat thus acquired the self-repellent power of the ether is
augmented. When by refrigeration this source of repulsion is
diminished beyond a certain limit, the atoms of certain vaporizable
particles, such as those of steam and other condensible vapours, are
approximated sufficiently to attract each other, and consequently
coalesce and are condensed.
1848. It follows that light is due to undulation, sensible heat to
vibration, and electricity to the polarization caused in the ethereal
medium, while either in a free, or in a combined state. Thus this
luminiferous ether performs the part heretofore attributed to latent
heat or caloric in one state; in another state, that of sensible heat.

Suggestions of Mossotti, respecting the Nature of Matter.

1849. Mossotti has suggested that all bodies consist of two kinds
of ultimate particles; that any two or more particles of one kind are
repulsive of each other, while any two or more of different kinds are
reciprocally attractive. Hence atoms are formed, consisting of one
atom of one kind and one of the other kind. Of course, were the
opposite forces exercised by the heterogeneous and homogeneous
equal, the resulting atoms would be neither attractive nor repulsive;
but assuming the attractive power to have the ascendency, the
hypothesis would account for the property of gravitation.
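The balance of forces described in paragraph 1849 can be counted out explicitly. The sketch below assumes that each compound atom holds one particle of each kind, that the two like pairings repel with equal strength, and that differences of distance within the atoms are neglected; these simplifications and the names are assumptions, not part of the hypothesis as stated.

```python
def net_force(attraction, repulsion):
    """Net force between two compound atoms; a positive value means attraction.

    Between two atoms, each made of one particle of kind A and one of kind B,
    there are two unlike pairings (A with B, B with A) and two like pairings
    (A with A, B with B).
    """
    unlike_pairs = 2
    like_pairs = 2
    return unlike_pairs * attraction - like_pairs * repulsion

balanced = net_force(1.0, 1.0)      # equal forces: neither attractive nor repulsive
gravitating = net_force(1.01, 1.0)  # attraction slightly in the ascendant
print(balanced, gravitating > 0)
```

With equal forces the atoms are indifferent to each other; any excess of the attractive power, however small, leaves a residual attraction between every pair of atoms.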
1850. Let the suggestions of Mossotti be modified, so far as that
the extremities of each particle, whether of one or the other kind,
are to be considered as endowed with opposite polarities, like those
of the magnetic needle, as already suggested in the case of matter
in general. Then in one relative position of the extremities they may
be reciprocally repulsive, in the other reciprocally attractive; likewise
one of the kinds of matter, like the light-producing ether of the
undulationists, may pervade the universe, and be condensed in a
peculiarly great quantity within perfect conductors: all this being
premised, it may be conceived how the waves of opposite
polarization, which proceed from oppositely electrified, or in other
words, oppositely polarized bodies, cause the matter through which
they pass to be decomposed or explosively rent.
1851. As elsewhere stated, in large bodies of water, waves are the
effect of transference of motion successively from one part of the
mass to the other; the rolling of the wave causing nothing to pass
but the motion, and of course, the momentum is invariably
consequent to motion. The waves by which sound is transmitted, are
analogous; nothing being transferred excepting a vibration of the air,
capable of affecting the tympanum of the ear with the impression
requisite to create in the sensorium the idea of sound.
1852. Any affection of matter, capable of existing in successive
parts of a material body, so that while the body is stationary, the
affection passes from one part of the mass to others, may be
considered as a wave of that affection, as reasonably as the
affection called momentum is considered as producing a wave in
water, when passing through it, as above described. It is in this way
that I consider that the term wave of polarization may be applied to
an affection of matter consisting of an abnormal position of the poles
of the constituent particles, successively induced in rows of atoms,
so as to proceed from one part of the series to the other.
1853. And as two sets of waves, of which the hollows of one
should correspond with the elevations of the others, would, by being
associated, produce an even surface and equalization of the
momentum in the aqueous liquid, so, in opposite polarities, there
might be reciprocal neutralization by the coming together of the
polarities.
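The neutralization of two sets of waves described in paragraph 1853 can be shown with a simple sum. This is a minimal sketch, assuming sinusoidal waves of equal height and length, one displaced by half a wavelength so that its hollows meet the elevations of the other; the choice of a sine wave and the sampling points are assumptions.

```python
import math

def wave(x, phase=0.0):
    """Height of a sinusoidal wave of unit amplitude at position x."""
    return math.sin(x + phase)

# Sample the surface at a row of points along the water.
xs = [i * 0.1 for i in range(100)]

# The second wave is shifted by half a wavelength (a phase of pi).
surface = [wave(x) + wave(x, math.pi) for x in xs]

# Largest departure of the combined surface from a level plane.
largest = max(abs(height) for height in surface)
print(largest < 1e-12)
```

The combined surface is level to within rounding error, which corresponds to the even surface that the paragraph takes as its analogy for the reciprocal neutralization of opposite polarities.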

On Electro-polarity as the Cause of Electrical Phenomena.

1854. Agreeably to the view which I take of the present state of
our electrical knowledge, the phenomena designated under the
name of electricity are due entirely to a process which I designate as
polarization, and the consequences thereof. Those attractions and
repulsions which have been found to exist between particles of
matter, instead of being an endowment of the whole mass of each
particle, seem confined, as already suggested, to particular
terminations or spots, as we see this property on a larger scale in
the loadstone or natural magnet. In the body long known under this
appellation, the attractive power which it exercises is displayed
usually at two distinct portions of its superficies, which are called
poles. When a piece of steel wire is duly rubbed by either of these
poles, it acquires a similar attractive polarity, which always appears
at the extremities. When formed into an appropriate shape and
freely suspended, such a wire magnet constitutes the compass
needle, having the wonderful and all-important faculty of arranging
itself within a meridian plane, so as to be always nearly north and
south; the same pole invariably pointing in the same direction. The
poles are named from the quarter to which they point, one being
called the north pole of the needle, the other the south pole. This
involves that the north pole of the earth itself has nominally south
polarity; the south pole, north polarity.
1855. When two suspended compass needles are sufficiently
approximated, it will be seen that between the poles which point in
the same direction, there is repulsion; between those which point in
different directions, attraction. When the dissimilar poles are brought
into contact, they adhere; and if left cohering, will continue attached
for any length of time; and while in that state of coherence, the
magnetic power of the poles thus touching, being neutralized,
disappears.[41]
1856. If two needles be laid parallel, an interval between them,
the extremities being made to communicate by applying two wires of
suitable dimensions, also parallel to each other, the magnetic power
will be neutralized.
1857. It is inferred that analogous phenomena take place in the
particles of masses or surfaces which are endowed with chemical
affinity or even cohesive attraction.
1858. It is to the existence of the power by which these effects
are caused, at opposite terminations, that bodies, in congealing or
freezing from the state of liquidity, shoot into prismatic, oblong,
regular forms, called crystals. This is illustrated in the formation of
ice, which is seen to shoot into such prismatic crystals.
1859. When a pane of glass is so situated as to have the focus of
a solar microscope thrown upon any spot, so that the glass thus
affected may be between the eye of an observer and the
microscope, any small crystals formed are greatly magnified. Hence
if the focal space be moistened with a solution of certain salts, the
solvent evaporating, crystallization ensues, and is seen to form
appropriate figures for each salt employed. It is owing to this
property that when certain solutions of various substances are
evaporated, the soluble solid, as it is deposited from the solvent,
arranges itself longitudinally; one atom attaching itself to the pole of
another, until it creeps over the sides of the vessel in great quantity.
The appearance of arborescence in certain minerals is thus
accounted for. When an amalgam of mercury with silver is hung by a
platina wire within a bottle of a solution of silver in nitric acid, there
is formed a beautiful branching of silver filaments. These are longer,
though more slowly formed, as the solution is more dilute. In very
dilute solutions I have seen prisms of silver of more than an inch in
length, so delicate, that but for the brilliancy of the surface they
could not have been detected by the eye.
1860. Farraday distinguished two kinds of polarity—ferro-magnetic
and dia-magnetic. That above described as taking place between
steel magnets is designated as ferro-magnetic. Dia-magnetic
particles under magnetic influence take position at right angles to
that which would ensue from ferro-magnetism.
1861. This explanation being premised to enable the student to
comprehend what is meant by polarity, I will proceed to explain
electric phenomena, according to the theory which I hold.
1862. It is expected that the preceding discussions have prepared
the reader to conceive that the atoms of all ponderable matter are
endowed with two analogous but opposite polar powers, which we
term polarity. That in any two atoms the dissimilar polar powers tend
to make them unite, the similar powers having the opposite
tendency. That in any inert mass the opposite powers or polarities
are in contact, and thus reciprocally neutralized.
1863. It will be also understood that the ethereal fluid which
pervades the universe as the means of illumination is assumed to
consist in like manner of atoms or particles which are endowed with
polarity, so that when the opposite poles are in proximity, there is
neutralization: repulsion, and disturbance, when similar poles are
approximated. This being premised, the allegation may be
intelligible, that when bodies are electrified, the poles of the
component atoms or particles are conceived to be deranged from
their natural position of reciprocal neutralization, so that they react
with exterior bodies, disturbing the poles of their constituent
particles, and thus electrifying them by induction.
1864. This abnormal state of disturbance, is conceived to be
produced on glass or resin, or any electric, when duly subjected to
friction.
1865. Thus when in an electric machine a vitreous surface is
rubbed by a leather cushion, the particles both of the leather and
glass surfaces are deranged from their natural state of reciprocal
neutralization, and present their poles in an active state, and the
glass surface, moving through the ethereal medium, (812) polarizes
it as it passes, the ether resuming its normal state till the ethereal
atmosphere over the conductor is reached. To that it imparts durable
polarity; the metallic superficies of the conductor taking the opposite
state, so that the charge is retained until the glass goes to and
returns from the cushion, with a farther supply of polarity.
1866. The charge of polarization received by the plates at each
succeeding revolution of the plate or cylinder, is divided with the
ethereal atmosphere over the conductor, and this process is
reiterated till the frictional power has accomplished its maximum
effect. Then the conductor is said to be charged positively, according
to the theory of one fluid, and vitreously, according to that of Dufay,
or the theory of two fluids. Meanwhile, if the cushion communicates
duly with an insulated conductor, a process perfectly analogous to
that just described has been charging that conductor, pari passu,
with the one first mentioned. By these means we have two excited
or charged conductors.
1867. If, before charging these conductors, two scalps of hair be
severally situated on them, it will be perceived that, as the charging
proceeds, the hairs on each of the scalps rise, and endeavour to
keep away from each other. But, meanwhile, the whole of the hair
on either is attracted by that on the other conductor. Moreover, on
touching both conductors with any metallic rod, simultaneously, the
whole of the excitement disappears, and the hairs assume their
normal position.
1868. In producing this discharge, iron is not more effective than
any other metal. It is, in fact, known to be less competent for this
species of conduction, than copper, silver, or gold.
1869. When the conductors are excited they have a powerful
effect upon gold leaves, suspended as in the electrometer.
1870. The state of the conductors, when excited, as described
here, is said to be static. Such a state of excitement is distinguished
as a statical charge of electricity.
1871. In the next place, if we procure a horse-shoe magnet, lay it
on a table, cover it with a sheet of paper, and then sift over it iron
filings, we shall see the shape of the magnet delineated upon the
paper, by the filings arranging themselves above its corners in
preference. But as the sifting proceeds, the filings will be seen to
extend themselves in filaments, so as very much to resemble the
electrified hair above described. A tuft of the ferruginous filaments
will be formed upon each pole of the magnet, each filament avoiding
its neighbours, as far as possible. But while each filament, in either
tuft, avoids every other in its appropriate tuft, the whole of the
filaments in one, are attracted by those in the other. Thus, the
charges of polarity which cause each similarly polarized filament to
avoid those in the same state, induce those polarized by one of the
poles of the magnet, to attract such as are polarized by the other
pole of the magnet.
1872. Here is, so far, a great analogy between the phenomena of
the polarization of filings and the polarization of the hair, above
described. But then there is this difference: excepting iron, cobalt,
and nickel, there is no metal which can, by contact with the poles of
a magnet, neutralize the polarity by which the iron filings are
affected; and even these metals produce this result by a process,
the inverse of that by which charges of statical electricity are
neutralized. In fact, the magnetic metal, far from acting as a
discharger, acts as a keeper; and a piece of iron, of a suitable shape,
applied to the terminations of a horse-shoe magnet, prevents the
gradual diminution of the magnetism, which otherwise ensues.
Hence the name keeper is applied to it, as well as armature, derived
from the French.
1873. It will be perceived that, in a steel magnet, the charges are
sustained at the terminations of a conductor, which, as estimated by
Cavendish, conducts electricity with a velocity two hundred thousand
times as great as water.
1874. The charge of the conductor of the machine is superficial, a
gilt globe of glass holding as good a charge as a solid globe of
metal; and, moreover, in this superficial charge, the ether and the air
participate, undergoing a polar affection, analogous to that of the
filings exposed to the influence of the magnet.
1875. On the other hand, in the use of the steel magnet, the
charge is internal, and, other things being equal, increases with the
quantity of iron charged; neither the air nor the ether participates in
this magnetic charge. There is no mode in which the charges of the
poles of a magnet can be made to pass from one to the other,
through any interposed conducting mass.
1876. The retention of the charge seems to be dependent upon a
state of the particles in which they are capable of being deranged
from their normal position with a certain degree of extraneous
influence, and can only resume their natural relative position by a
contrary application of a similar agent. Although steel differs from
iron only in containing, as an ingredient, one-fiftieth of carbon, this
gives it the highly valuable property of hardening, when suddenly
refrigerated; a result which may be accounted for by supposing that,
in consequence of the sudden exposure to a powerful conducting
medium, there is a sort of a jerk by which the particles lose from
their midst an undue portion of their ethereal constituents, and
cannot recover their normal arrangement after the refrigeration.
When this effect is reached to a maximum, the steel is so brittle as
sometimes to fly into two or more pieces when left to itself. When
soft iron is subjected to the magnetizing process, it exchanges one
polarity for the other with such speed, that, in some electro-
magnetic instruments, this reversal is effected more than one
hundred times in a second; but precisely in proportion as the
magnetism is readily received, it is more readily lost. On the other
side, when hardened to a maximum, steel can scarcely be
magnetized at all. Thus, to have a permanent magnet, we must
employ the metal in a state of induration between the extremes.
These facts tend to corroborate the inference that magnetism is
dependent on the relative position of the ferruginous particles. It is
presumed that the ferruginous particles of which the filings consist
indicate, by their direction, as seen externally, the direction in which
the constituent particles of the magnet are situated beneath the
metallic surface.[42]
1877. If to a wire, connecting the poles of a galvanic battery, iron
filings are applied, each ferruginous particle becomes a little magnet,
and displays exactly the same disposition to unite in filaments as has
been represented to take place when they are exposed upon a sheet
of paper, to the influence of a magnet supporting it. But while this
affection is thus identical with that induced by the steel magnet, it
differs therefrom, in its being as transient as the galvanic discharges
to which it owes its existence. These, at the lowest estimate, are
sufficiently rapid to go round the globe in two seconds; whence it
may be conceived that the time taken to percur a few inches of wire
must be almost infinitely brief. Hence, although the filings continue
in a state of magnetization so long as the action of the battery is
sustained, and the wire kept in due contact with the poles of the
battery, it is only by a rapid reiteration of discharges, that this result
is effected.
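The timing argument in paragraph 1877 can be checked with a little arithmetic. The sketch below is only illustrative: the Earth's circumference (about 40,075 km at the equator) and the reading of "a few inches" as 6 inches are assumptions supplied here, not figures from the text, and the function name is hypothetical.

```python
# Rough check of the timing argument in paragraph 1877.
# Assumptions (not from the text): an equatorial circumference of about
# 40,075 km, and "a few inches" taken as 6 inches.

EARTH_CIRCUMFERENCE_M = 40_075_000.0  # assumed value, metres
INCH_M = 0.0254                       # one inch in metres (exact)


def transit_time_s(wire_length_in, circuit_time_s=2.0):
    """Time to traverse a wire at the speed implied by circling the globe
    once in circuit_time_s seconds (the text's 'lowest estimate')."""
    speed_m_per_s = EARTH_CIRCUMFERENCE_M / circuit_time_s
    return wire_length_in * INCH_M / speed_m_per_s


if __name__ == "__main__":
    # Print the transit time for the assumed 6-inch wire.
    print(f"{transit_time_s(6.0):.2e} s")
```

Under these assumptions the transit time is a few billionths of a second, which agrees with the text's description of it as "almost infinitely brief".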
1878. As the relative position of the particles composing the steel
magnet has been inferred to be indicated by that of the movable
filings which they influence, we may suppose the position of the
particles composing the wire, to be indicated by that which the
filings take by which it is encircled. These are situated always as if
forming tangents to the circumference of the wire, and hence it may
be perceived that the metallic particles, forming the wire, have been
shifted from their normal position, parallel to the axis, so as to take
that tangential direction which the magnetization evinces.