HANDBOOK OF STATISTICAL ANALYSIS
AND DATA MINING APPLICATIONS
SECOND EDITION
AUTHORS
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or
mechanical, including photocopying, recording, or any information storage and retrieval system, without
permission in writing from the publisher. Details on how to seek permission, further information about the
Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center
and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other
than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our
understanding, changes in research methods, professional practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using
any information, methods, compounds, or experiments described herein. In using such information or methods
they should be mindful of their own safety and the safety of others, including parties for whom they have a
professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability
for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or
from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
ISBN 978-0-12-416632-5
List of Tutorials on the Elsevier Companion Web Page
Note: This list includes all the extra tutorials published with the 1st edition of this handbook (2009). These can be considered “enrichment” tutorials for readers of this 2nd edition. Since the 1st edition of the handbook will not be available after the release of the 2nd edition, these extra tutorials are carried over in their original format/versions of software, as they are still very useful in learning and understanding data mining and predictive analytics, and many readers will want to take advantage of them.
List of Extra Enrichment Tutorials that are only on the ELSEVIER COMPANION web page, with data sets as appropriate, for downloading and use by readers of this 2nd edition of handbook:
1. TUTORIAL “O”—Boston Housing Using Regression Trees [Field: Demographics]
2. TUTORIAL “P”—Cancer Gene [Field: Medical Informatics & Bioinformatics]
3. TUTORIAL “Q”—Clustering of Shoppers [Field: CRM—Clustering Techniques]
4. TUTORIAL “R”—Credit Risk using Discriminant Analysis [Field: Financial—Banking]
5. TUTORIAL “S”—Data Preparation and Transformation [Field: Data Analysis]
6. TUTORIAL “T”—Model Deployment on New Data [Field: Deployment of Predictive Models]
7. TUTORIAL “V”—Heart Disease Visual Data Mining Methods [Field: Medical Informatics]
8. TUTORIAL “W”—Diabetes Control in Patients [Field: Medical Informatics]
9. TUTORIAL “X”—Independent Component Analysis [Field: Separating Competing Signals]
10. TUTORIAL “Y”—NTSB Aircraft Accidents Reports [Field: Engineering—Air Travel—Text Mining]
11. TUTORIAL “Z”—Obesity Control in Children [Field: Preventive Health Care]
12. TUTORIAL “AA”—Random Forests Example [Field: Statistics—Data Mining]
13. TUTORIAL “BB”—Response Optimization [Field: Data Mining—Response Optimization]
14. TUTORIAL “CC”—Diagnostic Tooling and Data Mining: Semiconductor Industry [Field: Industry—Quality Control]
15. TUTORIAL “DD”—Titanic—Survivors of Ship Sinking [Field: Sociology]
16. TUTORIAL “EE”—Census Data Analysis [Field: Demography—Census]
17. TUTORIAL “FF”—Linear & Logistic Regression—Ozone Data [Field: Environment]
18. TUTORIAL “GG”—R-Language Integration—DISEASE SURVIVAL ANALYSIS Case Study [Field: Survival Analysis—Medical Informatics]
19. TUTORIAL “HH”—Social Networks Among Community Organizations [Field: Social Networks—Sociology & Medical Informatics]
20. TUTORIAL “II”—Nairobi, Kenya Baboon Project: Social Networking
Foreword 1 for 1st Edition
This book will help the novice user become familiar with data mining. Basically, data mining is doing data analysis (or statistics) on data sets (often large) that have been obtained from potentially many sources. As such, the miner may not have control of the input data, but must rely on sources that have gathered the data. There are therefore problems that every data miner must be aware of as he or she begins (or completes) a mining operation. I strongly resonated to the material on “The Top 10 Data Mining Mistakes,” which gives a worthwhile checklist:
• Ensure you have a response variable and predictor variables, and that they are correctly measured.
• Beware of overfitting. With scads of variables, it is easy with most statistical programs to fit incredibly complex models, but they cannot be reproduced. It is good to save part of the sample to use to test the model. Various methods are offered in this book.
• Don't use only one method. Using only linear regression can be a problem. Try dichotomizing the response or categorizing it to remove nonlinearities in the response variable. Often, there are clusters of values at zero, which messes up any normality assumption. This, of course, loses information, so you may want to categorize a continuous response variable and use an alternative to regression. Similarly, predictor variables may need to be treated as factors rather than linear predictors. A classic example is using marital status or race as a linear predictor when there is no order.
• Asking the wrong question: when looking for a rare phenomenon, it may be helpful to identify the most common pattern. These may lead to complex analyses, as in item 3, but they may also be conceptually simple. Again, you may need to take care that you don't overfit the data.
• Don't become enamored with the data. There may be a substantial history from earlier data or from domain experts that can help with the modeling.
• Be wary of using an outcome variable (or one highly correlated with the outcome variable) and becoming excited about the result. The predictors should be “proper” predictors in the sense that they (a) are measured prior to the outcome and (b) are not a function of the outcome.
• Do not discard outliers without solid justification. Just because an observation is out of line with others is insufficient reason to ignore it. You must check the circumstances that led to the value. In any event, it is useful to conduct the analysis with the observation(s) included and excluded to determine the sensitivity of the results to the outlier.
• Extrapolating is a fine way to go broke; the best example is the stock market. Stick within your data, and if you must go outside, put plenty of caveats. Better still, restrain the impulse to extrapolate. Beware that pictures are often far too simple and we can be misled. Political campaigns oversimplify complex problems (“my opponent wants to raise taxes”; “my opponent will take us to war”) when the realities may imply we have some infrastructure needs that can be handled only with new funding or we have been attacked by some bad guys.
Be wary of your data sources. If you are combining several sets of data, they need to meet a few standards:
• The definitions of variables that are being merged should be identical. Often, they are close but not exact (especially in meta-analysis where clinical studies may have somewhat different definitions due to different medical institutions or laboratories).
• Be careful about missing values. Often, when multiple data sets are merged, missing values can be induced: one variable isn't present in another data set; what you thought was a unique variable name was slightly different in the two sets, so you end up with two variables that both have a lot of missing values.
• How you handle missing values can be crucial. In one example, I used complete cases and lost half of my sample; all variables had at least 85% completeness, but when put together, the sample lost half of the data. The residual sum of squares from a stepwise regression was about 8. When I included more variables using mean replacement, almost the same set of predictor variables surfaced, but the residual sum of squares was 20. I then used multiple imputation and found approximately the same set of predictors but had a residual sum of squares (median of 20 imputations) of 25. I find that mean replacement is rather optimistic but surely better than relying on only complete cases. Using stepwise regression, I find it useful to replicate it with a bootstrap or with multiple imputations. However, with large data sets, this approach may be expensive computationally.
To conclude, there is a wealth of material in this handbook that will repay study.

Peter A. Lachenbruch
Oregon State University, Corvallis, OR, United States
American Statistical Association, Alexandria, VA, United States
Johns Hopkins University, Baltimore, MD, United States
UCLA, Los Angeles, CA, United States
University of Iowa, Iowa City, IA, United States
University of North Carolina, Chapel Hill, NC, United States
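The foreword's point about complete cases can be made concrete with a short sketch. It is illustrative only: the toy rows, the function names, and above all the assumption that values go missing independently across variables are ours, not the foreword author's. Under that assumption the share of fully observed rows is the per-variable completeness raised to the number of variables, which shows how variables that are each 85% complete can jointly cost about half the sample.

```python
def complete_case_fraction(completeness, n_variables):
    """Expected share of rows with no missing value.

    Assumes (hypothetically) that missingness is independent across variables.
    """
    return completeness ** n_variables


def complete_cases(rows):
    """Keep only the rows in which every value is observed (not None)."""
    return [row for row in rows if all(value is not None for value in row)]


def mean_replace(rows):
    """Replace each missing value with the mean of its column's observed values."""
    means = []
    for column in zip(*rows):
        observed = [value for value in column if value is not None]
        means.append(sum(observed) / len(observed))
    return [
        tuple(mean if value is None else value for value, mean in zip(row, means))
        for row in rows
    ]


# Toy data set (made up for illustration): two variables, four rows.
rows = [(1.0, 10.0), (2.0, None), (None, 30.0), (4.0, 40.0)]

kept = complete_cases(rows)    # rows that survive complete-case deletion
filled = mean_replace(rows)    # every row survives, with gaps filled in

# Share of complete rows for 1 to 5 variables that are each 85% complete.
fractions = {k: round(complete_case_fraction(0.85, k), 3) for k in range(1, 6)}

print(len(kept), len(filled))
print(fractions)
```

With four or five such variables, roughly half of the rows remain, which is in line with the loss described in the foreword. Mean replacement keeps every row, but it fills gaps with a single typical value, so it should be read as a baseline rather than as a substitute for multiple imputation.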
Foreword 2 for 1st Edition
However, the book is best read a few chapters at a time while actively doing the data mining rather than read cover to cover (a daunting task for a book this size). Practitioners will appreciate tutorials that match their business objectives and choose to ignore other tutorials. They may choose to read sections on a particular algorithm to increase insight into that algorithm and then decide to add a second algorithm after the first is mastered. For those new to a particular software tool highlighted in the tutorials section, the step-by-step approach will operate much like a user's manual. Many chapters stand well on their own, such as the excellent “History of Statistics and Data Mining” chapter and chapters 16, 17, and 18. These are broadly applicable and should be read by even the most experienced data miners.
The Handbook of Statistical Analysis and Data Mining Applications is an exceptional book that should be on every data miner's bookshelf or, better yet, found lying open next to their computer.

Dean Abbott
Abbott Analytics, San Diego, CA, United States
Preface
turn the key in the ignition, step on the gas and the brake at the right times, and turn the wheel to change direction in a safe manner, and voilà, you are an expert user of the very complex technology under the hood. The other half of the story is the instruction manual and the driver's education course that help you to learn how to drive.
This book provides the instruction manual and a series of tutorials to train you how to do data mining in many subject areas. We provide both the right tools and the right intuitive explanations (rather than formal mathematical definitions) of the data mining process and algorithms, which will enable even beginner data miners to understand the basic concepts necessary to understand what they are doing. In addition, we provide many tutorials in many different industries and businesses (using many of the most common data mining tools) to show how to do it.

OVERALL ORGANIZATION OF THIS BOOK
We have divided the chapters in this book into four parts to guide you through the aspects of predictive analytics. Part I covers the history and process of predictive analytics. Part II discusses the algorithms and methods used. Part III is a group of tutorials, which serve in principle as Rome served: as the central governing influence. Part IV presents some advanced topics. The central theme of this book is the education and training of beginning data mining practitioners, not the rigorous academic preparation of algorithm scientists. Hence, we located the tutorials in the middle of the book in Part III, flanked by topical chapters in Parts I, II, and IV.
This approach is “a mile wide and an inch deep” by design, but there is a lot packed into that inch. There is enough here to stimulate you to take deeper dives into theory, and there is enough here to permit you to construct “smart enough” business operations with a relatively small amount of the right information. James Taylor developed this concept for automating operational decision-making in the area of enterprise decision management (Raden and Taylor, 2007). Taylor recognized that companies need decision-making systems that are automated enough to keep up with the volume and time-critical nature of modern business operations. These decisions should be deliberate, precise, and consistent across the enterprise; smart enough to serve immediate needs appropriately; and agile enough to adapt to new opportunities and challenges in the company. The same concept can be applied to nonoperational systems for customer relationship management (CRM) and marketing support. Even though a CRM model for cross sell may not be optimal, it may enable several times the response rate in product sales following a marketing campaign. Models like this are “smart enough” to drive companies to the next level of sales. When models like this are proliferated throughout the enterprise to lift all sales to the next level, more refined models can be developed to do even better. This enterprise-wide “lift” in intelligent operations can drive a company through evolutionary rather than revolutionary changes to reach long-term goals. Companies can leverage “smart enough” decision systems to do likewise in their pursuit of optimal profitability in their business.
Clearly, the use of this book and these tools will not make you experts in data mining. Nor will the explanations in the book permit you to understand the complexity of the theory behind the algorithms and methodologies so necessary for the academic student. But we will conduct you through a relatively thin slice across the wide practice of data mining in many industries and disciplines. We can show you how to create powerful predictive models in your own organization in a relatively short period of time. In addition, this book can function as a springboard to launch you into higher-level studies of the theory behind the practice of data mining. If we can accomplish those goals, we will have succeeded in taking a significant step in bringing the practice of data mining into the mainstream of business analysis.
The three coauthors could not have done this book completely by themselves, and we wish to thank the following individuals, with the disclaimer that we apologize if, by our neglect, we have left out of this “thank-you list” anyone who contributed.
Foremost, we would like to thank acquisitions editor (name to use?) and others (names). Bob Nisbet would like to honor and thank his wife, Jean Nisbet, PhD, who blasted him off in his technical career by retyping his PhD dissertation five times (before word processing) and assumed much of the family's burdens during the writing of this book. Bob also thanks Dr. Daniel B. Botkin, the famous global ecologist, for introducing him to the world of modeling and exposing him to the distinction between viewing the world as machine and viewing it as organism. And thanks are due to Ken Reed, PhD, for inducting Bob into the practice of data mining.
Coauthor Gary Miner wishes to thank his wife, Linda A. Winters-Miner, PhD, who has been working with Gary on similar books over the past 30 years and wrote several of the tutorials included in this book, using real-world data. Gary also wishes to thank the following people from his office who helped in various ways, including Angela Waner, Jon Hillis, Greg Sergeant, and Dr. Thomas Hill, who gave permission to use and also edited a group of the tutorials that had been written over the years by some of the people listed as guest authors in this book. Dr. Dave Dimas, of the University of California, Irvine, has also been very helpful in providing suggestions for enhancements for this second edition. THANK YOU DAVE!!!
Without all the help of the people mentioned here and maybe many others we failed to specifically mention, this book would never have been completed. Thanks to you all!

Bob Nisbet
Gary Miner
Ken Yale

Reference
Raden, N., Taylor, J., 2007. Smart Enough Systems: How to Deliver Competitive Advantage by Automating Hidden Decisions. Prentice Hall, NJ, ISBN: 9780132713061.
Introduction
Often, data analysts are asked, “What are statistical analysis and data mining?” In this book, we will define what data mining is from a procedural standpoint. But most people have a hard time relating what we tell them to the things they know and understand. Before moving on into the book, we would like to provide a little background for data mining that everyone can relate to. The Preface describes the many changes in activities related to data mining since the first edition of this book was published in 2009. Now, it is time to dig deeper and discuss the differences between statistical analysis and data mining (aka predictive analytics).
Statistical analysis and data mining are two methods for simulating the unconscious operations that occur in the human brain to provide a rationale for decision-making and actions. Statistical analysis is a very directed rationale that is based on norms. We all think and make decisions on the basis of norms. For example, we consider (unconsciously) what the norm is for dress in a certain situation. Also, we consider the acceptable range of variation in dress styles in our culture. Based on these two concepts, the norm and the variation around that norm, we render judgments like “that man is inappropriately dressed.” Using similar concepts of mean and standard deviation, statistical analysis proceeds in a very logical way to make very similar judgments (in principle). On the other hand, data mining learns case by case and does not use means or standard deviations. Data mining algorithms build patterns, clarifying the pattern as each case is submitted for processing. These are two very different ways of arriving at the same conclusion, a decision. We will introduce some basic analytic history and theory in Chapters 1 and 2.
The basic process of analytic modeling is presented in Chapter 3. But it may be difficult for you to relate what is happening in the process without some sort of tie to the real world that you know and enjoy. In many ways, the decisions served by analytic modeling are similar to those we make every day. These decisions are based partly on patterns of action formed by experience and partly on intuition.

PATTERNS OF ACTION
A pattern of action can be viewed in terms of the activities of a hurdler on a race track. The runner must start successfully and run to the first hurdle. He must decide very quickly how high to jump to clear the hurdle. He must decide when and in what sequence to move his legs to clear the hurdle with minimum effort and without knocking it down. Then, he must run a specified distance to the next hurdle and do it all over again several times, until he crosses the finish line. Analytic modeling is a lot like that.
The training of the hurdler's “model” of action to run the race happens in a series of operations:
• Run slow at first.
• Practice takeoff from different positions to clear the hurdle.
• Practice different ways to move the legs.
• Determine the best ways to do each activity.
• Practice the best ways for each activity over and over again.
This practice trains the sensory and motor neurons to function together most efficiently. Individual neurons in the brain are “trained” in practice by adjusting signal strengths and firing thresholds of the motor nerve cells. The performance of a successful hurdler follows the “model” of these activities and the process of coordinating them to run the race. Creation of an analytic “model” of a business process to predict a desired outcome follows a very similar path to the training regimen of a hurdler. We will explore this subject further in Chapter 3 and apply it to develop a data mining process that expresses the basic activities and tasks performed in creating an analytic model.

HUMAN INTUITION
In humans, the right side of the brain is the center for visual and esthetic sensibilities. The left side of the brain is the center for quantitative and time-regulated sensibilities. Human intuition is a blend of both sensibilities. This blend is facilitated by the neural connections between the right side of the brain and the left side. In women, the number of neural connections between the right and left sides of the brain is 20% greater (on average) than in men. This higher connectivity of women's brains enables them to exercise intuitive thinking to a greater extent than men. Intuition “builds” a model of reality from both quantitative building blocks and visual sensibilities (and memories).

PUTTING IT ALL TOGETHER
Biological taxonomy students claim (in jest) that there are two kinds of people in taxonomy: those who divide things up into two classes (for dichotomous keys) and those who don't. Along with this joke is a similar recognition from the outside that taxonomists are divided also into two classes: the “lumpers” (who combine several species into one) and the “splitters” (who divide one species into many). These distinctions point to a larger dichotomy in the way people think.
In ecology, there used to be two schools of thought: autoecologists (chemistry, physics, and mathematics explain all) and the synecologists (organism relationships in their environment explain all). It wasn't until the 1970s that these two schools of thought learned that both perspectives were needed to understand the complexities in ecosystems (but more about that later). In business, there are the “big picture” people versus “detail” people. Some people learn by following an intuitive pathway from general to specific (deduction). Often, we call them “big picture” people. Other people learn by following an intuitive pathway from specific to general (induction). Often, we call them “detail” people. Similar distinctions are reflected in many aspects of our society. In Chapter 1, we will explore this distinction in greater depth with regard to the development of statistical and data mining theory through time.
Many of our human activities involve finding patterns in the data input to our sensory systems. An example is the mental pattern that we develop by sitting in a chair in the middle of a shopping mall and making some judgment about patterns among its clientele. In one mall, people of many ages and races may intermingle. You might conclude from this pattern that this mall is located in an ethnically diverse area. In another mall, you might see a very different pattern. In one mall in Toronto, a great many of the stores had Chinese titles and script on the windows. One observer noticed that he was the only non-Asian seen for a half hour. This led to the conclusion that the mall catered to the Chinese community and was owned
The Project Gutenberg eBook of Alcoholic
Fermentation
This ebook is for the use of anyone anywhere in the United
States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it away
or re-use it under the terms of the Project Gutenberg License
included with this ebook or online at www.gutenberg.org. If you
are not located in the United States, you will have to check the
laws of the country where you are located before using this
eBook.
Language: English
Arthur Harden
MONOGRAPHS ON BIOCHEMISTRY
EDITED BY
R. H. A. PLIMMER, D.Sc.
AND
F. G. HOPKINS, M.A., M.B., D.Sc., F.R.S.
GENERAL PREFACE.
MONOGRAPHS ON BIOCHEMISTRY
Royal 8vo.
THE NATURE OF ENZYME ACTION. By W. M. Bayliss, D.Sc., F.R.S. Third Edition. 5s.
net.
THE CHEMICAL CONSTITUTION OF THE PROTEINS. By R. H. A. Plimmer, D.Sc. Part
I.—Analysis. Second Edition, Revised and Enlarged. 5s. 6d. net. Part II.—
Synthesis, etc. Second Edition, Revised and Enlarged. 3s. 6d. net.
THE GENERAL CHARACTERS OF THE PROTEINS. By S. B. Schryver, Ph.D., D.Sc. 2s.
6d. net.
THE VEGETABLE PROTEINS. By Thomas B. Osborne, Ph.D. 3s. 6d. net.
THE SIMPLE CARBOHYDRATES AND THE GLUCOSIDES. By E. Frankland Armstrong,
D.Sc., Ph.D. Second Edition, Revised and Enlarged. 5s. net.
THE FATS. By J. B. Leathes, F.R.S., M.A., M.B., F.R.C.S. 4s. net.
ALCOHOLIC FERMENTATION. By A. Harden, Ph.D., D.Sc., F.R.S. Second Edition. 4s.
net.
THE PHYSIOLOGY OF PROTEIN METABOLISM. By E. P. Cathcart, M.D., D.Sc. 4s.
6d. net.
SOIL CONDITIONS AND PLANT GROWTH. By E. J. Russell, D.Sc. 5s. net.
OXIDATIONS AND REDUCTIONS IN THE ANIMAL BODY. By H. D. Dakin, D.Sc.,
F.I.C. 4s. net.
THE SIMPLER NATURAL BASES. By G. Barger, M.A., D.Sc. 6s. net.
NUCLEIC ACIDS. THEIR CHEMICAL PROPERTIES AND PHYSIOLOGICAL CONDUCT.
By Walter Jones, Ph.D. 3s. 6d. net.
THE DEVELOPMENT AND PRESENT POSITION OF BIOLOGICAL CHEMISTRY. By F.
Gowland Hopkins, M.A., M.B., D.Sc., F.R.S.
THE POLYSACCHARIDES. By Arthur R. Ling, F.I.C.
COLLOIDS. By W. B. Hardy, M.A., F.R.S.
RESPIRATORY EXCHANGE IN ANIMALS. By A. Krogh, Ph.D.
PROTAMINES AND HISTONES. By A. Kossel, Ph.D.
LECITHIN AND ALLIED SUBSTANCES. By H. Maclean, M.D., D.Sc.
THE ORNAMENTAL PLANT PIGMENTS. By A. G. Perkin, F.R.S.
CHLOROPHYLL AND HAEMOGLOBIN. By H. J. Page, B.Sc.
ORGANIC COMPOUNDS OF ARSENIC AND ANTIMONY. By Gilbert T. Morgan, D.Sc.,
F.I.C.
SECOND EDITION
In the New Edition no change has been made in the scope of the
work. The rapid progress of the subject has, however, rendered
necessary many additions to the text and a considerable increase in
the bibliography.
A. H.
May, 1914.
CONTENTS.
CHAPTER                                                            PAGE
I.     Historical Introduction                                        1
II.    Zymase and its Properties                                     18
III.   The Function of Phosphates in Alcoholic Fermentation          41
IV.    The Co-Enzyme of Yeast-Juice                                  59
V.     Action of Some Inhibiting and Accelerating Agents
       on the Enzymes of Yeast-Juice                                 70
VI.    Carboxylase                                                   81
VII.   The By-Products of Alcoholic Fermentation                     85
VIII.  The Chemical Changes involved in Fermentation                 96
IX.    The Mechanism of Fermentation                                119
       Bibliography                                                 136
       Index                                                        155
CHAPTER I.
HISTORICAL INTRODUCTION.
[p001]
The problem of alcoholic fermentation, of the origin and nature of
that mysterious and apparently spontaneous change which
converted the insipid juice of the grape into stimulating wine, seems
to have exerted a fascination over the minds of natural philosophers
from the very earliest times. No date can be assigned to the first
observation of the phenomena of the process. History finds man in
the possession of alcoholic liquors, and in the earliest chemical
writings we find fermentation, as a familiar natural process, invoked
to explain and illustrate the changes with which the science of those
early days was concerned. Throughout the period of alchemy
fermentation plays an important part; it is, in fact, scarcely too much
to say that the language of the alchemists and many of their ideas
were founded on the phenomena of fermentation. The subtle
change in properties permeating the whole mass of material, the
frothing of the fermenting liquid, rendering evident the vigour of the
action, seemed to them the very emblems of the mysterious process
by which the long sought for philosopher's stone was to convert the
baser metals into gold. As chemical science emerged from the mists
of alchemy, definite ideas about the nature of alcoholic fermentation
and of putrefaction began to be formed. Fermentation was
distinguished from other chemical changes in which gases were
evolved, such as the action of acids on alkali carbonates (Sylvius de
le Boë, 1659); the gas evolved was examined and termed gas
vinorum, and was distinguished from the alcohol with which it had at
first been confused (van Helmont, 1648); afterwards it was found
that like the gas from potashes it was soluble in water (Wren, 1664).
The gaseous product of fermentation and putrefaction was identified
by MacBride, in 1764, with the fixed air of Black, whilst Cavendish in
1766 showed that fixed air alone was evolved in alcoholic
fermentation and that a mixture of this with inflammable air was
produced by putrefaction. In the meantime it had been recognised
that only sweet liquors could be fermented ("Ubi notandum, nihil
fermentare quod non sit dulce," Becher, 1682), and finally Cavendish
[p002] [1776] determined the proportion of fixed air obtainable from
sugar by fermentation and found it to be 57 per cent. It gradually
became recognised that fermentation might yield either spirituous or
acid liquors, whilst putrefaction was thought to be an action of the
same kind as fermentation, differing mainly in the character of the
products (Becher).
As regards the nature of the process very confused ideas at first
prevailed, but in the time of the phlogistic chemists a definite theory
of fermentation was proposed, first by Willis (1659) and afterwards
by Stahl [1697], the fundamental idea of which survived the
overthrow of the phlogistic system by Lavoisier and formed the
foundation of the views of Liebig. To explain the spontaneous origin
of fermentation and its propagation from one liquid to another, they
supposed that the process consisted in a violent internal motion of
the particles of the fermenting substance, set up by an aqueous
liquid, whereby the combination of the essential constituents of this
material was loosened and new particles formed, some of which
were thrust out of the liquid (the carbon dioxide) and others
retained in it (the alcohol).
Stahl specifically states that a body in such a state of internal
disquietude can very readily communicate the disturbance to
another, which is itself at rest but is capable of undergoing a similar
change, so that a putrefying or fermenting liquid can set another
liquid in putrefaction or fermentation.
Taking account of the gradual accumulation of fact and theory we
find at the time of Lavoisier, from which the modern aspect of the
problem dates, that Stahl's theoretical views were generally
accepted. Alcoholic fermentation was known to require the presence
of sugar and was thought to lead to the production of carbon
dioxide, acetic acid, and alcohol.
The composition of organic compounds was at that time not
understood, and it was Lavoisier who established the fact that they
consisted of carbon, hydrogen, and oxygen, and who made
systematic analyses of the substances concerned in fermentation
(1784–1789). Lavoisier [1789] applied the results of these analyses
to the study of alcoholic fermentation, and by employing the
principle which he regarded as the foundation of experimental
chemistry, "that there is the same quantity of matter before and
after the operation," he drew up an equation between the quantities
of carbon, hydrogen, and oxygen in the original sugar and in the
resulting substances, alcohol, carbon dioxide, and acetic acid,
showing that the products contained the whole matter of the sugar,
and thus for the first time giving a clear view of the chemical [p003]
change which occurs in fermentation. The conclusion to which he
came was, we now know, very nearly accurate, but the research
must be regarded as one of those remarkable instances in which the
genius of the investigator triumphs over experimental deficiencies,
for the analytical numbers employed contained grave errors, and it
was only by a fortunate compensation of these that a result so near
the truth was attained.
Lavoisier's equation or balance sheet was as follows:—
                                        Carbon.  Hydrogen.  Oxygen.
95·9 pounds of sugar (cane sugar)
  consist of                              26·8       7·7     61·4
These yield:
57·7 pounds of alcohol containing         16·7       9·6     31·4
35·3 pounds of carbon dioxide
  containing                               9·9         -     25·4
2·5 pounds of acetic acid
  containing                               0·6       0·2      1·7
                                          ----       ---     ----
Total contained in products               27·2       9·8     58·5
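The arithmetic of the balance sheet above can be checked with a short sketch. It only re-adds the figures as they are quoted here (in pounds) and compares the product totals with the composition given for the sugar; the variable and function names are illustrative assumptions and are not taken from Lavoisier or from the source.

```python
# Figures copied from the balance sheet quoted above, in pounds.
# Tuples are (carbon, hydrogen, oxygen); all names here are illustrative
# and do not come from the source.
SUGAR = (26.8, 7.7, 61.4)
PRODUCTS = {
    "alcohol": (16.7, 9.6, 31.4),
    "carbon dioxide": (9.9, 0.0, 25.4),  # the table gives no hydrogen entry
    "acetic acid": (0.6, 0.2, 1.7),
}


def column_totals(rows):
    """Sum each element column, rounded to the table's one decimal place."""
    return tuple(round(sum(column), 1) for column in zip(*rows))


def balance(sugar, products):
    """Return the product totals and (products - sugar) for each element."""
    totals = column_totals(products.values())
    differences = tuple(round(t - s, 1) for t, s in zip(totals, sugar))
    return totals, differences


if __name__ == "__main__":
    totals, differences = balance(SUGAR, PRODUCTS)
    elements = ("carbon", "hydrogen", "oxygen")
    for name, total, diff in zip(elements, totals, differences):
        print(f"{name:9s} products {total:5.1f}  difference from sugar {diff:+.1f}")
    print(f"sugar {sum(SUGAR):.1f} lb, products {sum(totals):.1f} lb")
```

With the figures as quoted, the column sums reproduce the last row of the table, while the separate carbon, hydrogen, and oxygen totals do not each equal the amounts stated for the sugar. The overall masses agree much more closely, which may be read alongside the remark that the analytical numbers contained errors that happened to compensate one another.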
which sugar was converted into carbon dioxide and alcohol, as was
carefully pointed out by Schlossberger [1844] in a research on the
nature of yeast, carried out in Liebig's laboratory but without
decisive results.
Mitscherlich was also convinced of the vegetable character of
yeast, and showed [1841] that when yeast was placed in a glass
tube closed by parchment and plunged into sugar solution, the sugar
entered the glass tube and was there fermented, but was not
fermented outside the tube. He regarded this as a proof that
fermentation only occurred at the surface of the yeast cells, and
explained the process by contact action in the sense of the catalytic
action of Berzelius, rather than by Liebig's transference of molecular
instability. Similar results were obtained with an animal membrane
by Helmholtz [1843], who also expressed his conviction that yeast
was a vegetable organism.
In 1854 Schröder and von Dusch [1854, 1859, 1861] strongly
reinforced the evidence in favour of this view by succeeding in
preventing the putrefaction and fermentation of many boiled organic
liquids by the simple process of filtering all air which had access to
them through cotton-wool. These experiments, which were
continued until 1861, led to the conclusion that the spontaneous
alcoholic fermentation of liquids was due to living germs carried by
the air, and that when the air was passed through the cotton-wool
these germs were held back.
At the middle of the nineteenth century opinions with regard to
alcoholic fermentation, notwithstanding all that had been done, were
still divided. On the one hand Liebig's theory of fermentation was
widely held and taught. Gerhardt, for example, as late as 1856 in the
article on fermentation in his treatise on organic chemistry [1856],
gives entire support to Liebig's views, and his treatment of the
matter affords an interesting glimpse of the arguments which were
then held to be decisive. The grounds on which he rejects the
conclusions of Schwann and the other investigators who shared the
belief in the vegetable nature of yeast are that, although in some
cases animal and vegetable matter and infusions can be preserved
from change by the methods described by these authors, in others
they cannot, a striking case being that of milk, which after being
boiled still becomes sour in filtered air, and this without
showing any trace of living organisms. On his view, the action of heat,
sulphuric acid, and filtration on the air is to remove, or destroy, not
living organisms but particles of decomposing matter, that is to say,
ferments which would add their activity to that of the oxygen of the
air. Moreover, many ferments, as for example diastase, act without
[p011] producing any insoluble deposit whatever which can be
regarded as an organism.
"Evidently," he concludes, "M. Liebig's theory alone explains all the
phenomena in the most complete and most logical manner; it is the one
to which all sound minds cannot fail to rally."
On the other hand it was held by many to have been shown that
Liebig's view of the origin of yeast by the action of the air on a
vegetable infusion was erroneous, and that fermentation only arose
when the air transferred to the liquid an active agent which could be
removed from it by sulphuric acid (Schulze), by heat (Schwann), and
by cotton-wool (Schröder and von Dusch). Accompanying alcoholic
fermentation there was a development of a living organism, the
yeast, and fermentation was believed, without any very strict proof,
to be a phenomenon due to the life and vegetation of this organism.
This doctrine seems indeed [Schrohe, 1904] to have been widely
taught in Germany from 1840–56, and to have established itself in
the practice of the fermentation industries.
In 1857 commenced the classical researches of Pasteur which
finally decided the question as to the origin and functions of yeast
and led him to the conclusion that "alcoholic fermentation is an act
correlated with the life and organisation of the yeast cells, not with
the death or putrefaction of the cells, any more than it is a
phenomenon of contact, in which case the transformation of sugar
would be accomplished in presence of the ferment without yielding
up to it or taking from it anything" [1860]. It is impossible here to
enter in detail into Pasteur's experiments on this subject, or indeed
to do more than indicate the general lines of his investigation. His
starting-point was the lactic acid fermentation.
The organism to which this change was due had hitherto escaped
detection, and as we have seen the spontaneous lactic fermentation
of milk was one of the phenomena adduced by Gerhardt (p. 10) in
favour of Liebig's views. Pasteur [1857] discovered the lactic acid
producing organism and convinced himself that it was in fact a living
organism and the active cause of the production of lactic acid. One
of the chief buttresses of Liebig's theory was thus removed, and
Pasteur next proceeded to apply the same method and reasoning to
alcoholic fermentation. Liebig's theory of the origin of yeast by the
action of the oxygen of the air on the nitrogenous matter of the
fermentable liquid was conclusively and strikingly disproved by the
brilliant device of producing a crop of yeast in a liquid medium
containing only comparatively [p012] simple substances of known
composition—sugar, ammonium tartrate and mineral phosphate.
Here there was obviously present in the original medium no matter
which could be put into a state of putrefaction by contact with
oxygen and extend its instability to the sugar. Any such material
must first be formed by the vital processes of the yeast. In the next
place Pasteur showed by careful analyses and estimations that,
whenever fermentation occurred, growth and multiplication of yeast
accompanied the phenomenon. The sugar, he proved, was not
completely decomposed into carbon dioxide and alcohol, as had
been assumed by Liebig (p. 8). A balance-sheet of materials and
products was constructed which showed that the alcohol and carbon