Web App Development
and Real-Time Web
Analytics with Python
Develop and Integrate Machine Learning
Algorithms into Web Apps
—
Tshepo Chris Nokeri
Table of Contents

Scatter Plot .................................................. 32
Density Plot .................................................. 34
Bar Chart ..................................................... 36
Pie Chart ..................................................... 38
Sunburst ...................................................... 38
Choropleth Map ................................................ 41
Heatmap ....................................................... 42
3D Charting ................................................... 43
Indicators .................................................... 44
Conclusion .................................................... 45
Meta Tag ...................................................... 75
Practical Example ............................................. 75
Viewing Web Page Source ....................................... 78
Conclusion .................................................... 78
Button ........................................................ 94
Table ......................................................... 95
Conclusion .................................................... 97
Chapter 11: Integrating a Machine Learning Algorithm into a Web App ... 189
An Introduction to Linear Regression .......................... 189
An Introduction to sklearn .................................... 190
Preprocessing ................................................. 191
Index ......................................................... 223
About the Author
Tshepo Chris Nokeri harnesses advanced analytics and
artificial intelligence to foster innovation and optimize
business performance. He delivers complex solutions to
companies in the mining, petroleum, and manufacturing
industries. He received a bachelor’s degree in information
management. He graduated with honours in business
science from the University of the Witwatersrand,
Johannesburg, on a Tata Prestigious Scholarship and a
Wits Postgraduate Merit Award. He was unanimously awarded the Oxford University
Press Prize. Tshepo has authored three books: Data Science Revealed (Apress, 2021),
Implementing Machine Learning in Finance (Apress, 2021), and Econometrics and Data
Science (Apress, 2022).
About the Technical Reviewer
Brij Kishore Pandey works as a software engineer, architect,
and strategist at ADP. He has a wide interest in software
development using cutting-edge tools/technologies in
cloud computing, data engineering, data science, artificial
intelligence, and machine learning. He has 12 years of
experience working with global corporate leaders, including
JP Morgan Chase, American Express, 3M Company, Alaska
Airlines, Cigna Healthcare, and ADP.
Acknowledgments
Writing a single-authored book is demanding, but I received firm support and active
encouragement from my family and dear friends. Many heartfelt thanks to the Apress
team for their backing throughout the writing and editing process. And my humble
thanks to all of you for reading this; I earnestly hope you find it helpful.
CHAPTER 1

Tabulating Data and Constructing Static 2D and 3D Charts

import pandas as pd
df = pd.read_csv(r"filepath\.csv")

© Tshepo Chris Nokeri 2022
T. C. Nokeri, Web App Development and Real-Time Web Analytics with Python,
https://doi.org/10.1007/978-1-4842-7783-6_1

df = pd.read_excel(r"filepath\.xlsx")
Notice that the only difference between Listings 1-1 and 1-2 is the file
extension (.csv for Listing 1-1 and .xlsx for Listing 1-2).
When working with sequential data, you may want to use a datetime
column as the index. To do so, specify the column to parse with
parse_dates and set the index with index_col, passing the column
name or number (see Listing 1-3).
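As a minimal, self-contained sketch of that pattern (the date column name and sample values here are illustrative, not the book's data), an in-memory CSV can stand in for a file on disk:

```python
import io
import pandas as pd

# An in-memory CSV stands in for a file path.
csv_data = io.StringIO(
    "date,value\n"
    "2022-01-01,10\n"
    "2022-01-02,12\n"
)

# parse_dates converts the column to datetimes;
# index_col makes that column the index.
df = pd.read_csv(csv_data, parse_dates=["date"], index_col="date")
print(df.index.dtype)  # datetime64[ns]
```

With a datetime index in place, time-based selection such as df.loc["2022-01-01"] works directly.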
import pandas as pd
import sqlalchemy
from sqlalchemy import create_engine
from sqlalchemy import Table, Column, String, MetaData

engine = sqlalchemy.create_engine(
    sqlalchemy.engine.url.URL(
        drivername="postgresql",
        username="user",          # placeholder credentials; substitute your own
        password="password",
        host="localhost",
        port=5432,
        database="database"))    # on SQLAlchemy 1.4+, use URL.create(...)
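Once an engine exists, it plugs directly into pandas. A self-contained sketch of the round trip, using an in-memory SQLite engine as a stand-in for a PostgreSQL connection (the observations table and its columns are illustrative, not from the book):

```python
import pandas as pd
import sqlalchemy

# In-memory SQLite stands in for the PostgreSQL engine;
# only the connection URL would differ.
engine = sqlalchemy.create_engine("sqlite://")

# Write a small illustrative table, then read it back with pandas.
pd.DataFrame({"id": [1, 2], "price": [10.5, 12.0]}).to_sql(
    "observations", engine, index=False)
df = pd.read_sql("SELECT * FROM observations", engine)
print(df.shape)  # (2, 2)
```

The same read_sql call works unchanged against a PostgreSQL engine; only the connection URL differs.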
English. The speed was limited not by the memory or the computer
itself but by the input, which had to be prepared on tape by a typist.
Subsequently a scanning system capable of 2,400 words a minute
upped the speed considerably.
Impressive as the translator was, its impact was dulled after a
short time when it was found that a second “translation” was required
of the resulting pidgin English, particularly when the content was
highly technical. As a result, work is being done on more
sophisticated translation techniques. Making use of predictive
analysis, and “lexical buffers” which store all the words in a sentence
for syntactical analysis before final printout, scientists have improved
the translation a great deal. In effect, the computer studies the
structure of the sentence, determining whether modifiers belong with
subject or object, and checking for the most probable grammatical
form of each word as indicated by other words in the sentence.
The advanced nature of this method of translation requires the
help of linguistics experts. Among these is Dr. Sydney Lamb of the
University of California at Berkeley who is developing a computer
program for analysis of the structure of any language. One early
result of this study was the realization that not enough is actually
known of language structure and that we must backtrack and build a
foundation before proceeding with computer translation techniques.
Dr. Lamb’s procedure is to feed English text into the computer and
let it search for situations in which a certain word tends to be
preceded or followed by other words or groups of words. The
machine then tries to produce the grammatical structure, not
necessarily correctly. The researcher must help the machine by
giving it millions of words to analyze contextually.
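The procedure described here, tallying which words tend to precede or follow one another, is in modern terms collocation or n-gram counting. A minimal sketch of the idea (the sample sentence is illustrative):

```python
from collections import Counter

def bigram_counts(text: str) -> Counter:
    """Count adjacent word pairs: the raw material for the kind of
    contextual structure analysis described above."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))

counts = bigram_counts("the cat sat on the mat and the cat slept")
print(counts[("the", "cat")])  # 2
```

Run over millions of words, such counts reveal which pairings are habitual and which are accidental, which is essentially what the researcher is helping the machine discover.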
What the computer is doing in hours is reproducing the evolution
of language and grammar that not only took place over thousands of
years, but is subject to emotion, faulty logic, and other inaccuracies
as well. Also working on the translation problem are the National
Bureau of Standards, the Army’s Office of Research and
Development, and others. The Army expects to have a computer
analysis in 1962 that will handle 95 per cent of the sentences likely
to be encountered in translating Russian into English, and to
examine foreign technical literature at least as far as the abstract
stage.
Difficult as the task seems, workers in the field are optimistic and
feel that it will be feasible to translate all languages, even the
Oriental, which seem to present the greatest syntactical barriers. An
indication of success is the announcement by Machine Translations
Inc. of a new technique making possible contextual translation at the
rate of 60,000 words an hour, a rate challenging the ability of even
someone coached in speed-reading! The remaining problem, that of
doing the actual reading and evaluation after translation, has been
brought up. This considerable task too may be solved by the
computer. The machines have already displayed a limited ability to
perform the task of abstracting, thus eliminating at the outset much
material not relevant to the task at hand. Another bonus the
computer may give us is the ideal international and technical
language for composing reports and papers in the first place. A
logical question that comes up in the discussion of printed language
translation is that of another kind of translation, from verbal input to
print, or vice versa. And finally from verbal Russian to verbal English.
The speed limitation here, of course, is human ability to accept a
verbal input or to deliver an output. Within this framework, however,
the computer is ready to demonstrate its great capability.
A recent article in Scientific American asks in its first sentence if a
computer can think. The answer to this old chestnut, the authors say,
is certainly yes. They then proceed to show that having passed this
test the computer must now learn to perceive, if it is to be considered
a truly intelligent machine. A computer that can read for itself, rather
than requiring human help, would seem to be perceptive and thus
qualify as intelligent.
Even early computers such as adding machines printed out their
answers. All the designers have to do is reverse this process so that
printed human language is also the machine’s input. One of the first
successful implementations of a printed input was the use of
magnetic ink characters in the Magnetic Ink Character Recognition
(MICR) system developed by General Electric. This technique called
for the printing of information on checks with special magnetic inks.
Processed through high-speed “readers,” the ink characters cause
electrical currents the computer can interpret and translate into
binary digits.
Close on the heels of the magnetic ink readers came those that
use the principle of optical scanning, analogous to the method man
uses in reading. This breakthrough came in 1961, and was effected
by several different firms, such as Farrington Electronics, National
Cash Register, Philco, and others, including firms in Canada and
England. We read a page of printed or written material with such
ease that we do not realize the complex way our brains perform this
miracle, and the optical scanner that “reads” for the computer
requires a fantastically advanced technology.
As the material to be read comes into the field of the scanner, it is
illuminated so that its image is distinct enough for the optical system
to pick up and project onto a disc spinning at 10,000 revolutions per
minute. In the disc are tiny slits which pass a certain amount of the
reflected light onto a fixed plate containing more slits. Light which
succeeds in getting through this second series of slits activates a
photoelectric cell which converts the light into proportionate electrical
impulses. Because the scanned material is moving linearly and the
rotating disc is moving transversely to this motion, the character is
scanned in two directions for recognition. Operating with great
precision and speed, the scanner reads at the rate of 240 characters
a second.
National Cash Register claims a potential reading rate for its
scanner of 11,000 characters per second, a value not reached in
practice only because of the difficulty of mechanically handling
documents at this speed. Used in post-office mail sorting, billing, and
other similar reading operations, optical scanners generally show a
perfect score for accuracy. Badly printed characters are rejected, to
be deciphered by a human supervisor.
It is the optical scanner that increased the speed of the Russian-
English translating computer from 40 to 2,400 words per minute. In
post-office work, the Farrington scanner sorts mail at better than
9,000 pieces an hour, rejecting all handwritten addresses. Since
most mail—85 per cent, the Post Office Department estimates—is
typed or printed, the electronic sorter relieves human sorters of most
of their task. Mail is automatically routed to proper bins or chutes as
fast as it is read.
The electronic readers have not been without their problems. A
drug firm in England had so much difficulty with one that it returned it
to the manufacturer. We have mentioned the one that was confused
by Christmas seals it took for foreign postage stamps. And as yet it
is difficult for most machines to read anything but printed material.
An attempt to develop a machine with a more general reading
ability, one which recognizes not only material in which exact criteria
are met, but even rough approximations, uses the gestalt or all-at-
once pattern principle. Using a dilating circular scanning method, the
“line drawing pattern recognizer” may make it possible to read
characters of varying sizes, handwritten material, and material not
necessarily oriented in a certain direction. A developmental model
recognizes geometric figures regardless of size or rotation and can
count the number of objects in its scope. Such experimental work
incidentally yields much information on just how the eye and brain
perform the deceptively simple tasks of recognition. The year 1970 was
once thought a target date for machine recognition of handwritten
material, but researchers at Bell Telephone Laboratories have
already announced such a device, which reads cursive human writing
with an accuracy of 90 per cent.
The computer, a backward child, learned to write long before it
could read and does so at rates incomprehensible to those of us who
type at the blinding speed of 50 to 60 words a minute. A character-
generator called VIDIAC comes close to keeping up with the brain of
a high-speed digital computer and has a potential speed of 250,000
characters, or about 50,000 words, per second. It does this,
incidentally, by means of good old binary, 1-0 technique. To add to its
virtuosity, it has a repertoire of some 300 characters. Researchers
elsewhere are working on the problems to be met in a machine for
reading and printing out 1,000,000 characters per second!
None of us can talk or listen at much over 250 words a minute,
even though we may convince ourselves we read several thousand
words in that period of time. A simple test of ability to hear is to play
a record or tape at double speed or faster. Our brains just won’t take
it. For high-speed applications, then, verbalized input or output for
computers is interesting in theory only. However, there are occasions
when it would be nice to talk to the computer and have it talk back.
In the early, difficult days of computer development, say when
Babbage was working on his analytical engine, the designer
probably often spoke to his machine. He would have been stunned
to hear a response, of course, but today such a thing is becoming
commonplace. IBM has a computer called “Shoebox,” a term both
descriptive of its size and refreshing in that it is not formed of initial
capitals from an ad writer’s blurb. You can speak figures to Shoebox,
tell it what you want done with them, and it gets busy. This is
admittedly a baby computer, and it has a vocabulary of just 16
words. But it takes only 31 transistors to achieve that vocabulary,
and jumping the number of transistors to a mere 2,000 would
increase its word count to 1,000, which is the number required for
Basic English.
The Russians are working in the field of speech recognition too, as
are the Japanese. The latter are developing an ambitious machine
which will not only accept voice instructions, but also answer in kind.
To make a true speech synthesizer, the Japanese think they will need
a computer about 5,000 times as fast as any present-day type, so for
a while it would seem that we will struggle along with “canned” words
appropriately selected from tape memory.
We have mentioned the use of such a tape voice in the
computerized ground-controlled-approach landing system for
aircraft, and the airline reservation system called Unicall in which a
central computer answers a dialed request for space in less than
three seconds—not with flashing lights or a printed message but in a
loud clear voice. It must pain the computer to answer at the snail-like
human speed of 150 words a minute, so it salves its conscience by
handling 2,100 inputs without getting flustered.
The writer’s dream, a typewriter that has a microphone instead of
keys and clacks away merrily while you talk into it, is a dream no
longer. Scientists at Japan’s Kyoto University have developed a
computer that does just this. An early experimental model could
handle a hundred Japanese monosyllables, but once the
breakthrough was made, the Japanese quickly pushed the design to
the point where the “Sonotype” can handle any language. At the
same time, Bell Telephone Laboratories works on the problem from
the other end and has come up with a system for a typewriter that
talks. Not far behind these exotic uses of digital computer techniques
are such things as automatic translation of telephone or other
conversations.
Information Retrieval
It has been estimated that some 445 trillion words are spoken in
each 16-hour day by the world’s inhabitants, making ours a noisy
planet indeed. To bear out the “noisy” connotation, someone else
has reckoned that only about 1 per cent of the sounds we make are
real information. The rest are extraneous, incidentally telling us the
sex of the speaker, whether or not he has a cold, the state of his
upper plate, and so on. It is perhaps a blessing that most of these
trillions of words vanish almost as soon as they are spoken. The
printed word, however, isn’t so transient; it not only hangs around,
but also piles up as well. The pile is ever deeper, technical writings
alone being enough to fill seven 24-volume encyclopedias each day,
according to one source. As with our speech, perhaps only 1 per
cent of this outpouring of print is of real importance, but this does not
necessarily make what some have called the Information Explosion
any less difficult to cope with.
The letters IR once stood for infra-red; but in the last year or so
they have been appropriated by the words “information retrieval,”
one of the biggest bugaboos on the scientific horizon. It amounts to
saving ourselves from drowning in the fallout from typewriters all
over the earth. There are those cool heads who decry the pushing of
the panic button, professing to see no exponential increase in
literature, but a steady 8 per cent or so each year. The button-
pushers see it differently, and they can document a pretty strong
case. The technical community is suffering an embarrassment of
riches in the publications field.
While a doubling in the output of technical literature has taken the
last twelve years or so, the next such increase is expected in half
that time. Perhaps the strongest indication that IR is a big problem is
the obvious fact that nobody really knows just how much has been,
is being, or will be written. For instance, one authority claims
technical material is being amassed at the rate of 2,000 pages a
minute, which would result in far more than the seven sets of
encyclopedias mentioned earlier. No one seems to know for sure
how many technical journals there are in the world; it can be
“pinpointed” somewhere between 50,000 and 100,000. Selecting
one set of figures at random, we learn that in 1960 alone 1,300,000
different technical articles were published in 60,000 journals. Of
course there were also 60,000 books on technical subjects, plus
many thousands of technical reports that did not make the formal
journals, but still might contain the vital bit of information without
which a breakthrough will be put off, or a war lost. Our research
expenses in the United States ran about $13 billion in 1960, and the
guess is they will more than double by 1970. An important part of
research should be done in the library, of course, lest our scientist
spend his life re-inventing the wheel, as the saying goes.
To back up this saying are specific examples. For instance, a
scientific project costing $250,000 was completed a few days before
an engineer came across practically the identical work in a report in
the library. This was a Russian report incidentally, titled “The
Application of Boolean Matrix Algebra to the Analysis and Synthesis
of Relay Contact Networks.” In another, happier case, information
retrieval saved Esso Research & Engineering Co. a month of work
and many thousands of dollars when an alert—or lucky—literature
searcher came across a Swedish scientist’s monograph detailing
Esso’s proposed exploration. Another literature search obviated tests
of more than a hundred chemical compounds. Unfortunately not all
researchers do or can search the literature in all cases. There is
even a tongue-in-cheek law which governs this phenomenon:
Mooers’ Law states, “An information system will tend not to be
used whenever it is more painful for a customer to have information
than for him not to have it.”
As a result, it has been said that if a research project costs less
than $100,000 it is cheaper to go ahead with it than to conduct a
rigorous search of the literature. Tongue in cheek or not, this state of
affairs points up the need for a usable information retrieval system.
Fortune magazine reports that 10 per cent of research and
development expense could be saved by such a system, and 10 per
cent in 1960, remember, would have amounted to $1.3 billion. Thus
the prediction that IR will be a $100 million business in 1965 does
not seem out of line.
The Center for Documentation at Western Reserve University
spends about $6.50 simply in acquiring and storing a single article
in its files. In 1958 it could search only thirty abstracts of these
articles in an hour and realized that more speed was vital if the
Center was to be of value. As a result, a GE 225 computer IR
system was substituted. Now researchers go through the entire store
of literature—about 50,000 documents in 1960—in thirty-five
minutes, answering up to fifty questions for “customers.”
International Business Machines Corp.
The document file of this WALNUT information retrieval system contains the
equivalent of 3,000 books. A punched-card inquiry system locates the desired
filmstrip for viewing or photographic reproduction.
This image converter of the WALNUT system optically reduces and transfers
microfilm to filmstrips for storage. Each strip contains 99 document images. As a
document image is transferred from microfilm to filmstrip, the image converter
simultaneously assigns image file addresses and punches these addresses into
punched cards controlling the conversion process.
“How come they spend over a million on our new school, Miss Finch, and then
forget to put in computer machines?”
“’Tis one and the same Nature that rolls on her course, and whoever has
sufficiently considered the present state of things might certainly conclude as to
both the future and the past.”
—Montaigne
11: The Road Ahead