
Thomas Mailund

Functional Data Structures in R


Advanced Statistical Programming in R
Thomas Mailund
Aarhus N, Denmark

Any source code or other supplementary material referenced by the
author in this book is available to readers on GitHub via the book's
product page, located at www.apress.com/9781484231432. For more
detailed information, please visit www.apress.com/source-code.

ISBN 978-1-4842-3143-2 e-ISBN 978-1-4842-3144-9


https://doi.org/10.1007/978-1-4842-3144-9

Library of Congress Control Number: 2017960831

© Thomas Mailund 2017

This work is subject to copyright. All rights are reserved by the
Publisher, whether the whole or part of the material is concerned,
specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other
physical way, and transmission or information storage and retrieval,
electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.

Trademarked names, logos, and images may appear in this book.
Rather than use a trademark symbol with every occurrence of a
trademarked name, logo, or image we use the names, logos, and
images only in an editorial fashion and to the benefit of the
trademark owner, with no intention of infringement of the
trademark. The use in this publication of trade names, trademarks,
service marks, and similar terms, even if they are not identified as
such, is not to be taken as an expression of opinion as to whether or
not they are subject to proprietary rights.

While the advice and information in this book are believed to be true
and accurate at the date of publication, neither the authors nor the
editors nor the publisher can accept any legal responsibility for any
errors or omissions that may be made. The publisher makes no
warranty, express or implied, with respect to the material contained
herein.

Printed on acid-free paper

Distributed to the book trade worldwide by Springer
Science+Business Media New York, 233 Spring Street, 6th Floor,
New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505,
e-mail orders-ny@springer-sbm.com, or visit
www.springeronline.com. Apress Media, LLC is a California LLC and
the sole member (owner) is Springer Science + Business Media
Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware
corporation.
Table of Contents

Chapter 1: Introduction

Chapter 2: Abstract Data Structures
  Structure on Data
  Abstract Data Structures in R
  Implementing Concrete Data Structures in R
  Asymptotic Running Time
  Experimental Evaluation of Algorithms

Chapter 3: Immutable and Persistent Data
  Persistent Data Structures
  List Functions
  Trees
  Random Access Lists

Chapter 4: Bags, Stacks, and Queues
  Bags
  Stacks
  Queues
  Side Effects Through Environments
  Side Effects Through Closures
  A Purely Functional Queue
  Time Comparisons
  Amortized Time Complexity and Persistent Data Structures
  Double-Ended Queues
  Lazy Queues
  Implementing Lazy Evaluation
  Lazy Lists
  Amortized Constant Time, Logarithmic Worst-Case, Lazy Queues
  Constant Time Lazy Queues
  Explicit Rebuilding Queue

Chapter 5: Heaps
  Leftist Heaps
  Binomial Heaps
  Splay Heaps
  Plotting Heaps
  Heaps and Sorting

Chapter 6: Sets and Search Trees
  Search Trees
  Red-Black Search Trees
  Insertion
  Deletion
  Visualizing Red-Black Trees
  Splay Trees

Conclusions

Acknowledgements

Bibliography

Index

About the Author and About the Technical Reviewer

About the Author
Thomas Mailund is an associate professor in bioinformatics at Aarhus
University, Denmark. He has a background in math and computer science.
For the last decade, his main focus has been on genetics and
evolutionary studies, particularly comparative genomics, speciation,
and gene flow between emerging species. He has published
Beginning Data Science in R, Functional Programming in R, and
Metaprogramming in R with Apress, as well as other books.

About the Technical Reviewer
Karthik Ramasubramanian works for one of the largest and
fastest-growing technology unicorns in India, Hike Messenger, where he
brings the best of business analytics and data science experience to
his role. In his seven years of research and industry experience, he
has worked on cross-industry data science problems in retail,
e-commerce, and technology, developing and prototyping data-driven
solutions. In his previous role at Snapdeal, one of the largest
e-commerce retailers in India, he was leading core statistical
modeling initiatives for customer growth and pricing analytics. Prior
to Snapdeal, he was part of the central database team, managing
the data warehouses for global business applications of Reckitt
Benckiser (RB). He has vast experience working with scalable
machine learning solutions for industry, including sophisticated graph
network and self-learning neural networks. He has a master’s degree
in theoretical computer science from PSG College of Technology,
Anna University, and is a certified big data professional. He is
passionate about teaching and mentoring future data scientists
through different online and public forums. He enjoys writing poems
in his leisure time and is an avid traveler.

© Thomas Mailund 2017
Thomas Mailund, Functional Data Structures in R,
https://doi.org/10.1007/978-1-4842-3144-9_1

1. Introduction
Thomas Mailund (1)
(1) Aarhus N, Denmark

This book gives an introduction to functional data structures. Many
traditional data structures rely on the structures being mutable. We
can update search trees, change links in linked lists, and rearrange
values in a vector. In functional languages, and as a general rule in
the R programming language, data is not mutable. You cannot alter
existing data. The techniques that modify data structures in place,
and so give us efficient building blocks for algorithmic programming,
cannot be used.
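
As a small illustration of what "you cannot alter existing data" means
in practice, the sketch below passes a vector to a function that
assigns to one of its elements. The function name set_first is
hypothetical and exists only for this example. The assignment changes
the function's own copy, so the caller's vector keeps its value and the
change is visible only in the returned result.

```r
# Hypothetical helper: tries to "modify" its argument.
set_first <- function(v, value) {
  v[1] <- value  # changes the function's local copy, not the caller's vector
  v              # the changed copy is returned as a new value
}

x <- c(1, 2, 3)
y <- set_first(x, 42)

print(x)  # the original vector
print(y)  # the new vector with the first element replaced
```

To keep a change, the caller has to bind the returned value to a name,
which is the pattern functional data structures are built around.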
There are workarounds for this. R is not a pure functional
language, and we can change variable-value bindings by modifying
environments. We can exploit this to emulate pointers and
implement traditional data structures this way; or we can abandon
pure R programming and implement data structures in C/C++ with
some wrapper code so we can use them in our R programs. Both
solutions allow us to use traditional data structures, but the former
gives us very untraditional R code, and the latter is of no use to
those unfamiliar with languages other than R.
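
The environment workaround can be sketched as follows. This is a
minimal example under stated assumptions, not the implementation used
later in the book: the names new_mutable_stack, push, and pop are
hypothetical, and pop does not guard against an empty stack. Because
environments are not copied on assignment, two names can refer to the
same stack, much like two pointers to one object.

```r
# Hypothetical mutable stack: an environment holds the current contents,
# and the functions below change bindings inside it.
new_mutable_stack <- function() {
  stack <- new.env()
  stack$items <- list()
  stack
}

push <- function(stack, x) {
  stack$items <- c(list(x), stack$items)  # side effect on the environment
  invisible(stack)
}

pop <- function(stack) {
  top <- stack$items[[1]]                 # assumes the stack is not empty
  stack$items <- stack$items[-1]          # side effect on the environment
  top
}

s <- new_mutable_stack()
push(s, 1)
push(s, 2)
alias <- s           # both names now refer to the same environment
top <- pop(alias)    # popping through the alias also changes s
```

Note that push and pop are called for their side effects, which is the
sense in which such code is untraditional R.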
The good news, however, is that we don’t have to reject R when
implementing data structures if we are willing to abandon the
traditional data structures instead. There are data structures we can
manipulate by building new versions of them rather than modifying
them. These data structures, so-called functional data structures ,
are different from the traditional data structures you might know,
but they are worth knowing if you plan to do serious algorithmic
programming in a functional language such as R.
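
Building a new version rather than modifying an old one can be
illustrated with a linked list. The sketch below is an assumption-laden
example, not the book's own code: the names list_cons, list_head,
list_tail, and list_length are hypothetical, NULL stands for the empty
list, and each cell is an ordinary R list. Prepending an element creates
one new cell whose tail is the existing list, so the old version
remains available and unchanged.

```r
# Hypothetical minimal linked list: NULL is the empty list, and a cell is
# an R list with a head element and a tail (the rest of the list).
list_cons <- function(head, tail) list(head = head, tail = tail)
list_head <- function(lst) lst$head
list_tail <- function(lst) lst$tail

# Count the cells by walking the tail links until the empty list.
list_length <- function(lst) {
  n <- 0
  while (!is.null(lst)) {
    n <- n + 1
    lst <- lst$tail
  }
  n
}

old <- list_cons(2, list_cons(3, NULL))
extended <- list_cons(1, old)  # a new version; old is not modified

list_length(old)
list_length(extended)
```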
There are not necessarily drop-in replacements for all the data
structures you are used to, at least not with the same runtime
performance for their operations—but there are likely to be
implementations for most abstract data structures you regularly use.
In cases where you might have to lose a bit of efficiency by using a
functional data structure instead of a traditional one, you have to
consider whether the extra speed is worth the extra time you have
to spend implementing a data structure in exotic R or in an entirely
different language.
There is always a trade-off when it comes to speed. How much
programming time is a speed-up worth? If you are programming in
R, the chances are that you value programmer time over computer
time. R is a high-level language that is relatively slow compared to
most other languages. There is a price to providing higher levels of
expressiveness. You accept this when you choose to work with R.
You might have to make the same choice when it comes to selecting
a functional data structure over a traditional one, or you might
conclude that you really do need the extra speed and choose to
spend more time programming to save time when doing an analysis.
Only you can make the right choice based on your situation. To make
it, you need to know which options are available for working with
data structures when you cannot modify them.
© Thomas Mailund 2017
Thomas Mailund, Functional Data Structures in R,
https://doi.org/10.1007/978-1-4842-3144-9_2

2. Abstract Data Structures
Thomas Mailund (1)
(1) Aarhus N, Denmark

Before we get started with the actual data structures, we need to
get some terminology and notation in place. We need to agree on what
an abstract data structure is, in contrast to a concrete one, and we
need to agree on how to reason about runtime complexity in an
abstract way.
If you are at all familiar with algorithms and data structures, you
can skim quickly through this chapter. There won’t be any theory
you are not already familiar with. Do at least skim through it,
though, just to make sure we agree on the notation I will use in the
remainder of the book.
If you are not familiar with the material in this chapter, I urge
you to find a textbook on algorithms and read it. The material I
cover in this chapter should suffice for the theory we will need in this
book, but there is a lot more to data structures and complexity than
I can possibly cover in a single chapter. Most good textbooks on
algorithms will teach you a lot more, so if this book is of interest,
you should have no difficulty continuing your studies.

Structure on Data
As the name implies, data structures have something to do with
structured data. By data, we can simply mean elements from some
arbitrary set. There might be more structure to the data than the
individual data points, and when there is, we keep that in mind and
will probably want to exploit it somehow. However, in the most
general terms, we just have some large set of data points.
So, as a simple example of working with data, imagine we have a set
of possible values (say, all possible names of students at a
university) and I am interested in a subset (for example, the
students who are taking one of my classes). A class would be a subset
of students, and I could represent it as the subset of student names.
When I get an email from a student, I might be interested in figuring
out whether it is from one of my students and, if so, in which class.
So, already we have some structure on the data. Different classes are
different subsets of student names. We also have an operation we
would like to be able to perform on these classes: checking
membership.
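
The student example can be sketched directly in R. The rosters and the
function name classes_of below are made up for illustration; each class
is represented as a character vector of names, and membership is
checked with the built-in %in% operator.

```r
# Hypothetical class rosters: each class is a subset of student names.
classes <- list(
  algorithms = c("Alice", "Bob", "Carol"),
  statistics = c("Bob", "Dave")
)

# Return the names of the classes that contain the given student.
classes_of <- function(student, classes) {
  is_member <- vapply(classes, function(roster) student %in% roster,
                      logical(1))
  names(classes)[is_member]
}

classes_of("Bob", classes)  # classes that list Bob
classes_of("Eve", classes)  # an empty result: Eve is in no class
```

A character vector is only one possible representation of a class. The
operation we care about, checking membership, stays the same whichever
representation we choose.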
There might be some inherent structure to the data we work with,
such as a lexicographical order on names, which enables us to sort
student names, for example. Other structure we add on top of this: we
add structure by defining classes as subsets of student names. There
is even a third level of structure: how we represent the classes on
our computer.
The first level of structure—inherent in the data we work with—is
not something we have much control over. We might be able to
exploit it in various ways, but otherwise, it is just there. When it
comes to designing algorithms and data structures, this structure is
often simple information; if there is order in our data, we can sort it,
for example. Different algorithms and different data structures make
various assumptions about the underlying data, but most general
algorithms and data structures make few assumptions. When I make
assumptions in this book, I will make those assumptions explicit.
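
As an example of exploiting inherent structure, the sketch below uses
the order on names to test membership by binary search instead of
scanning every name. The function name contains_sorted and the roster
are hypothetical. The explicit assumption is that the input was sorted
with sort(), which for character vectors orders strings consistently
with the < comparison used in the search.

```r
# Membership test on a sorted character vector by binary search.
# Assumes sorted_names was produced by sort(), so that it is ordered by
# the same comparison that `<` uses for strings.
contains_sorted <- function(sorted_names, name) {
  lo <- 1
  hi <- length(sorted_names)
  while (lo <= hi) {
    mid <- (lo + hi) %/% 2
    if (sorted_names[mid] == name) return(TRUE)
    if (sorted_names[mid] < name) lo <- mid + 1 else hi <- mid - 1
  }
  FALSE
}

roster <- sort(c("Dave", "Alice", "Carol", "Bob"))
contains_sorted(roster, "Carol")
contains_sorted(roster, "Eve")
```

Without an order on the data, this approach is not available, which is
why such assumptions need to be stated when an algorithm relies on them.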
The second level of structure, the structure we add on top of the
universe of possible data points, is information in addition to what
just exists out there in the wild; this can be something as simple as
defining classes as subsets of student names. It is