Full Download Database and Expert Systems Applications 29th International Conference DEXA 2018 Regensburg Germany September 3 6 2018 Proceedings Part II Sven Hartmann PDF
Sven Hartmann · Hui Ma
Abdelkader Hameurlain
Günther Pernul
Roland R. Wagner (Eds.)
LNCS 11030
Lecture Notes in Computer Science 11030
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison
Lancaster University, Lancaster, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Friedemann Mattern
ETH Zurich, Zurich, Switzerland
John C. Mitchell
Stanford University, Stanford, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
C. Pandu Rangan
Indian Institute of Technology Madras, Chennai, India
Bernhard Steffen
TU Dortmund University, Dortmund, Germany
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max Planck Institute for Informatics, Saarbrücken, Germany
More information about this series at http://www.springer.com/series/7409
Editors
Sven Hartmann, Clausthal University of Technology, Clausthal-Zellerfeld, Germany
Hui Ma, Victoria University of Wellington, Wellington, New Zealand
Abdelkader Hameurlain, Paul Sabatier University, Toulouse, France
Günther Pernul, University of Regensburg, Regensburg, Germany
Roland R. Wagner, Johannes Kepler University, Linz, Austria
LNCS Sublibrary: SL3 – Information Systems and Applications, incl. Internet/Web, and HCI
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
This volume contains the papers presented at the 29th International Conference on
Database and Expert Systems Applications (DEXA 2018), which was held in
Regensburg, Germany, during September 3–6, 2018. On behalf of the Program
Committee, we commend these papers to you and hope you find them useful.
Database, information, and knowledge systems have always been a core subject of
computer science. The ever-increasing need to distribute, exchange, and integrate data,
information, and knowledge has added further importance to this subject. Advances in
the field will help facilitate new avenues of communication, proliferate interdisciplinary
discovery, and drive innovation and commercial opportunity.
DEXA is an international conference series that showcases state-of-the-art research
activities in database, information, and knowledge systems. The conference and its
associated workshops provide a premier annual forum to present original research
results and to examine advanced applications in the field. The goal is to bring together
developers, scientists, and users to extensively discuss requirements, challenges, and
solutions in database, information, and knowledge systems.
DEXA 2018 solicited original contributions dealing with any aspect of database,
information, and knowledge systems. Suggested topics included, but were not limited
to:
– Acquisition, Modeling, Management, and Processing of Knowledge
– Authenticity, Privacy, Security, and Trust
– Availability, Reliability, and Fault Tolerance
– Big Data Management and Analytics
– Consistency, Integrity, Quality of Data
– Constraint Modeling and Processing
– Cloud Computing and Database-as-a-Service
– Database Federation and Integration, Interoperability, Multi-Databases
– Data and Information Networks
– Data and Information Semantics
– Data Integration, Metadata Management, and Interoperability
– Data Structures and Data Management Algorithms
– Database and Information System Architecture and Performance
– Data Streams and Sensor Data
– Data Warehousing
– Decision Support Systems and Their Applications
– Dependability, Reliability, and Fault Tolerance
– Digital Libraries and Multimedia Databases
– Distributed, Parallel, P2P, Grid, and Cloud Databases
– Graph Databases
– Incomplete and Uncertain Data
– Information Retrieval
Roland R. Wagner, to our publication chair, Vladimir Marik, and to our workshop
chairs, A Min Tjoa and Roland R. Wagner.
We wish to express our deep appreciation to Gabriela Wagner of the DEXA conference
organization office. Without her outstanding work and excellent support, this
volume would not have seen the light of day.
Finally, we would like to thank Günther Pernul and his team for being our hosts during the
wonderful days in Regensburg.
General Chairs
Abdelkader Hameurlain IRIT, Paul Sabatier University, Toulouse, France
Günther Pernul University of Regensburg, Germany
Roland R. Wagner Johannes Kepler University Linz, Austria
Publication Chair
Vladimir Marik Czech Technical University, Czech Republic
Program Committee
Slim Abdennadher German University, Cairo, Egypt
Hamideh Afsarmanesh University of Amsterdam, The Netherlands
Riccardo Albertoni Institute of Applied Mathematics and Information Technologies - Italian National Council of Research, Italy
Idir Amine Amarouche University Houari Boumediene, Algeria
Rachid Anane Coventry University, UK
Annalisa Appice Università degli Studi di Bari, Italy
Mustafa Atay Winston-Salem State University, USA
Faten Atigui CNAM, France
Spiridon Bakiras Hamad bin Khalifa University, Qatar
Zhifeng Bao National University of Singapore, Singapore
Ladjel Bellatreche ENSMA, France
Nadia Bennani INSA Lyon, France
Karim Benouaret Université Claude Bernard Lyon 1, France
Benslimane Djamal Lyon 1 University, France
Morad Benyoucef University of Ottawa, Canada
Catherine Berrut Grenoble University, France
Athman Bouguettaya University of Sydney, Australia
Omar Boussaid University of Lyon/Lyon 2, France
Stephane Bressan National University of Singapore, Singapore
Barbara Catania DISI, University of Genoa, Italy
Michelangelo Ceci University of Bari, Italy
Richard Chbeir UPPA University, France
Additional Reviewers
Valentyna Tsap Tallinn University of Technology, Estonia
Liliana Ibanescu AgroParisTech, France
Cyril Labbé Université Grenoble-Alpes, France
Zouhaier Brahmia University of Sfax, Tunisia
Dunren Che Southern Illinois University, USA
Feng George Yu Youngstown State University, USA
Gang Qian University of Central Oklahoma, USA
Lubomir Stanchev Cal Poly, USA
Jorge Martinez-Gil Software Competence Center Hagenberg, Austria
Loredana Caruccio University of Salerno, Italy
Valentina Indelli Pisano University of Salerno, Italy
Jorge Bernardino Polytechnic Institute of Coimbra, Portugal
Bruno Cabral University of Coimbra, Portugal
Paulo Nunes Polytechnic Institute of Guarda, Portugal
William Ferng Boeing, USA
Amin Mesmoudi LIAS/University of Poitiers, France
Sabeur Aridhi LORIA, University of Lorraine - TELECOM Nancy, France
Julius Köpke Alpen Adria Universität Klagenfurt, Austria
Marco Franceschetti Alpen Adria Universität Klagenfurt, Austria
Meriem Laifa Bordj-Bouarreridj University, Algeria
Sheik Mohammad Mostakim Fattah University of Sydney, Australia
Mohammed Nasser Mohammed Ba-hutair University of Sydney, Australia
Ali Hamdi Fergani Ali University of Sydney, Australia
Masoud Salehpour University of Sydney, Australia
Adnan Mahmood Macquarie University, Australia
Wei Emma Zhang Macquarie University, Australia
Zawar Hussain Macquarie University, Australia
Hui Luo RMIT University, Australia
Sheng Wang RMIT University, Australia
Lucile Sautot AgroParisTech, France
Jacques Fize Cirad, Irstea, France
Information Retrieval
Uncertain Information
Data Streams
Learning
Emerging Applications
Creating Time Series-Based Metadata for Semantic IoT Web Services
Kasper Apajalahti
Data Mining
Privacy
Text Processing
Data Semantics
Efficient Top-k Cloud Services Query Processing Using Trust and QoS
Karim Benouaret, Idir Benouaret, Mahmoud Barhamgi,
and Djamal Benslimane
Answering Top-k Queries over Outsourced Sensitive Data in the Cloud
Sakina Mahboubi, Reza Akbarinia, and Patrick Valduriez
Social Networks
Community Structure Based Shortest Path Finding for Social Networks
Yale Chai, Chunyao Song, Peng Nie, Xiaojie Yuan, and Yao Ge
1 Introduction
In a recent study, it was shown that 90% of Web mail traffic is machine generated [1, 7,
8, 10]. As mentioned in [1], “A common characteristics of these machine generated
messages is that most of them are highly structured documents, with rich HTML
formatting, and they are repeated over and over, modulo minor variations, in the
global mail corpus. These characteristics clearly facilitate the application of auto-
mated data extraction and learning methods at a very large scale”. With a significant
chunk of web mail traffic composed of machine generated emails, it becomes a natural
goal to automatically extract the information from these emails that must be acted
upon by the recipient by a scheduled time, e.g., payment of a utility bill by the due date.
Therefore, we define actionable information as “information that must be acted upon by
scheduled time, by the recipient”. The requirement to extract actionable information
from machine generated emails is present for many domains, such as flight information,
shipment arrival, payment due date, etc., [5].
A big challenge faced by such systems is that such emails contain personal information.
Therefore, it becomes difficult to get access to a large corpus of labeled data to build
annotators based on supervised methods [1, 3]. Moreover, even though machine
generated emails are structured, their structure changes over time. Further, new service
providers join continuously. Hence, it becomes difficult for supervised methods to
model the tail traffic. For example, hotel reservation emails have more than 8000
different small providers representing more than 50% of the total such emails in our
database. Similarly, we have over one thousand small service providers for flight data,
representing more than 40% of all flight reservation emails. There exist supervised
methods [8, 9] that work on labeled data with varying degrees of success.
In this paper, we present a novel method to extract actionable information from
emails. We show that the region in the email containing actionable information can be
represented as a combination of one or more small template trees. These template trees
have a significantly simpler structure than the original email. Further, these
template trees are highly repetitive across a diverse set of emails from a given service
provider and contain all the information of our interest. We develop a principled and
scalable approach which exploits the semi-structured format of the emails to extract
these template trees. The amount of training data required to build supervised annotators
over email data can be reduced significantly if such small and well-structured snippets
can be identified in the emails for information extraction.
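
The claim that template trees repeat across emails from one provider can be illustrated with a toy structural comparison. This is only a sketch under stated assumptions, not the extraction method of the paper: the helper `signature` and the two snippets are hypothetical, and the snippets are assumed to be well-formed markup so that the standard XML parser can read them (real email HTML usually needs a more tolerant parser).

```python
import xml.etree.ElementTree as ET


def signature(node):
    """Return the tag structure of a subtree, ignoring text and attributes."""
    return (node.tag, tuple(signature(child) for child in node))


# Two hypothetical snippets from the same provider: same markup, different data.
EMAIL_A = (
    "<table><tr><td>Flight</td><td>AA 100</td></tr>"
    "<tr><td>Departs</td><td>10:05</td></tr></table>"
)
EMAIL_B = (
    "<table><tr><td>Flight</td><td>AA 2417</td></tr>"
    "<tr><td>Departs</td><td>18:40</td></tr></table>"
)

sig_a = signature(ET.fromstring(EMAIL_A))
sig_b = signature(ET.fromstring(EMAIL_B))

# The data differ, but the structural signature is identical.
same_template = sig_a == sig_b
print(same_template)
```

Comparing only tag structure is a deliberate simplification: it shows why a small sub-tree can act as a reusable template once the variable data are set aside.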
The only information needed by our system is a domain specific dictionary. We call it
domain knowledge. The domain knowledge has been used in earlier systems to
automatically extract the information from HTML web pages [15]. Similarly, there are
existing systems to extract wrappers from HTML pages [16]. However, unlike a
wrapper, a template tree is a sub-tree in the email HTML DOM tree. In [17], the authors
present a system to extract templates of entire web pages in an unsupervised manner.
On the other hand, we templatize only the structure of the core region of the emails,
containing the information of our interest, with the aid of domain knowledge.
However, the work in [16, 17] shows that machine generated HTML documents
(web pages or emails) have a fixed structure, encoded with the actual data. We present
a novel system, that builds on these ideas to extract actionable information from emails.
We use the term ‘domain’ for an entire service. For instance, for extracting
information from flight related emails, ‘flight’ is the domain. A specific vendor in a
domain is called a service provider, or just ‘provider’. In the ‘flight’ domain, each airline is a
provider (e.g., ‘American Airlines’). We will use emails from the flight domain as a
running example throughout the paper.
Domain specific dictionaries contain keywords and regex patterns specific to that
domain. Dictionaries apply to an entire domain and are not specific to any
provider. Therefore, they are built using public information and it is much easier to
acquire and learn domain specific dictionaries as opposed to acquiring labelled data for
every provider (cf. Sect. 3.2).
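
A domain specific dictionary of keywords and regex patterns might look like the sketch below for the flight domain. Everything in it is an assumption made for illustration: the name `FLIGHT_DICTIONARY`, the keyword list, the two patterns, and the helper `annotate` are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical flight-domain dictionary: provider-independent keywords and patterns.
FLIGHT_DICTIONARY = {
    "keywords": ["flight", "departure", "arrival", "confirmation"],
    "patterns": {
        # Two capital letters followed by up to four digits, e.g. "AA 100".
        "flight_number": re.compile(r"\b[A-Z]{2}\s?\d{1,4}\b"),
        # 24-hour clock time, e.g. "10:05".
        "time": re.compile(r"\b(?:[01]?\d|2[0-3]):[0-5]\d\b"),
    },
}


def annotate(text, dictionary):
    """Return the keywords present in the text and all regex matches by name."""
    lowered = text.lower()
    keywords = [k for k in dictionary["keywords"] if k in lowered]
    matches = {
        name: pattern.findall(text)
        for name, pattern in dictionary["patterns"].items()
    }
    return keywords, matches


keywords, matches = annotate("Flight AA 100 Departure 10:05", FLIGHT_DICTIONARY)
print(keywords, matches)
```

Because the dictionary holds nothing specific to one airline, the same entries can be applied to emails from any provider in the domain.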
The technique described in this paper is applicable across many domains for which
domain specific information can be represented in a dictionary. This is a highly generic
requirement, applicable across most domains that produce machine generated emails.
When greater strength was needed the thickness of the side walls
was increased to 30 ins. and that of the arch to six rings of brick.
The arch was built up from the springing lines on both sides at
the same time, four masons being employed. The rings were built
beginning with the intrados, which was brought up, say, a distance
of about 2 ft. from the springing line. Then the back of the ring was
well plastered with from 3⁄8 in. to 1⁄2 in. of mortar, and the second
ring brought up to the same height and plastered on the back, and
so on until the last ring was laid. After bringing the full width of the
arch up some distance, new laggings were placed on the ribs for an
additional height of 2 ft. and the same process was repeated. All the
space between the extrados of the masonry arch and the old lining
was compactly filled with dry rubble. When high enough so that the
hip segments had a foot or more bearing on the masonry the
segments were securely wedged and blocked up against the
brickwork, and the longitudinal 4 × 6 in. timbers removed. The
remaining space was now clear for completion of the arch, and both
sides were brought up until there was not sufficient space for four
masons to work, when the keying was completed by two masons
beginning at the completed end and working back toward the toothed
end. The brickwork was built from the top of a staging-car.
Permanent Work.
Fig. 163.—Relining Timber Lined Tunnel, Great Northern Ry.
Longitudinal Section.
Fig. 164.—Construction of Centering Mullan Tunnel.
The mortar car was then run along, and enough mortar (1
cement to 3 sand) was run by the chute into each section to make
an 8-in. layer of concrete. As the car passed along to each section,
broken stone was shoveled into the last preceding section until all
the mortar was taken up. The walls were thus built up in 8-in.
layers, and became hard enough to support the arches in about 10
to 14 days. The arches were then allowed to rest on the wall, and
the posts of the remaining 5-ft. sections were removed, and the
concrete wall built up in the same way as before.
The average progress per working-day was 30 ft. of side wall, or
about 45 cu. yds.; and the average cost, including all work required
in removing the timber work, train service, lights and tools,
engineering and superintendence, and interest on plant, was $8 per
cubic yard.
The centering used for putting in the brick arches is shown in Fig.
165. From 3 ft. to 9 ft. of arch was put in at a time, the length
depending upon the nature of the ground. To remove the old timber
arch, one of the segments was partly sawed through; and then a
small charge of giant powder was exploded in it, the resulting
débris, cordwood, rock, etc., being caught by a platform car
extending underneath. From this car the débris was removed to
another car, which conveyed it out of the tunnel. The center was
then placed and the brickwork begun, the cement car shown in Fig.
164 being used for mixing the mortar. The size of the bricks used
was 2 1⁄2 × 2 1⁄2 × 9 ins., four rings making a 20-in. arch and giving
1.62 cu. yds. of masonry in the arch per lin. ft. of tunnel. The bricks
were laid in rowlock bond, two gangs, of three bricklayers and six
helpers each, laying about 12 lin. ft. per day. The brickwork cost
about $17 per cu. yd. The total cost of the new lining averaged
about $50 per lin. ft.
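
The figures quoted for the brick arch can be combined into a rough per-foot check. The sketch below uses only the numbers in the text (1.62 cu. yds. per lin. ft., about $17 per cu. yd., about $50 per lin. ft. in total). Treating the difference as "other work" is an assumption made here, since the text does not itemize the remainder.

```python
ARCH_MASONRY_CU_YD_PER_LIN_FT = 1.62   # from the text
BRICKWORK_COST_PER_CU_YD = 17.0        # from the text, approximate
TOTAL_LINING_COST_PER_LIN_FT = 50.0    # from the text, approximate

# Brickwork cost per linear foot of tunnel.
brickwork_per_lin_ft = ARCH_MASONRY_CU_YD_PER_LIN_FT * BRICKWORK_COST_PER_CU_YD

# Assumed remainder: everything in the $50 figure that is not arch brickwork.
other_per_lin_ft = TOTAL_LINING_COST_PER_LIN_FT - brickwork_per_lin_ft

brickwork_share = brickwork_per_lin_ft / TOTAL_LINING_COST_PER_LIN_FT

print(f"brickwork: ${brickwork_per_lin_ft:.2f} per lin. ft.")
print(f"other work (assumed remainder): ${other_per_lin_ft:.2f} per lin. ft.")
print(f"brickwork share of total: {brickwork_share:.0%}")
```

On these approximate figures the arch brickwork accounts for a little over half of the cost per linear foot.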
Trestles:
  Caps and sills               8 pieces   8 × 8 ins.  × 20 ft.
  Posts                       18 pieces   8 × 8 ins.  × 11 ft.
  Braces                      16 pieces   6 × 4 ins.  ×  7 ft.
Centerings:
  Ribs                        27 pieces   2 × 18 ins. ×  7 ft.
  Bracing                     12 pieces   2 × 8 ins.  ×  7 ft.
  Support to crown lagging     2 pieces   6 × 6 ins.  × 10 ft.
  Crown lagging               20 pieces   3 × 6 ins.  ×  2 ft.
  Side lagging                30 pieces   3 × 6 ins.  × 10 ft.
  Side strips                  2 pieces   2 × 12 ins. ×  9 ft.
  Blocking for rollers         1 piece    5 × 8 ins.  × 12 ft.
6 screw and roller castings complete with bolts and lever; 114 bolts 3⁄4-ins.
in diameter; 71⁄2 U. H. hexagonal nut and 2 cast washers each.
With this arrangement the progress made per day varied from 2
lin. ft. to 3 lin. ft. of lining complete. By work complete is meant the
entire lining, including stone packing between the brickwork and the
rock. On Feb. 23, 1900, 363 ft. of lining had been completed, at a
cost of $33.50 per lin. ft. This cost includes the cost of removing the
old timber, the loose rock above it, and all other work whatsoever.
CHAPTER XXIV.
VENTILATION.
LIGHTING.
Cost.—The cost of a tunnel will depend upon the cost of the two
principal operations required in its construction, viz., the excavation
of the cross section and the lining of the excavation with masonry,
metal, or timber. These two operations may in turn be subdivided, in
respect to expense, into cost of labor and cost of materials. It is a
comparatively simple matter to calculate the cost of the building
materials required to construct a tunnel; but it is very difficult to
estimate with accuracy what the cost of labor will be. The reason for
this is that it is impossible to foresee exactly what the conditions will
be; the character of the material may change greatly as the work
proceeds, increasing or decreasing the cost of excavation; water
may be encountered in quantities which will materially increase the
difficulties of the work, etc. Nevertheless, while accurate preliminary
estimates of cost are not practicable, it is always desirable to
attempt to obtain some idea of the probable expense of the work
before beginning it, and the more usual means of getting at this
point will be discussed here.
Two methods of estimating the cost of tunnel work are
employed. The first is to calculate the probable expense of the
various items of work, based upon the available data, per unit of
length, and then add to this a margin of at least 10% to allow for
contingencies; the second is to apply to the new work the unit cost
of some previous tunnel built under substantially the same
conditions. In the first method it is usual to consider the strutting
and hauling as constituting a part of the work of excavation. To
estimate the cost of excavation involves the consideration of three
general items, viz., the excavation proper, the strutting of the walls
of the excavation, and the hauling of the excavated materials and
the materials of construction.
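
The two estimating methods can be put as a small calculation. The sketch below is illustrative only: the function names and the itemized example costs are hypothetical, the 10% floor on contingencies follows the text, and $93.19 per lin. ft. is the Ofen tunnel figure quoted in this chapter, used here merely as a stand-in for "some previous tunnel".

```python
def estimate_by_items(item_costs_per_ft, length_ft, contingency=0.10):
    """First method: sum itemized unit costs, then add a contingency margin."""
    if contingency < 0.10:
        raise ValueError("the text calls for a margin of at least 10%")
    subtotal_per_ft = sum(item_costs_per_ft.values())
    return subtotal_per_ft * length_ft * (1.0 + contingency)


def estimate_by_precedent(precedent_cost_per_ft, length_ft):
    """Second method: apply the unit cost of a comparable earlier tunnel."""
    return precedent_cost_per_ft * length_ft


# Hypothetical itemized costs in dollars per lin. ft.; strutting and hauling
# are counted under excavation, as the text describes.
items = {"excavation, incl. strutting and hauling": 60.0, "lining": 30.0}

by_items = estimate_by_items(items, length_ft=1000.0)
by_precedent = estimate_by_precedent(93.19, length_ft=1000.0)

print(f"itemized: ${by_items:,.2f}  precedent: ${by_precedent:,.2f}")
```

The second method is only as good as the similarity of conditions between the two tunnels, which is why the text restricts it to work built under substantially the same conditions.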
The cost of excavating the preliminary headings or drifts is
greater per unit of material removed than that of excavating the
enlargement of the section. The cost of bottom drifts is also always
greater than that of top headings, the material penetrated remaining
the same. Mr. Rziha gives the comparative unit costs of excavating
drifts, headings, and enlargement of the profile as follows:—
Bottom drifts             $9.20 per cu. yd.
Top headings               4.80 per cu. yd.
Enlargement of profile     2.84 per cu. yd.
The cost of hauling increases with the length of the tunnel. This
fact and the amount of this increase are indicated by the following actual
prices for the Arlberg tunnel:—
Top heading               $6.76 per cu. yd., increasing 37 cts. per mile
Bottom drift               7.40 per cu. yd., increasing 26 cts. per mile
Enlargement of profile     2.70 per cu. yd., increasing 10 cts. per mile
In all the prices given above, the cost of strutting and hauling is
included in the cost of excavation.
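
The Arlberg prices can be read as a base unit cost plus an increment per mile. The sketch below assumes that the increase is linear and that the distance is measured in miles from the tunnel entrance. Both are assumptions, since the text gives only the base price and the rate of increase. The names `ARLBERG` and `unit_cost` are hypothetical.

```python
# Arlberg figures from the text:
# (base cost in $ per cu. yd., increase in $ per cu. yd. per mile)
ARLBERG = {
    "top heading": (6.76, 0.37),
    "bottom drift": (7.40, 0.26),
    "enlargement of profile": (2.70, 0.10),
}


def unit_cost(operation, miles):
    """Unit cost in $ per cu. yd. at a given distance, assuming a linear increase."""
    base, per_mile = ARLBERG[operation]
    return base + per_mile * miles


for operation in ARLBERG:
    print(f"{operation}: ${unit_cost(operation, 2.0):.2f} per cu. yd. at 2 miles")
```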
The cost of excavation is not always the same for the same
character of materials in different tunnels. The following figures
show the prices paid for the excavation of calcareous rock in four
different German tunnels:—
Berliner Nordhausen Wetzler R.R.   $1.24 per cu. yd.
Ofen                                1.30 per cu. yd.
Stafflach                           2.76 per cu. yd.
Gries                               1.92 per cu. yd.
The method of tunneling has little influence upon the cost of the
work, as shown by the following figures from tunnels excavated
through calcareous rock by different methods:—
Ofen tunnel          Austrian method   $93.19 per lin. ft.
Dorremberg tunnel    Belgian method     86.08 per lin. ft.
Stafflach tunnel     English method     91.69 per lin. ft.
SINGLE-TRACK TUNNELS.
Name of Tunnel    Quality of Soil       Cost per Lin. Ft.   Method of Tunneling
Mt. Cenis         Gneiss                $82.27              Heading
Stalletti         Granite and quartz     62.75              Austrian
Marein            Clay schist            64.36              English
Welsberg          Gravel                165.07              Austrian
Sancina           Clay of 1st variety   129.40              Belgian
Starre            Clay of 2d variety    191.61              Belgian
Cristina          Clay of 3d variety    307.42              Italian
Burk              ...                    83.90              Wide heading
Brafford Ridge    ...                    85.33              Wide heading
Dunbeithe         Limestone              70.47              Wide heading
Fergusson         Sandstone              37.46[16]          Wide heading
Port Henry        Limestone              80.00[17]          Wide heading
Points            Granite                72.00[16]          Wide heading