Accepted Posters & Demos
When: Wednesday December 14
Where: Faculty of Law (via Rosmini, 27) - Foyer
The following posters and demos were accepted at the main conference. Each abstract can be viewed by clicking on the paper title.
Understanding the Behavior of Spark Workloads from Linux Kernel Parameters Perspective
Li Wang (Capital Normal University), Tianni Xu (CAS), Jing Wang (Capital Normal University), Weigong Zhang (Capital Normal University), Xiufeng Sui (CAS), Yungang Bao (CAS)
Although a number of innovative computer systems with high-capacity memory have been built, the design principles behind operating system kernels have remained unchanged for decades. We argue that kernel parameters are a special kind of operating system interface and must be factored into the operation and maintenance of datacenters. To shed some light on the effectiveness of tuning the Linux virtual memory subsystem's parameters when running Spark workloads, we evaluate the benchmarks in a simple standalone deployment mode. Our performance results reveal that some of the Linux memory parameters must be carefully set to efficiently support these processing workloads. We hope this work yields insights for datacenter system operators.
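The abstract does not name the specific parameters evaluated. As a rough illustration only, the sketch below renders `sysctl` commands for a hypothetical virtual-memory tuning profile; the parameter names are real Linux sysctl keys, but the values are illustrative assumptions, not taken from the paper:

```python
# Hypothetical tuning profile for the Linux virtual memory subsystem.
# Keys are real sysctl names; the values are illustrative only.
TUNING_PROFILE = {
    "vm.swappiness": 10,              # prefer keeping heap pages in RAM
    "vm.dirty_ratio": 40,             # allow more dirty pages before blocking writers
    "vm.dirty_background_ratio": 10,  # start background writeback earlier
}

def render_sysctl_commands(profile):
    """Render a list of `sysctl -w key=value` commands for a profile."""
    return ["sysctl -w %s=%d" % (key, value)
            for key, value in sorted(profile.items())]

for command in render_sysctl_commands(TUNING_PROFILE):
    print(command)
```

On a real cluster these commands would be applied (as root) on every worker node before running the benchmark, and reverted between experiments.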
Kanzi: A Distributed, In-memory Key-Value Store
Masoud Hemmatpour (Politecnico di Torino), Bartolomeo Montrucchio (Politecnico di Torino), Maurizio Rebaudengo (Politecnico di Torino), Mohammad Sadoghi (Purdue University)
Traditional database systems sacrifice either availability or partition tolerance in order to offer strict consistency guarantees on data. However, the significant growth of Web-scale applications and the wider array of emerging workloads demand revisiting the need for full transactional consistency. One newly dominant class of workload requires efficiently supporting single-statement transactions consisting of a single Get or Put operation, thus simplifying the consistency model. These simple workloads have given rise to decade-long efforts to build efficient key-value stores, which often rely on a disk-resident, log-structured storage model distributed across many machines. To further expand the scope of key-value stores, in this paper we introduce Kanzi, a distributed, in-memory key-value store built over a shared-memory architecture enabled by remote direct memory access (RDMA) technology. Thanks to its simple data and transaction model, Kanzi may additionally serve as a generic (embedded) caching layer to speed up any disk-resident data-intensive workloads.
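The single-statement transaction model the abstract describes can be made concrete with a toy sketch. This is not Kanzi's RDMA-backed implementation; a local dict stands in for the shared-memory store, purely to show why each transaction being one Get or one Put removes the need for multi-key isolation:

```python
# Toy sketch of a single-statement Get/Put transaction model.
# A local dict stands in for Kanzi's RDMA-backed shared memory (assumption).
class SingleStatementStore:
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """A Put transaction: one statement, one key, atomically installed."""
        self._data[key] = value

    def get(self, key, default=None):
        """A Get transaction: one read, no cross-key isolation required."""
        return self._data.get(key, default)

store = SingleStatementStore()
store.put("user:42", "alice")
print(store.get("user:42"))  # -> alice
```

Because every transaction touches exactly one key, there is no read set or write set to validate, which is what makes the consistency model simple enough for an embedded caching layer.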
Leveraging Query Sensitivity for Practical Private Web Search
Antoine Boutet (Université de Lyon), Sonia Ben Mokhtar (Université de Lyon), Léa Laporte (Université de Lyon), Albin Petit (Université de Lyon)
Several private Web search solutions have been proposed to preserve user privacy while querying search engines. However, most of these solutions are costly in terms of processing, network overhead and latency, as they mostly rely on cryptographic techniques and/or the generation of fake requests. Furthermore, all these solutions protect all queries in the same way, regardless of whether the request contains sensitive content (e.g., religious, political or sexual orientation). Based on an analysis of a real dataset of Web search requests, we show that queries related to sensitive matters are in practice a minority. As a consequence, protecting all queries uniformly results in poor performance, as a large number of queries get overprotected. In this paper, we propose a request sensitivity assessment module that we use to improve the practicality of existing private Web search solutions. We assess the sensitivity of a request in two phases: a semantic sensitivity analysis (based on the topic of the query) and a request linkability analysis (based on the similarity between the current query and the query history of the requester). Finally, the sensitivity assessment is used to adapt the level of protection of a given query to its identified degree of sensitivity: the more sensitive a query is, the more protected it will be. Experiments with a real dataset show that our approach can improve the performance of state-of-the-art private Web search solutions by reducing the number of overprotected queries, while ensuring a similar level of privacy to users, making these solutions more likely to be used in practice.
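The two-phase assessment can be sketched as follows. The topic table and the Jaccard-based linkability test are simplifying assumptions for illustration; the paper's actual semantic analysis and linkability model are richer:

```python
# Sketch of a two-phase query sensitivity assessment (illustrative only).
SENSITIVE_TOPICS = {"religion", "politics", "health"}  # assumed categories

def semantic_sensitivity(query_terms, topic_of):
    """Phase 1: does any query term map to a sensitive topic?"""
    return any(topic_of.get(t) in SENSITIVE_TOPICS for t in query_terms)

def linkability(query_terms, history):
    """Phase 2: max Jaccard similarity between the query and past queries."""
    q = set(query_terms)
    return max((len(q & set(h)) / len(q | set(h)) for h in history),
               default=0.0)

def protection_level(query_terms, topic_of, history):
    """Protect sensitive or highly linkable queries; leave the rest cheap."""
    if semantic_sensitivity(query_terms, topic_of):
        return "high"
    return "medium" if linkability(query_terms, history) > 0.5 else "low"
```

Under this sketch, only queries rated "high" (or "medium") would pay the cost of cryptographic protection or fake-request generation, which is exactly the saving the abstract claims.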
Decentralized Scheduling for Tasklets
Janick Edinger (Universität Mannheim), Dominik Schäfer (Universität Mannheim), Christian Becker (Universität Mannheim)
In this poster abstract, we envision the evolution of the scheduler of the Tasklet system from a centralized to a distributed approach. The Tasklet system is a middleware for distributed computing applications that allows developers to offload computation to remote resources via self-contained units of computation - the so-called Tasklets. The current implementation of the Tasklet scheduler is based on a broker overlay network in which one broker centrally manages a pool of resources. While this allows for central control and a consistent global view of the resources in the system, this architecture risks performance bottlenecks that can be avoided by decentralized resource management. This poster makes three contributions. First, we present the Tasklet system and the current centralized scheduling algorithm. Second, we sketch a hybrid resource management scheme that uses cache lists to avoid redundant communication between resource consumers and resource brokers. Finally, we propose a three-level scheduling architecture.
Hairspring: Online graph processing middleware for temporal networks
Jaewook Byun (KAIST), Sungpil Woo (KAIST), Daeyoung Kim (KAIST)
Research on temporal graphs has been conducted across interdisciplinary fields and applied to various kinds of networks: online social networks, cell biology networks, neural networks, ecological networks, etc. However, processing and understanding these networks can be complicated for application developers due to their high velocity and volume. In addition, the heterogeneity of the networks hinders their unified usage. We therefore propose Hairspring, a novel online graph processing middleware for temporal networks. The middleware is based on a temporal property graph, for which we extend the property graph model Blueprints with temporal features. On top of the temporal property graph, we present and prototype a publish-subscribe architecture that publishes graph elements and notifies subscribers of processed graph elements of interest on the fly.
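The publish-subscribe delivery of timestamped graph elements can be sketched in a few lines. The event shape and subscription predicate below are assumptions for illustration; Blueprints itself is a Java API and Hairspring's actual architecture is not reproduced here:

```python
# Sketch of publish/subscribe delivery of timestamped graph elements.
class TemporalGraphBus:
    def __init__(self):
        self.subscribers = []  # (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        """Register interest in graph elements matching a predicate."""
        self.subscribers.append((predicate, callback))

    def publish_edge(self, src, dst, timestamp, properties):
        """Publish a timestamped edge and notify interested subscribers."""
        edge = {"src": src, "dst": dst, "t": timestamp, "props": properties}
        for predicate, callback in self.subscribers:
            if predicate(edge):
                callback(edge)
        return edge

bus = TemporalGraphBus()
bus.subscribe(lambda e: e["src"] == "alice",
              lambda e: print("notify:", e["t"]))
bus.publish_edge("alice", "bob", 1481673600, {"type": "follows"})
```

Attaching the timestamp to every published element is what lets subscribers reason about the graph's evolution rather than only its current snapshot.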
Toward an Easy Configuration of Location Privacy Protection Mechanisms
Sophie Cerf (Univ. Grenoble-Alpes), Bogdan Robu (Univ. Grenoble-Alpes), Nicolas Marchand (Université de Lyon), Antoine Boutet (Université de Lyon), Vincent Primault (Université de Lyon), Sonia Ben Mokhtar (Université de Lyon), Sara Bouchenak (Université de Lyon)
The widespread adoption of Location-Based Services (LBSs) has come with controversy about privacy. While leveraging location information improves services through geo-contextualization, it raises privacy concerns, as new knowledge can be inferred from location records, such as home and work places, habits or religious beliefs. To overcome this problem, several Location Privacy Protection Mechanisms (LPPMs) have been proposed in recent years. However, every mechanism comes with its own configuration parameters that directly impact the privacy guarantees and the resulting utility of protected data. In this context, it can be difficult for a non-expert system designer to choose the configuration parameters appropriate for the expected privacy and utility. In this paper, we present a framework enabling the easy configuration of LPPMs. To achieve this, our framework performs an offline, in-depth automated analysis of LPPMs to provide the formal relationship between their configuration parameters and both privacy and utility metrics. This framework is modular: by using different metrics, a system designer is able to fine-tune her LPPM according to her expected privacy and utility guarantees (i.e., the guarantee itself and its level). To illustrate the capability of our framework, we analyse Geo-Indistinguishability (a well-known differentially private LPPM) and provide the formal relationship between its configuration parameter and two privacy and utility metrics.
Self-Stabilizing Reconfiguration
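Geo-Indistinguishability's standard mechanism is planar Laplace noise, whose single configuration parameter epsilon is what the framework's analysis relates to privacy and utility. A minimal sketch, with locations as abstract (x, y) coordinates (the latitude/longitude mapping is omitted):

```python
import math
import random

def planar_laplace(x, y, epsilon):
    """Perturb (x, y) with planar Laplace noise calibrated by epsilon.

    The direction is uniform; the radius follows a Gamma(2, 1/epsilon)
    distribution, i.e. the sum of two exponentials with rate epsilon.
    """
    theta = random.uniform(0.0, 2.0 * math.pi)
    r = random.expovariate(epsilon) + random.expovariate(epsilon)
    return x + r * math.cos(theta), y + r * math.sin(theta)
```

The expected displacement is 2/epsilon, so a smaller epsilon means more noise (more privacy, less utility): precisely the trade-off a system designer tunes via the framework.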
Shlomi Dolev (Ben-Gurion University), Chryssis Georgiou (University of Cyprus), Ioannis Marcoullis (University of Cyprus), Elad Michael Schiller (Chalmers University of Technology)
We consider distributed systems built over dynamic asynchronous environments. A ``configuration'' is a set of active processors (servers), typically used to provide services to the system's participants. Over time, members of the configuration set may leave, rendering service provision difficult and eventually impossible. It is thus necessary to ``reconfigure'' by removing departed configuration members from the configuration set and by adding other participants to it. Existing reconfiguration solutions are based on starting the system in a consistent configuration and preserving consistency during execution, but this is not always achieved. Large-scale asynchronous message-passing networks are subject to transient faults due to temporary hardware or software malfunctions, short-lived violations of the assumed churn rates, or violations of correctness invariants, such as uniform agreement among all current participants about the current configuration. Fault-tolerant systems that are designed to converge back to the desired behavior after transient faults stop are called ``self-stabilizing''. In this work we present the first, to our knowledge, self-stabilizing middleware service that achieves reconfiguration. Namely, the service automatically recovers the system from transient faults by providing a new configuration known to all participants. Contrary to most existing non-stabilizing solutions, we assume only bounded local storage and message size, and we also consider the case where the current configuration membership has collapsed completely or to a large degree.
Chronograph – A Distributed Platform for Event-Sourced Graph Computing
Benjamin Erb (Ulm University), Frank Kargl (Ulm University)
Many data-driven applications require mechanisms for processing interconnected or graph-based data sets. Several platforms exist for offline processing of such data, while fewer solutions address online computations on dynamic graphs. We combined a modified actor model, an event-sourced persistence layer, and a vertex-based, asynchronous programming model in order to unify event-driven and graph-based computations. Our distributed Chronograph platform supports both near-real-time and batch computations on dynamic, event-driven graph topologies, and enables full history tracking of the evolving graphs over time.
Evidential Reasoning Based Fault Diagnosis
Balaji Viswanathan (IBM Research), Seep Goel (IBM Research), Mudit Verma (IBM Research), Ravi Kothari (IBM Research)
Fault diagnosis in IT environments is complicated because (i) most monitors have shared specificity (e.g., high memory utilization can result from a large number of causes), (ii) it is hard to deploy and maintain enough sensors to ensure adequate coverage, and (iii) some functionality may be provided as-a-service by external parties with limited diagnostic information. To systematically incorporate uncertainty and to fuse information from multiple sources, we propose using the Dempster-Shafer Theory (DST) of evidential reasoning for fault diagnosis and show its efficacy in the context of a J2EE application.
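The core of DST evidence fusion is Dempster's rule of combination, sketched below. The monitors and fault hypotheses are illustrative and not taken from the paper's J2EE case study; note how each monitor's shared specificity (point (i) above) shows up as mass on a *set* of causes rather than a single cause:

```python
def combine(m1, m2):
    """Fuse two mass functions over frozenset hypotheses (Dempster's rule)."""
    fused, conflict = {}, 0.0
    for a, w1 in m1.items():
        for b, w2 in m2.items():
            inter = a & b
            if inter:
                fused[inter] = fused.get(inter, 0.0) + w1 * w2
            else:
                conflict += w1 * w2  # mass on contradictory hypothesis pairs
    # Renormalize by the non-conflicting mass.
    return {h: w / (1.0 - conflict) for h, w in fused.items()}

# Two monitors that each only narrow the cause down to a set (illustrative).
memory_monitor = {frozenset({"leak"}): 0.6,
                  frozenset({"leak", "spike"}): 0.4}
traffic_monitor = {frozenset({"spike"}): 0.5,
                   frozenset({"leak", "spike"}): 0.5}
fused = combine(memory_monitor, traffic_monitor)
```

After fusion, the singleton hypothesis with the highest mass (here "leak") becomes the leading diagnosis, even though neither monitor could identify it alone.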
ALMA – GC-assisted JVM Live Migration for Java Server Applications
Rodrigo Bruno (University of Lisbon), Paulo Ferreira (University of Lisbon)
Live migration of Java Virtual Machines (JVMs) consumes significant amounts of time and resources, imposing considerable application performance overhead. This problem is especially hard when memory modified by applications changes faster than it can be transferred through the network to a remote host. Current solutions to this problem resort to techniques that depend on high-speed networks and application throttling, require lots of CPU time to compress memory, or need explicit assistance from the application. We propose a novel approach: Garbage Collector (GC) assisted JVM Live Migration for Java Server Applications (ALMA). ALMA builds a migration snapshot containing a minimal amount of application state by taking into account the amount of reachable memory (i.e., live data) detected by the GC. The main novelty of ALMA is that it analyzes the JVM heap looking for regions in which a collection phase is advantageous w.r.t. the available network bandwidth (i.e., it pays to collect because a significant amount of memory will then not be part of the snapshot). ALMA is implemented on OpenJDK 8 and extends CRIU (a Linux disk-based process checkpoint/restore tool) to support process live migration over the network. We evaluate ALMA using well-known JVM performance benchmarks (SPECjvm2008 and DaCapo) and compare it to previous approaches. ALMA shows very good performance results.
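The per-region "does it pay to collect?" decision can be illustrated with a simple cost model. This model is an assumption for exposition; ALMA's actual heuristic operates inside OpenJDK's garbage collector:

```python
# Illustrative cost model: collect a heap region before migration only if
# the transfer time saved on dead memory exceeds the estimated GC cost.
def should_collect(region_bytes, live_bytes, bandwidth_bps, gc_seconds):
    """Return True if collecting the region shrinks total migration time."""
    dead_bytes = region_bytes - live_bytes
    transfer_saved = dead_bytes * 8.0 / bandwidth_bps  # seconds saved on the wire
    return transfer_saved > gc_seconds

# A mostly-dead 256 MiB region over a 1 Gb/s link: collecting pays off.
print(should_collect(256 * 2**20, 16 * 2**20, 10**9, 0.5))
```

On a fast network the saved transfer time shrinks, so fewer regions are worth collecting, which matches the abstract's point that the decision depends on the available bandwidth.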
A Semantic-based Approach for Resource Discovery and Allocation in Distributed Middleware
Michele Ruta (Politecnico di Bari), Floriano Scioscia (Politecnico di Bari), Eliana Bove (Politecnico di Bari), Annarita Cinquepalmi (Politecnico di Bari), Eugenio Di Sciascio (Politecnico di Bari)
This paper presents a knowledge-based approach for resource discovery, allotment and sharing in distributed pervasive scenarios. The proposed framework enables semantic-based resource retrieval exploiting non-standard inference services and a novel method for ontology dissemination and on-the-fly reconstruction. The approach can augment any publish/subscribe message-oriented middleware. A prototype was implemented and tested to prove correctness of the approach and get early performance evaluations.
RConnected: A Middleware for Mobile Services in IoT Environments
Miguel Almeida Carvalho (Universidade de Lisboa), João Nuno Silva (Universidade de Lisboa)
With the advent of the Internet of Things and the availability of devices with low processing power, the management, communication and programming efforts face new requirements. In this paper, we present RConnected, a middleware that allows autonomous interaction of those devices with users, allowing the provisioning of services and facilitating the development of mobile applications.
Writing a distributed computing application in 7 minutes with Tasklets
Dominik Schäfer (Universität Mannheim), Janick Edinger (Universität Mannheim), Martin Breitbach (Universität Mannheim), Christian Becker (Universität Mannheim)
This demo paper introduces a middleware for distributed computing applications - the Tasklet system. The Tasklet system allows developers to execute self-contained units of computation - the so-called Tasklets - in a pool of heterogeneous computing devices, including desktop computers, cloud resources, mobile devices, and graphics processing units. In this demonstration of the Tasklet system, we visualize the otherwise transparent process of computation offloading, from the development of an application to the actual distributed execution of tasks. While existing systems have high setup costs, the Tasklet system emphasizes ease of use and seamless integration of various heterogeneous devices. In the demonstration, we focus on three key benefits of the Tasklet system. First, we demonstrate the usability of the system by developing a distributed computing application live in less than ten minutes. Second, we show how heterogeneous devices can be set up and join the resource pool during the execution of Tasklets. With a monitoring tool, we visualize how the computational workload is split among these resources. Third, we introduce the concept of quality of computation to tailor the otherwise generic computing framework to the requirements of individual applications.
MOS: A Bandwidth-Efficient Cross-Platform Middleware for Publish/Subscribe
Christoph Doblander (Technische Universität München), Simon Zimmermann (Technische Universität München), Kaiwen Zhang (Technische Universität München), Hans-Arno Jacobsen (Technische Universität München)
Shared dictionary compression is known to be an efficient compression method for pub/sub. In practice, bandwidth reductions of more than 80% are achievable for JSON or XML data formats. In contrast to compression techniques such as GZip or Deflate, a dictionary is needed to compress and decompress messages. Generating a dictionary is a CPU-expensive task, and sharing it introduces bandwidth overhead. Furthermore, the dictionary must be continuously maintained to keep compression performance high. We developed MOS, a cross-platform middleware for managing shared dictionary compression tasks, including dictionary propagation, compression/decompression, and periodic maintenance. We provide a developer API to interact with the MQTT-based pub/sub infrastructure. Our demo presents an example application built on top of MOS that shows the performance of the shared dictionary compression scheme.
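The effect of a shared dictionary on small pub/sub messages can be reproduced with zlib's preset-dictionary support. The sample dictionary and JSON payload below are illustrative; MOS's dictionary generation and MQTT integration are not shown:

```python
import zlib

# A dictionary seeded with the field names the messages share (illustrative).
SHARED_DICT = b'{"sensor_id": "", "temperature": 0.0, "timestamp": 0}'

def compress(message, zdict):
    c = zlib.compressobj(zdict=zdict)
    return c.compress(message) + c.flush()

def decompress(payload, zdict):
    d = zlib.decompressobj(zdict=zdict)
    return d.decompress(payload) + d.flush()

message = b'{"sensor_id": "s-17", "temperature": 21.5, "timestamp": 1481673600}'
with_dict = compress(message, SHARED_DICT)
without_dict = zlib.compress(message)
assert decompress(with_dict, SHARED_DICT) == message
```

Both endpoints must hold the identical dictionary for decompression to work, which is exactly the propagation and maintenance problem MOS takes over from the developer.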
Nathaniel Wendt (The University of Texas at Austin), Christine Julien (The University of Texas at Austin)
SpatioTemporal Traveler is a mobile travel companion application for Android that showcases the contributions of the PACO middleware. PACO, motivated by increasing privacy concerns, onloads and indexes spatiotemporal context information and provides a query interface that gives users control over data dissemination. SpatioTemporal Traveler allows users to indicate places of interest and passively collects observation data points as travelers move about their tourist destinations. Users can then retrieve views depicting their coverage or knowledge of their target destinations or perform custom queries through the app’s map interface. SpatioTemporal Traveler also includes a desktop visualization that demonstrates how the underlying data structure within PACO can be tuned to adjust the size and lossiness of the structure.
SEMComm: Sharing Electronic Medical Records using Device-to-Device Communication
Tomasz Kalbarczyk (The University of Texas at Austin), Christine Julien (The University of Texas at Austin)
This demonstration showcases SEMComm, an Android application that allows an individual patient's personal device (e.g., smartphone) to collect health data from nearby medical IoT devices and to share pieces of medical records with the devices of nearby medical personnel (e.g., doctors and nurses) using direct device-to-device (D2D) links. SEMComm uses XD, a middleware that enables device discovery, context sharing, and data transmission using heterogeneous D2D communication technologies. Current approaches for sharing electronic medical records use onerous HIPAA-compliant cloud-based solutions that are costly for hospitals and require patients to release sensitive medical records to an external server. SEMComm allows patients to maintain fine-grained control over who has access to their electronic medical data, while simultaneously allowing the patient's record to collect data from multiple medical devices, all without the need for an external network or cloud storage. Our demonstration shows how XD enables SEMComm to collect data from a blood pressure cuff and a heart-rate monitor and then to share medical data with neighboring devices using a mixed set of D2D communication links.
Privacy-enhancing Federated Middleware for the Internet of Things
Paul Fremantle (University of Portsmouth), Benjamin Aziz (University of Portsmouth)
OAuthing and IGNITE are federated middleware components that together provide an improved model for sharing data from Internet of Things (IoT) devices with cloud services. OAuthing provides an identity broker and authorization server that issues anonymised OAuth2 credentials based on upstream identities from the user's identity provider. IGNITE is a cloud-based message router that uses identity and authorization policies from OAuthing to instantiate, for each user, a cloud-based container for sharing their data. The demonstration will show the post-manufacturing process of registering a device with OAuthing, together with a user ``claiming'' a device. Once the device is claimed, it will be connected to a third-party cloud service, with full user consent to create a policy. We will then demonstrate the policy in action and the creation of a cloud instance on behalf of the user.