Optimising Across Relational and Linear Algebra in Parallel Analytics Pipelines

18 October 2018

Date & Time: October 18, 2018, 11:30 am
Location: Room B104, Via Sommarive 9 - Polo Ferrari 2 (Povo, TN)

Speaker

  • Asterios Katsifodimos, TU Delft, Netherlands

Abstract

My talk will be split into two parts. In the first part, I will briefly introduce a deeply embedded language in Scala that lets users author scalable programs using two abstract data types, DataBag and Matrix, and enables joint optimizations over both relational and linear algebra. In the second part, I will discuss a concrete optimization that can be applied to analysis programs comprising both linear and relational algebra operations. More specifically, I will present BlockJoin, a distributed join algorithm presented at this year's VLDB, which emits block-partitioned results to subsequent linear algebra operations such as matrix multiplications. BlockJoin applies database techniques known from columnar processing, such as index-joins and late materialisation, in the context of parallel dataflow engines, in order to minimise very expensive shuffling costs.

About the Speaker

Asterios Katsifodimos is an Assistant Professor at Delft University of Technology (TU Delft) and a member of the Web Information Systems group. Before joining TU Delft, he spent a year with the SAP Innovation Center in Berlin, working on scale-out architectures for machine learning inference and training. Before SAP, he was a senior researcher in the database systems group at TU Berlin, headed by Volker Markl. He received his PhD from INRIA Saclay & Université Paris-Sud in 2013, under the supervision of Ioana Manolescu. Prior to that, he was a member of the High Performance Computing systems Lab (HPCL) at the University of Cyprus, working with Marios Dikaiakos.
He works in the broad area of scalable data management, and more specifically on Big Data analytics and management, stream processing, database language models, and query optimization. He is the information director of SIGMOD Record and a local organization chair of SIGMOD 2019, and he regularly serves on the program committees of ICDE, SIGMOD, and VLDB.

Contact Person: velgias [at] unitn.it (Yannis Velegrakis)