ECiDA: Evolutionary Changes in Data Analysis

Frank Blaauw, Roy Overbeek, Toon Albers, Jeroen Vlek, Mario Maessen, Jan Gooijer, Elena Lazovik, Farhad Arbab, Alexander Lazovik

Research output: Contribution to conference > Poster > Academic


Abstract

Modern data analysis platforms all too often assume that the
application and underlying data flow are static. That is, such platforms
generally do not implement the capabilities to update individual components
of running pipelines without restarting the pipeline, and they rely
on data sources to remain unchanged while they are being used. However,
in reality these assumptions do not hold: data scientists come up with
new methods to analyze data all the time, and data sources are almost by
definition dynamic. Companies performing data science analyses must
either accept that their pipeline goes down during an update,
or run a duplicate setup of their often costly infrastructure
that continues the pipeline operations.
In this research we present the Evolutionary Changes in Data Analysis
(ECiDA) platform, with which we show how evolution and data science
can go hand in hand. ECiDA aims to bridge the gap between
engineers who build large-scale computation platforms on the one
hand, and data scientists who perform analyses on large quantities of data
on the other, while making change a first-class citizen. ECiDA allows data
scientists to build their data science pipelines on scalable infrastructures,
and to make changes to them while they remain up and running. Such
changes can range from parameter changes in individual pipeline components
to general changes in network topology. Changes may also be
initiated by an ECiDA pipeline itself as part of a diagnostic response: for
instance, it may dynamically replace a data source that has become unavailable
with one that is available. To make sure the platform remains in
a consistent state while performing these updates, ECiDA uses a set of automatic
formal verification methods, such as constraint programming and
AI planning, to transparently check the validity of updates and prevent
undesired behavior.
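
As a rough illustration of the kind of validity check described above, the following Python sketch rejects a pipeline update unless the updated pipeline satisfies a few simple constraints. It is not ECiDA's implementation: the names Component, violations and apply_update, the data type labels, and the two constraints are all hypothetical, and a plain rule check stands in here for the constraint programming and AI planning methods the abstract mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical pipeline component (not an ECiDA type)."""
    name: str
    consumes: frozenset  # data types this component needs as input
    produces: frozenset  # data types this component emits
    available: bool = True


def violations(components, edges):
    """Return a list of constraint violations; an empty list means valid."""
    problems = []
    # Constraint 0: every edge must connect two known components.
    for src, dst in edges:
        for end in (src, dst):
            if end not in components:
                problems.append(f"edge ({src}, {dst}) references unknown {end}")
    for name, comp in components.items():
        # Constraint 1: every component in the pipeline must be available.
        if not comp.available:
            problems.append(f"{name} is unavailable")
        # Constraint 2: upstream components must deliver all required inputs.
        delivered = set()
        for src, dst in edges:
            if dst == name and src in components:
                delivered |= components[src].produces
        missing = comp.consumes - delivered
        if missing:
            problems.append(f"{name} lacks inputs: {sorted(missing)}")
    return problems


def apply_update(components, edges, new_components, new_edges):
    """Switch to the proposed pipeline only if it violates no constraint."""
    problems = violations(new_components, new_edges)
    if problems:
        return components, edges, problems  # reject: keep the running pipeline
    return new_components, new_edges, []


# Example: replace an unavailable data source with an available one.
sensor_a = Component("sensor_a", frozenset(), frozenset({"temperature"}),
                     available=False)
sensor_b = Component("sensor_b", frozenset(), frozenset({"temperature"}))
model = Component("model", frozenset({"temperature"}), frozenset({"forecast"}))

running = {"sensor_a": sensor_a, "model": model}
running_edges = [("sensor_a", "model")]

proposed = {"sensor_b": sensor_b, "model": model}
proposed_edges = [("sensor_b", "model")]

components, edges, problems = apply_update(
    running, running_edges, proposed, proposed_edges)
```

In this sketch the proposed update is accepted because sensor_b is available and supplies the input that model needs; an update that dropped the edge into model would be rejected and the running pipeline kept unchanged.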
Original language: English
Publication status: Published - 31-Mar-2019
Event: ICT.Open - Hilversum, Netherlands
Duration: 19-Mar-2019 to 20-Mar-2019
https://www.nwo.nl/actueel/evenementen/ict+open
