Efficient Data Stream Sampling on Apache Flink
Sampling is considered to be a core component of data analysis making it possible to provide a synopsis of possibly large amounts of data by maintaining only subsets or multisubsets of it. In the context of data streaming, an emerging processing paradigm where data is assumed to be unbounded, sampling o...
Main Author: | Vlachou-Konchylaki, Martha |
---|---|
Format: | Others |
Language: | English |
Published: | KTH, Skolan för datavetenskap och kommunikation (CSC), 2016 |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-183397 |
Similar Items
- FlinkCheck: Property-Based Testing for Apache Flink
  by: Cristina Valentina Espinosa, et al.
  Published: (2019-01-01)
- Towards autoscaling of Apache Flink jobs
  by: Varga Balázs, et al.
  Published: (2021-06-01)
- SPARQL2Flink: Evaluation of SPARQL Queries on Apache Flink
  by: Oscar Ceballos, et al.
  Published: (2021-07-01)
- FlinkNDB : Guaranteed Data Streaming Using External State
  by: Asif, Muhammad Haseeb
  Published: (2021)
- Influencing Factors in the Scalability of Distributed Stream Processing Jobs
  by: Giselle Van Dongen, et al.
  Published: (2021-01-01)