MacroBase: Prioritizing Attention in Fast Data

As data volumes continue to rise, manual inspection is becoming increasingly untenable. In response, we present MacroBase, a data analytics engine that prioritizes end-user attention in high-volume fast data streams. MacroBase enables efficient, accurate, and modular analyses that highlight and aggr...

Bibliographic Details
Main Authors: Bailis, Peter (Author), Gan, Edward (Author), Madden, Samuel (Author), Narayanan, Deepak (Author), Rong, Kexin (Author), Suri, Sahaana (Author)
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory (Contributor)
Format: Article
Language: English
Published: Association for Computing Machinery (ACM), 2021-11-08T20:13:09Z.
Online Access: Get fulltext
LEADER 01611 am a22002173u 4500
001 137811
042 |a dc 
100 1 0 |a Bailis, Peter  |e author 
100 1 0 |a Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory  |e contributor 
700 1 0 |a Gan, Edward  |e author 
700 1 0 |a Madden, Samuel  |e author 
700 1 0 |a Narayanan, Deepak  |e author 
700 1 0 |a Rong, Kexin  |e author 
700 1 0 |a Suri, Sahaana  |e author 
245 0 0 |a MacroBase: Prioritizing Attention in Fast Data 
260 |b Association for Computing Machinery (ACM),   |c 2021-11-08T20:13:09Z. 
856 |z Get fulltext  |u https://hdl.handle.net/1721.1/137811 
520 |a As data volumes continue to rise, manual inspection is becoming increasingly untenable. In response, we present MacroBase, a data analytics engine that prioritizes end-user attention in high-volume fast data streams. MacroBase enables efficient, accurate, and modular analyses that highlight and aggregate important and unusual behavior, acting as a search engine for fast data. MacroBase is able to deliver order-of-magnitude speedups over alternatives by optimizing the combination of explanation and classification tasks and by leveraging a new reservoir sampler and heavy-hitters sketch specialized for fast data streams. As a result, MacroBase delivers accurate results at speeds of up to 2M events per second per query on a single core. The system has delivered meaningful results in production, including at a telematics company monitoring hundreds of thousands of vehicles, 
546 |a en 
655 7 |a Article 
773 |t 10.1145/3035918.3035928