Leveraging Parallel Data Processing Frameworks with Verified Lifting

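Many parallel data frameworks have been proposed in recent years that let sequential programs access parallel processing. To capitalize on the benefits of such frameworks, existing code must often be rewritten to the domain-specific languages that each framework supports. This rewriting, tedious and error-prone, also requires developers to choose the framework that best optimizes performance given a specific workload. This paper describes Casper, a novel compiler that automatically retargets sequential Java code for execution on Hadoop, a parallel data processing framework that implements the MapReduce paradigm. Given a sequential code fragment, Casper uses verified lifting to infer a high-level summary expressed in our program specification language that is then compiled for execution on Hadoop. We demonstrate that Casper automatically translates Java benchmarks into Hadoop. The translated results execute on average 3.3x faster than the sequential implementations and scale better, as well, to larger datasets.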


Bibliographic Details
Main Authors:	Maaz Bin Safeer Ahmad, Alvin Cheung
Format:	Article
Language:	English
Published:	Open Publishing Association 2016-11-01
Series:	Electronic Proceedings in Theoretical Computer Science
Online Access:	http://arxiv.org/pdf/1611.07623v1

id	doaj-8c18456d892640838bd9d2d7dd086549
record_format	Article
spelling	doaj-8c18456d892640838bd9d2d7dd086549 | 2020-11-25T00:12:45Z | eng | Open Publishing Association | Electronic Proceedings in Theoretical Computer Science | 2075-2180 | 2016-11-01 | 229 | Proc. SYNT 2016 | 67 | 83 | 10.4204/EPTCS.229.7:11 | Leveraging Parallel Data Processing Frameworks with Verified Lifting | Maaz Bin Safeer Ahmad | Alvin Cheung | http://arxiv.org/pdf/1611.07623v1
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Maaz Bin Safeer Ahmad; Alvin Cheung
author_facet	Maaz Bin Safeer Ahmad; Alvin Cheung
author_sort	Maaz Bin Safeer Ahmad
title	Leveraging Parallel Data Processing Frameworks with Verified Lifting
publisher	Open Publishing Association
series	Electronic Proceedings in Theoretical Computer Science
issn	2075-2180
publishDate	2016-11-01
description	Many parallel data frameworks have been proposed in recent years that let sequential programs access parallel processing. To capitalize on the benefits of such frameworks, existing code must often be rewritten to the domain-specific languages that each framework supports. This rewriting, tedious and error-prone, also requires developers to choose the framework that best optimizes performance given a specific workload. This paper describes Casper, a novel compiler that automatically retargets sequential Java code for execution on Hadoop, a parallel data processing framework that implements the MapReduce paradigm. Given a sequential code fragment, Casper uses verified lifting to infer a high-level summary expressed in our program specification language that is then compiled for execution on Hadoop. We demonstrate that Casper automatically translates Java benchmarks into Hadoop. The translated results execute on average 3.3x faster than the sequential implementations and scale better, as well, to larger datasets.
url	http://arxiv.org/pdf/1611.07623v1


Similar Items