Leveraging Parallel Data Processing Frameworks with Verified Lifting
Many parallel data frameworks have been proposed in recent years that let sequential programs access parallel processing. To capitalize on the benefits of such frameworks, existing code must often be rewritten in the domain-specific languages that each framework supports. This rewriting, tedious and...
Main Authors: | Maaz Bin Safeer Ahmad, Alvin Cheung |
---|---|
Format: | Article |
Language: | English |
Published: | Open Publishing Association, 2016-11-01 |
Series: | Electronic Proceedings in Theoretical Computer Science |
Online Access: | http://arxiv.org/pdf/1611.07623v1 |
id |
doaj-8c18456d892640838bd9d2d7dd086549 |
---|---|
record_format |
Article |
spelling |
doaj-8c18456d892640838bd9d2d7dd086549; 2020-11-25T00:12:45Z; eng; Open Publishing Association; Electronic Proceedings in Theoretical Computer Science; 2075-2180; 2016-11-01; 229; Proc. SYNT 2016; 67; 83; 10.4204/EPTCS.229.7:11; Leveraging Parallel Data Processing Frameworks with Verified Lifting; Maaz Bin Safeer Ahmad; Alvin Cheung; http://arxiv.org/pdf/1611.07623v1 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Maaz Bin Safeer Ahmad, Alvin Cheung |
title |
Leveraging Parallel Data Processing Frameworks with Verified Lifting |
publisher |
Open Publishing Association |
series |
Electronic Proceedings in Theoretical Computer Science |
issn |
2075-2180 |
publishDate |
2016-11-01 |
description |
Many parallel data frameworks have been proposed in recent years that let sequential programs access parallel processing. To capitalize on the benefits of such frameworks, existing code must often be rewritten in the domain-specific languages that each framework supports. This rewriting, tedious and error-prone, also requires developers to choose the framework that best optimizes performance given a specific workload.
This paper describes Casper, a novel compiler that automatically retargets sequential Java code for execution on Hadoop, a parallel data processing framework that implements the MapReduce paradigm. Given a sequential code fragment, Casper uses verified lifting to infer a high-level summary, expressed in our program specification language, that is then compiled for execution on Hadoop. We demonstrate that Casper automatically translates Java benchmarks into Hadoop programs. The translated results execute on average 3.3x faster than the sequential implementations and also scale better to larger datasets. |
url |
http://arxiv.org/pdf/1611.07623v1 |
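
To make the description above more concrete, here is a minimal Java sketch of the kind of rewrite it describes: a sequential loop next to a hand-written map/reduce-style equivalent. It is illustrative only. It is not Casper's output, it does not use Casper's program specification language or the Hadoop API, and the names `LiftingSketch`, `sequentialSumOfSquares`, and `mapReduceSumOfSquares` are hypothetical. Java parallel streams stand in for the map and reduce stages.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not generated by Casper and not tied to Hadoop.
final class LiftingSketch {

    // Sequential fragment: the kind of loop a developer writes by hand.
    static int sequentialSumOfSquares(List<Integer> data) {
        int sum = 0;
        for (int x : data) {
            sum += x * x;
        }
        return sum;
    }

    // Map/reduce-style summary of the same computation, written by hand.
    // map: square each element; reduce: combine partial results with +.
    static int mapReduceSumOfSquares(List<Integer> data) {
        return data.parallelStream()
                   .map(x -> x * x)
                   .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4);
        System.out.println(sequentialSumOfSquares(data)); // prints 30
        System.out.println(mapReduceSumOfSquares(data));  // prints 30
    }
}
```

The two methods agree because integer addition is associative with identity 0, so partial sums can be combined in any grouping. Establishing that kind of equivalence between the original loop and its summary is, per the description, what the verification step in verified lifting is for; this sketch only checks it on one input.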