Decoupling algorithms from schedules for easy optimization of image processing pipelines

Bibliographic Details
Main Authors: Adams, Andrew (Contributor), Paris, Sylvain (Author), Levoy, Marc (Author), Ragan-Kelley, Jonathan Millar (Contributor), Amarasinghe, Saman P. (Contributor), Durand, Fredo (Contributor)
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory (Contributor), Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Contributor)
Format: Article
Language: English
Published: Association for Computing Machinery (ACM), 2014-03-28T13:45:27Z.
Online Access: http://hdl.handle.net/1721.1/85942
LEADER 03062 am a22003493u 4500
001 85942
042 |a dc 
100 1 0 |a Adams, Andrew  |e author 
100 1 0 |a Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory  |e contributor 
100 1 0 |a Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science  |e contributor 
100 1 0 |a Ragan-Kelley, Jonathan Millar  |e contributor 
100 1 0 |a Adams, Andrew  |e contributor 
100 1 0 |a Amarasinghe, Saman P.  |e contributor 
100 1 0 |a Durand, Fredo  |e contributor 
700 1 0 |a Paris, Sylvain  |e author 
700 1 0 |a Levoy, Marc  |e author 
700 1 0 |a Ragan-Kelley, Jonathan Millar  |e author 
700 1 0 |a Amarasinghe, Saman P.  |e author 
700 1 0 |a Durand, Fredo  |e author 
245 0 0 |a Decoupling algorithms from schedules for easy optimization of image processing pipelines 
260 |b Association for Computing Machinery (ACM),   |c 2014-03-28T13:45:27Z. 
856 |z Get fulltext  |u http://hdl.handle.net/1721.1/85942 
520 |a Using existing programming tools, writing high-performance image processing code requires sacrificing readability, portability, and modularity. We argue that this is a consequence of conflating what computations define the algorithm with decisions about storage and the order of computation. We refer to these latter two concerns as the schedule, including choices of tiling, fusion, recomputation vs. storage, vectorization, and parallelism. We propose a representation for feed-forward imaging pipelines that separates the algorithm from its schedule, enabling high performance without sacrificing code clarity. This decoupling simplifies the algorithm specification: images and intermediate buffers become functions over an infinite integer domain, with no explicit storage or boundary conditions. Imaging pipelines are compositions of functions. Programmers separately specify scheduling strategies for the various functions composing the algorithm, which allows them to efficiently explore different optimizations without changing the algorithmic code. We demonstrate the power of this representation by expressing a range of recent image processing applications in an embedded domain-specific language called Halide, and compiling them for ARM, x86, and GPUs. Our compiler targets SIMD units, multiple cores, and complex memory hierarchies. We demonstrate that it can handle algorithms such as a camera raw pipeline, the bilateral grid, fast local Laplacian filtering, and image segmentation. The algorithms expressed in our language are both shorter and faster than state-of-the-art implementations. 
520 |a National Science Foundation (U.S.) (Grant 0964004) 
520 |a National Science Foundation (U.S.) (Grant 0964218) 
520 |a National Science Foundation (U.S.) (Grant 0832997) 
520 |a United States. Dept. of Energy (Award DE-SC0005288) 
520 |a Cognex Corporation 
520 |a Adobe Systems 
546 |a en_US 
655 7 |a Article 
773 |t ACM Transactions on Graphics
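
The separation of algorithm and schedule described in the abstract can be illustrated with a small sketch. This is plain Python, not Halide, and it is not the authors' implementation. All names here (make_input, counted, blur_x, blur_y, schedule_recompute, schedule_store, realize) are hypothetical and chosen only for illustration. The algorithm is a two-stage 3x3 box sum written as a composition of functions over integer coordinates. The two "schedules" differ only in whether the intermediate stage is recomputed at each use or stored, which is one of the trade-offs the abstract lists (recomputation vs. storage).

```python
from functools import lru_cache


def make_input(width, height):
    """Hypothetical test image, exposed as a function over all integer (x, y)."""
    def image(x, y):
        # Clamp coordinates so the function is defined on the whole integer grid.
        cx = min(max(x, 0), width - 1)
        cy = min(max(y, 0), height - 1)
        return (7 * cx + 13 * cy) % 32
    return image


def counted(f):
    """Wrap f and count how many times it is evaluated."""
    calls = [0]

    def wrapper(x, y):
        calls[0] += 1
        return f(x, y)
    return wrapper, calls


# --- Algorithm: what is computed, with no storage or ordering decisions ---

def blur_x(f):
    """Horizontal 3-tap sum of f."""
    return lambda x, y: f(x - 1, y) + f(x, y) + f(x + 1, y)


def blur_y(f):
    """Vertical 3-tap sum of f."""
    return lambda x, y: f(x, y - 1) + f(x, y) + f(x, y + 1)


# --- Schedules: how the intermediate stage is evaluated ---

def schedule_recompute(f):
    """Recompute blur_x at every use; no intermediate storage."""
    return blur_y(blur_x(f))


def schedule_store(f):
    """Store each blur_x value the first time it is computed, then reuse it."""
    stored = lru_cache(maxsize=None)(blur_x(f))
    return blur_y(stored)


def realize(pipeline, width, height):
    """Evaluate the pipeline over a finite output region, row by row."""
    return [[pipeline(x, y) for x in range(width)] for y in range(height)]


if __name__ == "__main__":
    for schedule in (schedule_recompute, schedule_store):
        source, calls = counted(make_input(4, 4))
        result = realize(schedule(source), 4, 4)
        print(schedule.__name__, "input reads:", calls[0])
```

Both schedules produce the same output because the algorithm functions are unchanged. Only the number of input reads differs: the recomputing schedule reads the input nine times per output pixel, while the storing schedule trades memory for fewer reads. Halide itself expresses such choices through scheduling directives on its functions, which this sketch does not attempt to reproduce.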