Efficient and Flexible Characterization of Data Locality through Native Execution Sampling

Data locality is central to modern computer designs. The widening gap between processor speed and memory latency has introduced the need for a deep hierarchy of caches. Thus, the performance of an application is to a large extent dependent on the amount of data locality the caches can exploit. Some...

Bibliographic Details
Main Author: Berg, Erik
Format: Doctoral Thesis
Language: English
Published: Uppsala universitet, Avdelningen för datorteknik 2005
Subjects: Computer science (Datavetenskap)
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-6012
http://nbn-resolving.de/urn:isbn:91-554-6363-0
id ndltd-UPSALLA1-oai-DiVA.org-uu-6012
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-uu-6012
2013-01-08T13:07:08Z
Efficient and Flexible Characterization of Data Locality through Native Execution Sampling
eng
Berg, Erik
Uppsala universitet, Avdelningen för datorteknik
Uppsala universitet, Datorteknik
Uppsala : Acta Universitatis Upsaliensis
2005
Computer science
Datavetenskap
Doctoral thesis, comprehensive summary
info:eu-repo/semantics/doctoralThesis
text
http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-6012
urn:isbn:91-554-6363-0
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, 1651-6214 ; 101
application/pdf
info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Doctoral Thesis
sources NDLTD
topic Computer science
Datavetenskap
description Data locality is central to modern computer designs. The widening gap between processor speed and memory latency has introduced the need for a deep hierarchy of caches. Thus, the performance of an application depends to a large extent on the amount of data locality the caches can exploit. Some data locality comes naturally from the way most programs are written and the way their data is allocated in memory. Compilers further try to create data locality through loop transformations and optimized data layout. Different ways of writing a program and/or laying out its data may improve an application’s locality even more. However, it is far from obvious how such a locality optimization can be achieved, especially since the optimizing compiler may have left the optimization job half done. Thus, efficient tools are needed to guide software developers on their quest for data locality. The main contribution of this dissertation is a novel sample-based method for analyzing the data locality of an application. Very sparse data is collected during a single execution of the studied application. The sparse sampling adds minimal overhead to the execution time, which enables complex applications running realistic data sets to be studied. The architecture-independent information collected during the execution is fed to a mathematical cache model for predicting the cache miss ratio. The sparsely collected data can be used to characterize the application’s data locality with respect to almost any possible cache hierarchy, such as complicated multiprocessor memory systems with multilevel cache hierarchies. Any combination of cache size, cache line size and degree of sharing can be modeled. Each new modeled design point takes only a fraction of a second to evaluate, even though the application from which the sampled data was collected may have executed for hours. This makes the tool usable not only by software developers but also by hardware developers who need to evaluate a huge memory-system design space. We also discuss different ways of presenting data-locality information to a programmer in an intuitive and easily interpreted way. Some of the locality metrics we introduce utilize the flexibility of our algorithm and its ability to vary different cache parameters for one run. The dissertation also presents several prototype implementations of tools for profiling the memory system.
author Berg, Erik
title Efficient and Flexible Characterization of Data Locality through Native Execution Sampling
publisher Uppsala universitet, Avdelningen för datorteknik
publishDate 2005
url http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-6012
http://nbn-resolving.de/urn:isbn:91-554-6363-0