Directoryless shared memory architecture using thread migration and remote access

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014. Cataloged from PDF version of thesis. Includes bibliographical references (pages 101-106). Chip multiprocessors (CMPs) have become mainstream in recent years, and, for...


Bibliographic Details
Main Author: Shim, Keun Sup
Other Authors: Srinivas Devadas.
Format: Others
Language: English
Published: Massachusetts Institute of Technology 2014
Subjects: Electrical Engineering and Computer Science.
Online Access: http://hdl.handle.net/1721.1/90001
id ndltd-MIT-oai-dspace.mit.edu-1721.1-90001
record_format oai_dc
spelling ndltd-MIT-oai-dspace.mit.edu-1721.1-90001 2019-05-02T15:57:22Z Directoryless shared memory architecture using thread migration and remote access Shim, Keun Sup Srinivas Devadas. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. by Keun Sup Shim. Ph. D. 2014-09-19T21:33:39Z 2014-09-19T21:33:39Z 2014 2014 Thesis http://hdl.handle.net/1721.1/90001 890132747 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 110 pages application/pdf Massachusetts Institute of Technology
collection NDLTD
language English
format Others
sources NDLTD
topic Electrical Engineering and Computer Science.
description Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014. Cataloged from PDF version of thesis. Includes bibliographical references (pages 101-106). Chip multiprocessors (CMPs) have become mainstream in recent years, and, for scalability reasons, high-core-count designs tend towards tiled CMPs with physically distributed caches. In order to support shared memory, current many-core CMPs maintain cache coherence using distributed directory protocols, which are extremely difficult and error-prone to implement and verify. Private caches with directory-based coherence also provide suboptimal performance when a thread accesses large amounts of data distributed across the chip: the data must be brought to the core where the thread is running, incurring delays and energy costs. Under this scenario, migrating a thread to data instead of the other way around can improve performance. In this thesis, we propose a directoryless approach where data can be accessed either via a round-trip remote access protocol or by migrating a thread to where data resides. While our hardware mechanism for fine-grained thread migration enables faster migration than previous proposals, its costs still make it crucial to use thread migrations judiciously for the performance of our proposed architecture. We, therefore, present an on-line algorithm which decides at the instruction level whether to perform a remote access or a thread migration. In addition, to further reduce migration costs, we extend our scheme to support partial context migration by predicting the necessary thread context. Finally, we provide the ASIC implementation details as well as RTL simulation results of the Execution Migration Machine (EM²), a 110-core directoryless shared-memory processor. by Keun Sup Shim. Ph. D.
author2 Srinivas Devadas.
author Shim, Keun Sup
title Directoryless shared memory architecture using thread migration and remote access
publisher Massachusetts Institute of Technology
publishDate 2014
url http://hdl.handle.net/1721.1/90001