LEADER |
02187 am a22002533u 4500 |
001 |
95648 |
042 |
|
|
|a dc
|
100 |
1 |
0 |
|a Beckmann, Nathan Zachary
|e author
|
100 |
1 |
0 |
|a Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
|e contributor
|
100 |
1 |
0 |
|a Beckmann, Nathan Zachary
|e contributor
|
100 |
1 |
0 |
|a Tsai, Po-An
|e contributor
|
100 |
1 |
0 |
|a Sanchez, Daniel
|e contributor
|
700 |
1 |
0 |
|a Tsai, Po-An
|e author
|
700 |
1 |
0 |
|a Sanchez, Daniel
|e author
|
245 |
0 |
0 |
|a Scaling Distributed Cache Hierarchies through Computation and Data Co-Scheduling
|
260 |
|
|
|b Institute of Electrical and Electronics Engineers (IEEE),
|c 2015-02-26T13:37:58Z.
|
856 |
|
|
|z Get fulltext
|u http://hdl.handle.net/1721.1/95648
|
520 |
|
|
|a Cache hierarchies are increasingly non-uniform, so for systems to scale efficiently, data must be close to the threads that use it. Moreover, cache capacity is limited and contended among threads, introducing complex capacity/latency tradeoffs. Prior NUCA schemes have focused on managing data to reduce access latency, but have ignored thread placement; and applying prior NUMA thread placement schemes to NUCA is inefficient, as capacity, not bandwidth, is the main constraint. We present CDCS, a technique to jointly place threads and data in multicores with distributed shared caches. We develop novel monitoring hardware that enables fine-grained space allocation on large caches, and data movement support to allow frequent full-chip reconfigurations. On a 64-core system, CDCS outperforms an S-NUCA LLC by 46% on average (up to 76%) in weighted speedup and saves 36% of system energy. CDCS also outperforms state-of-the-art NUCA schemes under different thread scheduling policies.
|
520 |
|
|
|a National Science Foundation (U.S.) (Grant CCF-1318384)
|
520 |
|
|
|a Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Jacobs Presidential Fellowship)
|
520 |
|
|
|a United States. Defense Advanced Research Projects Agency (PERFECT Contract HR0011-13-2-0005)
|
546 |
|
|
|a en_US
|
655 |
7 |
|
|a Article
|
773 |
|
|
|t Proceedings of the 21st IEEE Symposium on High Performance Computer Architecture
|