Fast and Scalable Simulation Framework for Large In-Order Chip Multiprocessors
As chip technology advances, the number of cores in mainstream chip multiprocessors (CMP) increases, so chips with hundreds of cores may become common within a decade. One of the challenges this trend poses to computer architects is making current CMP designs scalable to larger numbers of cores...
Main Author: | Yuri Nedbailo |
---|---|
Format: | Article |
Language: | English |
Published: | FRUCT, 2020-04-01 |
Series: | Proceedings of the XXth Conference of Open Innovations Association FRUCT |
Subjects: | chip multi-processors; trace-based simulation; kilo-core |
Online Access: | https://www.fruct.org/publications/fruct26/files/Ned.pdf |
id | doaj-7482325596c74ed796080abb9738a22b |
---|---|
record_format | Article |
spelling | doaj-7482325596c74ed796080abb9738a22b; 2020-11-25T03:38:31Z; eng; FRUCT; Proceedings of the XXth Conference of Open Innovations Association FRUCT; 2305-7254; 2343-0737; 2020-04-01; 26; 1; 335; 345; 10.23919/FRUCT48808.2020.9087481; Fast and Scalable Simulation Framework for Large In-Order Chip Multiprocessors; Yuri Nedbailo (MCST, Russia); As chip technology advances, the number of cores in mainstream chip multiprocessors (CMP) increases, so chips with hundreds of cores may become common within a decade. One of the challenges this trend poses to computer architects is making current CMP designs scalable to larger numbers of cores. A tool set that allows us to predict how various design decisions may affect the performance of larger CMPs is therefore necessary. In this paper, we present a trace-based simulation framework we devised for the Elbrus microprocessor family. Its core component, the CMP simulator, is scalable to at least one thousand cores and allows evaluating kilo-core CMP performance in just a few days using a mainstream 16-core host computer. It is also highly flexible and architecture-agnostic and, therefore, could be used to simulate other in-order architectures. We validated the framework against a real machine and achieved an average accuracy of 18 percent in single-core tests and 15 percent in four-core tests, an average error in relative slowdown evaluation of 2.6 percent, and average absolute errors in L2 and L3 cache miss rates within 0.3 bytes per cycle.; https://www.fruct.org/publications/fruct26/files/Ned.pdf; chip multi-processors; trace-based simulation; kilo-core |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Yuri Nedbailo |
spellingShingle | Yuri Nedbailo Fast and Scalable Simulation Framework for Large In-Order Chip Multiprocessors Proceedings of the XXth Conference of Open Innovations Association FRUCT chip multi-processors trace-based simulation kilo-core |
author_facet | Yuri Nedbailo |
author_sort | Yuri Nedbailo |
title | Fast and Scalable Simulation Framework for Large In-Order Chip Multiprocessors |
title_short | Fast and Scalable Simulation Framework for Large In-Order Chip Multiprocessors |
title_full | Fast and Scalable Simulation Framework for Large In-Order Chip Multiprocessors |
title_fullStr | Fast and Scalable Simulation Framework for Large In-Order Chip Multiprocessors |
title_full_unstemmed | Fast and Scalable Simulation Framework for Large In-Order Chip Multiprocessors |
title_sort | fast and scalable simulation framework for large in-order chip multiprocessors |
publisher | FRUCT |
series | Proceedings of the XXth Conference of Open Innovations Association FRUCT |
issn | 2305-7254 2343-0737 |
publishDate | 2020-04-01 |
description | As chip technology advances, the number of cores in mainstream chip multiprocessors (CMP) increases, so chips with hundreds of cores may become common within a decade. One of the challenges this trend poses to computer architects is making current CMP designs scalable to larger numbers of cores. A tool set that allows us to predict how various design decisions may affect the performance of larger CMPs is therefore necessary. In this paper, we present a trace-based simulation framework we devised for the Elbrus microprocessor family. Its core component, the CMP simulator, is scalable to at least one thousand cores and allows evaluating kilo-core CMP performance in just a few days using a mainstream 16-core host computer. It is also highly flexible and architecture-agnostic and, therefore, could be used to simulate other in-order architectures. We validated the framework against a real machine and achieved an average accuracy of 18 percent in single-core tests and 15 percent in four-core tests, an average error in relative slowdown evaluation of 2.6 percent, and average absolute errors in L2 and L3 cache miss rates within 0.3 bytes per cycle. |
topic | chip multi-processors; trace-based simulation; kilo-core |
url | https://www.fruct.org/publications/fruct26/files/Ned.pdf |
work_keys_str_mv | AT yurinedbailo fastandscalablesimulationframeworkforlargeinorderchipmultiprocessors |
_version_ | 1724541961680977920 |