Thread-Aware Mechanism to Enhance Inter-Node Load Balancing for Multithreaded Applications on NUMA Systems
NUMA multi-core systems divide system resources into several nodes. When an imbalance in the load between cores occurs, the kernel scheduler’s load balancing mechanism then migrates threads between cores or across NUMA nodes. Remote memory access is required for a thread to access memory on the prev...
Main Authors: | Mei-Ling Chiang, Wei-Lun Su |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-07-01 |
Series: | Applied Sciences |
Subjects: | NUMA; Linux kernel; multithreaded; load balancing; remote memory access |
Online Access: | https://www.mdpi.com/2076-3417/11/14/6486 |
id |
doaj-b00008107ac4495c9143083a04e2dfcc |
---|---|
record_format |
Article |
spelling |
doaj-b00008107ac4495c9143083a04e2dfcc; 2021-07-23T13:29:46Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2021-07-01; 11; 6486; 6486; 10.3390/app11146486; Thread-Aware Mechanism to Enhance Inter-Node Load Balancing for Multithreaded Applications on NUMA Systems; Mei-Ling Chiang (0); Wei-Lun Su (1); Department of Information Management, National Chi Nan University, Puli 54516, Taiwan; Department of Information Management, National Chi Nan University, Puli 54516, Taiwan; NUMA multi-core systems divide system resources into several nodes. When an imbalance in the load between cores occurs, the kernel scheduler’s load balancing mechanism then migrates threads between cores or across NUMA nodes. Remote memory access is required for a thread to access memory on the previous node, which degrades performance. Threads to be migrated must be selected effectively and efficiently since the related operations run in the critical path of the kernel scheduler. This study focuses on improving inter-node load balancing for multithreaded applications. We propose a thread-aware selection policy that considers the distribution of threads on nodes for each thread group while migrating one thread for inter-node load balancing. The thread is selected for which its thread group has the least exclusive thread distribution, and thread members are distributed more evenly on nodes. This has less influence on data mapping and thread mapping for the thread group. We further devise several enhancements to eliminate superfluous evaluations for multithreaded processes, so the selection procedure is more efficient. The experimental results for the commonly used PARSEC 3.0 benchmark suite show that the modified Linux kernel with the proposed selection policy increases performance by 10.7% compared with the unmodified Linux kernel.; https://www.mdpi.com/2076-3417/11/14/6486; NUMA; Linux kernel; multithreaded; load balancing; remote memory access |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Mei-Ling Chiang Wei-Lun Su |
author_sort |
Mei-Ling Chiang |
title |
Thread-Aware Mechanism to Enhance Inter-Node Load Balancing for Multithreaded Applications on NUMA Systems |
publisher |
MDPI AG |
series |
Applied Sciences |
issn |
2076-3417 |
publishDate |
2021-07-01 |
description |
NUMA multi-core systems divide system resources into several nodes. When an imbalance in the load between cores occurs, the kernel scheduler’s load balancing mechanism then migrates threads between cores or across NUMA nodes. Remote memory access is required for a thread to access memory on the previous node, which degrades performance. Threads to be migrated must be selected effectively and efficiently since the related operations run in the critical path of the kernel scheduler. This study focuses on improving inter-node load balancing for multithreaded applications. We propose a thread-aware selection policy that considers the distribution of threads on nodes for each thread group while migrating one thread for inter-node load balancing. The thread is selected for which its thread group has the least exclusive thread distribution, and thread members are distributed more evenly on nodes. This has less influence on data mapping and thread mapping for the thread group. We further devise several enhancements to eliminate superfluous evaluations for multithreaded processes, so the selection procedure is more efficient. The experimental results for the commonly used PARSEC 3.0 benchmark suite show that the modified Linux kernel with the proposed selection policy increases performance by 10.7% compared with the unmodified Linux kernel. |
topic |
NUMA Linux kernel multithreaded load balancing remote memory access |
url |
https://www.mdpi.com/2076-3417/11/14/6486 |
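The selection policy summarized in the description can be illustrated with a small sketch. This is not the authors' kernel implementation: the record does not define how "exclusive" a thread distribution is, so the `exclusivity` metric below (the largest share of a group's threads found on a single node) is an assumption made for illustration, and the names `Thread`, `exclusivity`, and `pick_thread_to_migrate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    tid: int
    group: str  # thread group (process) the thread belongs to


def exclusivity(node_counts):
    """Hypothetical metric: largest share of a group's threads on one node.

    1.0 means every thread of the group sits on a single node;
    1/len(node_counts) means the group is spread perfectly evenly.
    """
    total = sum(node_counts)
    return max(node_counts) / total if total else 0.0


def pick_thread_to_migrate(candidates, distribution):
    """Return the candidate whose thread group is least exclusive.

    candidates   -- migratable threads on the overloaded node
    distribution -- maps a group name to its per-node thread counts
    Ties go to the first candidate in list order.
    """
    return min(candidates, key=lambda t: exclusivity(distribution[t.group]))


# Toy two-node system: group "a" is concentrated on node 0,
# while group "b" is already spread across both nodes.
distribution = {"a": [4, 0], "b": [2, 2]}
candidates = [Thread(1, "a"), Thread(2, "b")]
chosen = pick_thread_to_migrate(candidates, distribution)
```

In this toy case the thread from group "b" is chosen, since moving a thread from a group that is already spread across nodes should disturb that group's thread and data mapping less than breaking up a group that sits entirely on one node.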