Optimizing Concurrency in Heterogeneous Data-Parallel Applications – An Automated System Approach
Keywords: GPU, CUDA, OpenCL, High Performance Computing, Parallel Computing, Concurrency
A key limitation of existing high-performance computing (HPC) frameworks lies in their suboptimal utilization of the concurrency inherent in data-parallel applications. While these frameworks provide high-level abstractions, their scheduling decisions often fail to fully exploit the potential for concurrency on heterogeneous CPU and GPU architectures. To address this limitation, this article proposes a novel framework that follows a design philosophy similar to that of other HPC frameworks but places a distinct emphasis on exploring fine-grained, concurrency-aware scheduling decisions. The primary objective is to harness the full computational power of heterogeneous CPU and GPU architectures.