Characterizing communication and page usage of parallel applications for thread and data mapping

Abstract: Parallelism in shared-memory systems has increased significantly with the advent and evolution of multicore processors. Current systems include several multicore and multithreaded processors with Non-Uniform Memory Access (NUMA) characteristics. These architectures require two strategies for the efficient execution of parallel applications: (i) threads that share data should be placed in the memory hierarchy so that they execute on cores with shared caches; and (ii) the data that a thread accesses should be placed on the NUMA node where that thread is executing. We refer to these techniques as thread mapping and data mapping, respectively. Both strategies require knowledge of the application’s memory access behavior in order to identify the communication between threads and processes as well as their usage of memory pages. In this paper, we introduce a profiling method that establishes the suitability of parallel applications for improved mappings that take the memory hierarchy into account, based on a mathematical description of their memory access behavior. Experiments with a large set of parallel workloads based on a variety of parallel APIs (MPI, OpenMP, Pthreads, and MPI+OpenMP) show that most applications can benefit from improved mappings. We also provide a mechanism to compute optimized thread and data mappings. Experimental results with this mechanism show performance improvements of up to 54% (20% on average) and reductions in energy consumption of up to 37% (11% on average), compared to the default mapping by the operating system. Furthermore, our results show that thread and data mapping have to be performed jointly in order to achieve optimal improvements.
Document type:
Journal article
Performance Evaluation, Elsevier, 2015, 88-89, pp.18-36. 〈10.1016/j.peva.2015.03.001〉
Contributor: Fabrice Dupros
Submitted on: Wednesday, April 29, 2015 - 11:40:55
Last modified on: Friday, June 12, 2015 - 10:58:31




Matthias Diener, Eduardo Cruz, Laércio L. Pilla, Fabrice Dupros, Philippe Olivier Alexandre Navaux. Characterizing communication and page usage of parallel applications for thread and data mapping. Performance Evaluation, Elsevier, 2015, 88-89, pp.18-36. 〈10.1016/j.peva.2015.03.001〉. 〈hal-01146859〉


