
dc.contributor.author: Feilbach, Chris
dc.contributor.author: Sperling, Adam
dc.contributor.author: Sifakis, Eftychios
dc.contributor.author: Hill, Mark D.
dc.date.accessioned: 2016-05-20T20:55:37Z
dc.date.available: 2016-05-20T20:55:37Z
dc.date.issued: 2016-05-20
dc.identifier.citation: TR1834 [en]
dc.identifier.uri: http://digital.library.wisc.edu/1793/74898
dc.description.abstract: Scientific computing workloads are well suited to parallel accelerators such as GPGPUs and the Intel Xeon Phi. While these accelerators can provide greater performance than traditional CPUs due to their parallel architectures and greater memory bandwidth, their maximum workload size is limited by relatively small memory capacity. To solve this problem, data can be split across multiple accelerators to utilize the combined memory capacity as well as increased compute capability. Combining multiple accelerators into heterogeneous systems introduces a new bottleneck. Communication bandwidth between accelerators over the PCIe interconnect is much slower than internal memory bandwidth. This project examines the inter-node bandwidth bottleneck using the Intel Xeon Phi in the context of scientific applications. We show the limitations of traditional MPI programming paradigms, and leverage Intel's Xeon Phi-specific SCIF communication API to achieve increased inter-node memory bandwidth. While small messages still incur communication overhead penalties, messages larger than 512KB are able to saturate the PCIe bus and achieve bandwidth utilization close to 90% of the theoretical maximum. This project also attempts to address the complexities of programming systems of multiple accelerators. We introduce a software interface wrapper over SCIF that coalesces groups of small messages into larger ones. This new interface eases the programming experience and provides greater interconnect bandwidth from coalescing. [en]
dc.subject: Bandwidth [en]
dc.subject: Accelerators [en]
dc.subject: Scientific Workloads [en]
dc.subject: PCIE [en]
dc.subject: SCIF [en]
dc.subject: Xeon Phi [en]
dc.title: Programming Heterogeneous Computers and Improving Inter-Node Communication Across Xeon Phis [en]
dc.type: Technical Report [en]


This item appears in the following Collection(s)

  • CS Technical Reports
    Technical Reports Archive for the Department of Computer Sciences at the University of Wisconsin-Madison
