Fusing GPU kernels within a novel single-source C++ API

Ralph Potter, Paul Keir, Jan Lucas, Mauricio Alvarez-Mesa, Ben Juurlink, Andrew Richards

Research output: Contribution to conference (Presentation)


Research papers often present GPU kernel fusion in the form of a standalone command-line tool. With such a tool, a user isolates, or annotates, an ordered set of kernels. Given these OpenCL C kernels as input, the tool outputs a single kernel which performs an equivalent calculation while minimizing costly intermediate load and store operations at runtime. This mode of operation is, however, a departure from the usual workflow of many developers, and is mainly of academic interest.

Automatic compiler-based kernel fusion could substantially improve the end user's development experience. The OpenCL host API, however, provides no means of indicating opportunities for kernel fusion to the compiler. Ongoing and rapidly maturing compiler and API research by Codeplay aims to provide a higher-level, single-source, industry-focused C++ interface to OpenCL. Opportunities for kernel fusion have now also been investigated in this setting, using C++11 features including lambda functions, variadic templates, and lazy evaluation via std::bind expressions.
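As a rough illustration of the idea, the sketch below fuses a chain of per-pixel stages into one callable using lambdas and a variadic template, so that a single pass over the data replaces several passes with intermediate buffers. It is plain host-side C++ and not the Codeplay API: the names `fuse`, `run_kernel`, and `check_fusion` are hypothetical, the "kernel" is an ordinary CPU loop, and C++14 generic lambdas are used for brevity where the presentation mentions C++11 and std::bind.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Base case: a single stage needs no composition.
template <typename F>
auto fuse(F f) {
    return f;
}

// Recursive case: apply the first stage, then the fusion of the remaining stages.
template <typename F, typename... Rest>
auto fuse(F f, Rest... rest) {
    return [f, tail = fuse(rest...)](auto x) { return tail(f(x)); };
}

// Stand-in for launching a kernel: one read and one write per element.
template <typename Kernel>
std::vector<float> run_kernel(const std::vector<float>& in, Kernel k) {
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = k(in[i]);
    }
    return out;
}

// Compares three separate launches against one fused launch.
inline void check_fusion() {
    auto brighten = [](float p) { return p + 0.25f; };
    auto darken   = [](float p) { return p * 0.5f; };
    auto clamp01  = [](float p) { return p < 0.0f ? 0.0f : (p > 1.0f ? 1.0f : p); };

    const std::vector<float> image = {0.0f, 0.5f, 1.0f, 2.0f};

    // Unfused: two intermediate images are written and read back.
    auto unfused = run_kernel(run_kernel(run_kernel(image, brighten), darken), clamp01);

    // Fused: one pass, intermediates stay in registers.
    auto fused = run_kernel(image, fuse(brighten, darken, clamp01));

    assert(unfused == fused);
}
```

Calling `check_fusion()` confirms that the fused launch produces the same image as the three separate launches. In this sketch the composition is resolved entirely at compile time, which is one reason a single-source C++ interface is a natural place to express fusion.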

While pixel-to-pixel transformations are interesting in this context, insofar as they demonstrate the expressivity of this new single-source C++ API, we also consider fusing transformations which use synchronization within workgroups. Hence convolutions that use halos, and the use of the GPU's local shared memory, are also explored.
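To make the role of halos concrete, the following sketch emulates on the CPU how a per-pixel stage might be fused with a one-dimensional three-point convolution. Each "workgroup" loads its tile plus a one-element halo into a small local buffer, applying the per-pixel stage during the load, and then convolves from that buffer. Everything here is an assumption made for illustration: the tile size, the clamp-to-edge boundary rule, the unweighted three-point sum, and all function names are hypothetical, and the comment marks where a real kernel would place a workgroup barrier.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::ptrdiff_t kTile = 4;  // work-items per workgroup (assumed)
constexpr std::ptrdiff_t kHalo = 1;  // stencil radius

// Clamp-to-edge boundary handling (an assumption of this sketch).
inline std::ptrdiff_t clamp_index(std::ptrdiff_t i, std::ptrdiff_t n) {
    return i < 0 ? 0 : (i >= n ? n - 1 : i);
}

// Reference: two passes with a full intermediate buffer.
template <typename Point>
std::vector<float> unfused_point_then_sum(const std::vector<float>& in, Point point) {
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(in.size());
    std::vector<float> tmp(in.size()), out(in.size());
    for (std::ptrdiff_t i = 0; i < n; ++i) tmp[i] = point(in[i]);
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        out[i] = tmp[clamp_index(i - 1, n)] + tmp[i] + tmp[clamp_index(i + 1, n)];
    }
    return out;
}

// Fused: the per-pixel stage runs while filling the tile-plus-halo buffer.
template <typename Point>
std::vector<float> fused_point_then_sum(const std::vector<float>& in, Point point) {
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(in.size());
    std::vector<float> out(in.size());
    for (std::ptrdiff_t base = 0; base < n; base += kTile) {  // one workgroup per tile
        float local[kTile + 2 * kHalo];                       // emulated local memory
        for (std::ptrdiff_t j = 0; j < kTile + 2 * kHalo; ++j) {
            local[j] = point(in[clamp_index(base + j - kHalo, n)]);
        }
        // A GPU kernel would need a workgroup barrier here before reading `local`.
        for (std::ptrdiff_t t = 0; t < kTile && base + t < n; ++t) {
            out[base + t] = local[t] + local[t + 1] + local[t + 2];
        }
    }
    return out;
}

// Checks the fused version against the two-pass reference.
inline void check_halo_fusion() {
    const std::vector<float> in = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f};
    auto twice = [](float p) { return p * 2.0f; };
    assert(fused_point_then_sum(in, twice) == unfused_point_then_sum(in, twice));
}
```

One trade-off is visible even in this small sketch: halo elements are processed by the per-pixel stage once for each tile that needs them, so fusion exchanges some redundant computation for the removal of the intermediate buffer.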

A perennial problem has therefore been restructured to accommodate a modern C++-based expression of kernel fusion. Kernel fusion thus becomes an integrated component of an extended C++ compiler and API.
Original language: English
Publication status: Published - 18 Nov 2013
Externally published: Yes
Event: Intel Compiler, Architecture and Tools Conference 2013 - Intel Office, Haifa, Israel
Duration: 18 Nov 2013 to 19 Nov 2013


Conference: Intel Compiler, Architecture and Tools Conference 2013
Abbreviated title: CATC 2013