Processing large data sets calls for parallel computing, which can run on commodity processors ranging from low-end multi-core CPUs to high-end many-core GPUs. OpenCL can program both CPUs and GPUs, as well as future parallel processors. This article shows which workloads each kind of processor is good at.
The article first discusses why OpenCL is a better programming paradigm for parallel computing on heterogeneous processors than the alternatives (OpenMP, Pthreads, CUDA, and ATI Stream). It then compares the major hardware differences between CPUs and GPUs and the workloads each handles well. Finally, it gives a brief example of how to parallelize an originally single-threaded algorithm for the CPU and GPU using OpenCL.
This article is based on coding experience with Lili Zhou, a medical physics resident at NY Presbyterian Hospital / Weill Cornell Medical Center. More discussion is expected as we migrate more single-threaded programs to multi-core CPUs and GPGPUs.
This article is basically a rerun of my previous post.