Home   >   CSC-OpenAccess Library   >    Manuscript Information
Full Text Available

This is an Open Access publication published under CSC-OpenAccess Policy.
An OpenCL Method of Parallel Sorting Algorithms for GPU Architecture
Krishnahari Thouti, S. R. Sathe
Pages - 1 - 8     |    Revised - 15-07-2012     |    Published - 10-08-2012
Volume - 3   Issue - 1    |    Publication Date - October 2012  Table of Contents
GPU, GPGPU, Parallel Computing, Parallel Sorting Algorithms
In this paper, we present a comparative performance analysis of different parallel sorting algorithms: Bitonic sort and Parallel Radix Sort. In order to study the interaction between the algorithms and architecture, we implemented both the algorithms in OpenCL and compared its performance with Quick Sort algorithm, the fastest algorithm. In our simulation, we have used Intel Core2Duo CPU 2.67GHz and NVidia Quadro FX 3800 as graphical processing unit.
CITED BY (1)  
1 Leitao, Á., & Oosterlee, C. W. (2015). GPU Acceleration of the Stochastic Grid Bundling Method for Early-Exercise options. International Journal of Computer Mathematics, (just-accepted), 1-25.
1 Google Scholar
2 CiteSeerX
3 refSeek
4 Scribd
5 SlideShare
6 PdfSR
1 General Purpose Computations Using Graphics Hardware, http://www.gpgpu.org/
2 K. E. Batcher. “Sorting networks and their applications”. in AFIPS Spring Joint Computer Conference, Arlington, VA, Apr. 1968, pages 307–314.
3 D.E. Knuth. The Art of Computer Programming. Vol. 3: Sorting and Searching (second edition). Menlo Park: Addison-Wesley, 1981.
4 M. Ajtai, J. Komlos, Szemeredi. “Sorting in parallel steps”. Combinatorica 3. 983, pp. 1 -19.
5 S. G. Akl. “Parallel Sorting Algorithms”, Academic Press, 1985.
6 J. H. Reif, L. G. Valiant. “A Logarithmic Time Sort for Linear Size Networks”. Journals of the ACM, 34(1): 60 – 76, 1987.
7 G.E. Blelloch,” Vector Models for Data-Parallel Computing”. The MIT Press, 1990.
8 G.E. Blelloch, C.E. Leiserson, B.M. Maggs, C.G. Plaxton, S.J. Smith, M. Zagha. “A Comparison of Sorting Algorithms for the Connection Machine CM-2”. in Annual ACM Symp. Paral. Algo: Arc. 1991, Pages 3 -16.
9 F. T. Leighton, “Introduction to Parallel Algorithms and Architectures: Arrays, Trees and Hypercubes”. Morgan Kaufmann, 1992.
10 J.H. Reif. ”Synthesis of Parallel Algorithms”. Morgan Kaufmann, San Mateo, CA, 1993.
11 H. Li, K.C. Sevcik. “Parallel Sorting by Over-partitioning”. in Annual ACM Symp. Paral. Algor.Arch. 1994, pages 46 – 56.
12 A. Tridgell, R. P. Brent. “A general-purpose parallel sorting algorithm” in International J. of High Speed Computing 7 (1995), pp. 285-301.
13 N. Amato, R. Iyer, S. Sundaresan, Y. Wu. “A Comparison of Parallel Sorting Algorithms on Different Architectures” Texas A & M University, College Station, TX, 1998.
14 T.H. Cormen, C.E. Leiserson, R.L. Rivest, C. Stein. Introduction to Algorithms. 2nd edition, The MIT Press. 2001.
15 T. J. Purcell, C. Donner, M. Cammarano, H. Jensen, P. Hanrahan “Photon mapping on programmable graphics hardware”, in Annual ACM SIGGRAPH / Eurographics conference on Graphics Hardware, 2003, pp. 41 – 50.
16 J. D. Owens, D. Luebke, N. Govindaraju, M. Harris, J. Krüger, A. E. Lefohn, T. J. Purcell. “A Survey of General-Purpose Computation on Graphics Hardware.” in Eurographics 2005, State of the Art Reports, August 2005, pp. 21-51.
17 A. Greb, G. Zachmann. “GPU-AbiSort: Optimal Parallel Sorting on Stream Architectures” in IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing. 2006.
18 NVidia CUDA GPGPU Framework. http://www.nvidia.com/
19 S. Sengupta, M. Harris, Y. Zhang, J. D. Owens. “Scan primitives for GPU computing,” in Graphics Hardware 2007, Aug. 2007, pp. 97–106.
20 D. Cedermann, P. Tsigas. “A practical quicksort algorithm for graphic processors”, Tech. Rep, Chalmers University of Technology and Goteberg University, 2008.
21 N. Satish, M. Harris, M. Garland. “Designing efficient sorting algorithms for manycore GPUs”. In Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing. May 23-29, 2009, pp.1-10.
22 OpenCL Specification, http://www.khronos.org/opencl/
23 F. Gul, O. Usman Khan, B. Montrucchio, P. Giaccone. “Analysis of Fast Parallel Sorting Algorithms for GPU Architectures”. in Proceeding FIT '11 Proceedings of the 2011 Frontiers of Information Technology Pages 173-178.
24 P. Helluy. “A portable implementation of the radix sort algorithm in OpenCL”. http://code.google.com/p/ocl-radix-sort/ May 2011
25 B. Gaster, L. Howes, D.R. Kaeli, P. Mistry, D. Schaa. Heterogeneous Computing with OpenCL. Morgan Kaufmann. 2011.
26 AMD Accelerated Parallel Processing OpenCL Programming Guide, Advanced Micro Devices, Inc. 2012. http://developer.amd.com/appsdk
27 M. Scarpino. OpenCL in Action. Manning Publications, 2011.
Dr. Krishnahari Thouti
- India
Dr. S. R. Sathe
- India