This sorting library gives programmers an easy way to increase sorting performance by harnessing the highly parallel computational power available in today's graphics cards.

Tests show that it can outperform a highly optimized CPU-based Quicksort by a factor of 10 on cards commonly available in 2007. Sorting 16 million floating-point numbers or integers takes less than half a second!


This graph shows the performance when sorting an array of uniformly distributed floating-point numbers on an 8800GTX graphics card. The comparison is between the GPU Quicksort Library, Radix sort, Radix/Merge sort, GPUSort (bitonic sort), and the Introsort algorithm (a combination of Quicksort and Heapsort) from the standard C++ library, run on an Opteron 265 at 1.8GHz. More results are available in the technical report [1].


The source code of the GPU Quicksort Library is released under the Creative Commons License. It is bundled with a testbench for evaluation. It contains a Makefile for Linux and Visual Studio project files for Windows.