## Quickselect vs Countingselect


### Quickselect vs Countingselect

Quickselect is based on quicksort, and counting select is based on counting sort.

Each is capable of finding the kth smallest element in an unsorted array or list.

Here is some example code for a quickselect algorithm. It doesn't include the partition function.

```c
// return the kth smallest item (k is an index into the whole array;
// first and last are inclusive bounds of the current subrange)
int quickSelect(int items[], int first, int last, int k) {
    int pivot = partition(items, first, last);
    if (k < pivot) {
        return quickSelect(items, first, pivot - 1, k);
    } else if (k > pivot) {
        return quickSelect(items, pivot + 1, last, k);
    } else {
        return items[k];
    }
}
```
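
Since the partition function isn't shown, here is a minimal sketch of what it could look like, assuming a Lomuto-style scheme that uses the last element as the pivot and treats `first`/`last` as inclusive bounds:

```c
// Lomuto partition: uses items[last] as the pivot, rearranges
// items[first..last] so everything left of the returned index is
// <= the pivot, and returns the pivot's final position.
int partition(int items[], int first, int last) {
    int pivot = items[last];
    int store = first;
    for (int i = first; i < last; i++) {
        if (items[i] <= pivot) {
            int tmp = items[i];    // swap items[i] into the "small" region
            items[i] = items[store];
            items[store] = tmp;
            store++;
        }
    }
    int tmp = items[store];        // move the pivot into its final slot
    items[store] = items[last];
    items[last] = tmp;
    return store;
}
```

Any partition scheme works as long as it returns the pivot's final index; the selection logic above only relies on that contract.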

Counting select, based on the counting sort algorithm, looks something like this.

```c
// return the kth smallest item; cap must be greater than every value
// in the array, and all values must be non-negative
// (note: last is an exclusive bound here)
int countingSelect(int items[], int first, int last, int k) {
    int counts[cap];
    for (int c = 0; c < cap; c++) {
        counts[c] = 0;
    }
    for (int i = first; i < last; i++) {
        counts[items[i]] += 1;
    }
    int c = 0;
    while (k >= 0) {
        k -= counts[c++];
    }
    return c - 1;
}
```
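
As written, this relies on some external constant `cap` and on all values being non-negative. A self-contained variant (the name `countingSelectAuto` is mine, not from the original) could compute the range itself:

```c
#include <stdlib.h>

// Counting select with the value range computed on the fly.
// Assumes all values in items[first..last-1] are non-negative.
int countingSelectAuto(int items[], int first, int last, int k) {
    int cap = 0;
    for (int i = first; i < last; i++) {
        if (items[i] + 1 > cap) {
            cap = items[i] + 1;    // cap = max value + 1
        }
    }
    int *counts = calloc(cap, sizeof(int));  // zero-initialized
    for (int i = first; i < last; i++) {
        counts[items[i]] += 1;
    }
    int c = 0;
    while (k >= 0) {
        k -= counts[c++];    // walk the histogram until k items are consumed
    }
    free(counts);
    return c - 1;
}
```

The extra scan for the maximum adds another O(n) pass, which doesn't change the overall O(n + cap) behaviour.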

I wish to write a set of guidelines for deciding which of the two algorithms is more appropriate for specific situations or data sets. I need to think of situations, and then work out guidelines as to which algorithm is the better choice.

For this, I could use a little help identifying which algorithm is stronger on which kinds of data sets and situations.

Sasstraliss

Posts: 1
Joined: Sat Jun 02, 2012 3:42 am

### Re: Quickselect vs Countingselect

The first one seems to modify the input data, but that can easily be fixed by working on a copy of the items. Since it only recurses into one side of each partition, its expected time is O(n) on average, better than quicksort's O(n*log n), though the worst case is O(n^2).

I think the second one looks "faster" (something like O(n+cap)), but at the price of unpredictable and possibly very high memory allocation. I don't know how you get the value of cap, but I guess it's something like the maximum value in the array. You also have to keep in mind that numbers in the array might be negative, which breaks using them as indices into another array.

You should just run some tests on different sets of data. I would guess the second algorithm is faster as long as the values inside the array are "small enough" and the size of the array is "big enough"; on small arrays with big values you waste too much time walking through the count array. So basically you have to test on different array sizes and different value ranges to find out what "big enough" and "small enough" are.
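
A sketch of such a test harness, assuming you time each candidate over random arrays while varying the size and value-range knobs (the function names here are illustrative, and `sortSelect` is a sort-based reference answer for cross-checking results):

```c
#include <stdlib.h>
#include <time.h>

// qsort comparator for ints
static int cmp_int(const void *a, const void *b) {
    return *(const int *)a - *(const int *)b;
}

// Reference answer: sort a copy and index it.
int sortSelect(int items[], int n, int k) {
    int *copy = malloc(n * sizeof(int));
    for (int i = 0; i < n; i++) copy[i] = items[i];
    qsort(copy, n, sizeof(int), cmp_int);
    int result = copy[k];
    free(copy);
    return result;
}

// Time one run of a select function over a random array with the
// given size and value range; size and maxValue are the knobs to vary.
double timeSelect(int (*sel)(int[], int, int), int size, int maxValue) {
    int *data = malloc(size * sizeof(int));
    for (int i = 0; i < size; i++) data[i] = rand() % maxValue;
    clock_t start = clock();
    sel(data, size, size / 2);          // median-ish query
    clock_t end = clock();
    free(data);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Running `timeSelect` over a grid of sizes and value ranges for each algorithm, and checking each result against `sortSelect`, would give both the crossover points and a correctness check in one go.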
Who needs a signature anyway.

exomo

Posts: 881
Joined: Fri Sep 26, 2003 12:30 pm