Jonathan Frech’s WebBlog

Sorting in C (#190)

Jonathan Frech

Sorting arrays of integers is an integral part of computing. Therefore, over the years, a lot of different approaches to sorting were found, resulting in the wide range of sorting algorithms that exists today.
Of these, the most common algorithms include quicksort (maybe one of the best known algorithms by name), merge sort, natural merge sort (my personal favourite of this list) and the more trivial insertion, selection and bubble sort. In this post, I will look at the aforementioned algorithms alongside a C implementation of each. The full C source code can be seen below and can also be downloaded.
All sorting algorithm implementations take an integer pointer and an array length as input and sort the array in place, manipulating the pointed-to memory.

I) Quick­sort

A fundamental observation used by most sorting algorithms is that a singleton array (an array which consists of only one element) is sorted. Thus one can write an algorithm which recursively operates on already sorted arrays, as every array has a sorted base (all of its elements interpreted as singleton arrays).
Quicksort first chooses one of the input array’s elements as its pivot element (my implementation always chooses the first element) and splits the remaining array into two subarrays, one containing all elements less than the pivot, the other one containing those greater than or equal to it. The array is rearranged such that the first subarray is followed by the pivot element, which is followed by the second subarray. Both subarrays are then recursively sorted; singleton arrays mark the end of the recursion, as they are already sorted. Quicksort, as its name implies, is a rather efficient sorting algorithm, having an average time complexity of $\mathcal{O}(n\cdot\log{}n)$.
In the following I show my C implementation. It first creates a swap array for storing intermediate values while permuting the array and then calls its recursive part, which initiates the sorting.

#include <stdlib.h> // for malloc and free

// quick sort (recursive part)
void _qsort(int *Arr, int *Swp, int len) {
    // split array at pivot (first element) and store in swap
    int l = 0, r = len-1;
    for (int j = 1; j < len; j++)
        if (Arr[j] < Arr[0]) Swp[l++] = Arr[j];
        else                 Swp[r--] = Arr[j];

    // move swap to array
    Swp[l] = Arr[0];
    for (int j = 0; j < len; j++)
        Arr[j] = Swp[j];

    // recursively sort split arrays
    if (l > 1)       _qsort(Arr, Swp, l);
    if (len-r-1 > 1) _qsort(Arr+l+1, Swp+l+1, len-r-1);
}

// quick sort (initialization)
void QSort(int *Arr, int len) {
    if (len < 2) return;

    int *Swp = malloc(len*sizeof(int));
    _qsort(Arr, Swp, len);
    free(Swp);
}

II) Merge Sort

Like quicksort, merge sort fundamentally uses the inherently sorted nature of singleton arrays. However, in contrast to quicksort, merge sort does not split the input array into a correctly placed pivot and two arrays left to sort, but rather uses a merging algorithm to merge two already sorted arrays into one sorted array, conceptually moving from the bottom up instead of from the top down.
To merge two sorted subarrays, one simply repeatedly takes the smaller of the two subarrays’ first elements to build a new array, until both subarrays are empty. Once a merging function is implemented, one simply recursively splits the input array and merges all singleton arrays together to sort the entire array. Like quicksort, merge sort has an average time complexity of $\mathcal{O}(n\cdot\log{}n)$.

// merge two arrays located in memory next to each other
void merge(int *Arr, int *Swp, int llen, int rlen) {
    // build the merged array by repeatedly choosing the
    // smaller of both arrays' first elements
    int len = llen+rlen, l = 0, r = 0;
    for (int j = 0; j < len; j++) {
        if (l < llen && r < rlen)
            if (Arr[l] < Arr[llen+r]) Swp[j] = Arr[l++];
            else                      Swp[j] = Arr[llen+r++];
        else if (l < llen) Swp[j] = Arr[l++];
        else if (r < rlen) Swp[j] = Arr[llen+r++];
    }

    // move swap to array
    for (int j = 0; j < len; j++)
        Arr[j] = Swp[j];
}

// merge sort (recursive part)
void _msort(int *Arr, int *Swp, int len) {
    // split arrays' lengths, sort split arrays
    int llen = len/2, rlen = len-llen;
    if (llen > 1) _msort(Arr, Swp, llen);
    if (rlen > 1) _msort(Arr+llen, Swp, rlen);

    // merge arrays
    merge(Arr, Swp, llen, rlen);
}

// merge sort (initialization)
void MSort(int *Arr, int len) {
    if (len < 2) return;

    int *Swp = malloc(len*sizeof(int));
    _msort(Arr, Swp, len);
    free(Swp);
}

III) Natural Merge Sort

Instead of relying on splitting the input array into singleton lists, natural merge sort searches for subarrays which already appear sorted and merges them to form one sorted list. The same merging algorithm as in traditional merge sort is used; like merge sort, natural merge sort has an average time complexity of $\mathcal{O}(n\cdot\log{}n)$¹.

// natural merge sort
void NMSort(int *Arr, int len) {
    if (len < 2) return;

    int *Swp = malloc(len*sizeof(int));
    for (int k = 0, j = 0; j < len; k = j+1) {
        for (j = k; j < len-1 && Arr[j] <= Arr[j+1];) j++;
        if (j < len) merge(Arr, Swp, k, j-k+1);
    }
    free(Swp);
}

IV) Insertion Sort / Selection Sort

Selection sort is a rather trivial sorting algorithm: it builds up the sorted array by repeatedly selecting the input array’s smallest remaining element. Insertion sort is closely related; it instead inserts each element into its correct position within an already sorted prefix. I have only implemented the former, not using any swap memory but only swapping array elements with each other. This brute-force approach to sorting makes for a rather inefficient algorithm with an average time complexity of $\mathcal{O}(n^2)$.

// selection sort
void ISort(int *Arr, int len) {
    if (len < 2) return;

    // loop through array elements
    for (int j = 0; j < len; j++) {
        // find minimum element
        int m = j;
        for (int i = j+1; i < len; i++)
            if (Arr[i] < Arr[m]) m = i;

        // swap minimum element with current element
        if (j != m) {
            int swp = Arr[j];
            Arr[j] = Arr[m];
            Arr[m] = swp;
        }
    }
}

V) Bubble Sort

Bubble sort works by iteratively finding neighbouring elements which are out of order and swapping them until the entire list is sorted. Presumably, the rather amusing name comes from observing how elements behave while being sorted, bubbling to the top like the fizz of a soda pop. With an average time complexity of $\mathcal{O}(n^2)$, it is a fairly inefficient algorithm.

// bubble sort
void BSort(int *Arr, int len) {
    while (len-- > 1)
        for (int j = 0; j < len; j++)
            if (Arr[j] > Arr[j+1]) {
                int swp = Arr[j];
                Arr[j] = Arr[j+1];
                Arr[j+1] = swp;
            }
}

Conclusion

In conclusion, one will most likely not have to worry about implementing sorting algorithms, as most modern languages supply an essential tool belt to the user, complete with various sorting methods and predefined data structures that depend on sorting. Nevertheless, the theory and diversity of sorting algorithms fascinate me, as they show how widely different tactics can be used to solve the same seemingly mundane problem: sorting an array of integers.
All implementations shown in this post have been tested several thousand times on arrays of varying lengths; to see the test function, take a look at the full source code.


[1][2018-03-03] As I now know, the implementation seen above most likely has a considerably worse time complexity than stated, as it implements an inefficient natural merge sort flavour which merges small sublists step by step into an ever-growing sorted list instead of recursively merging all sublists into a sorted list (merge sort’s behaviour). It does, however, not rely on recursion.