Jonathan Frech’s WebBlog

Sorting in C (#190)

Jonathan Frech

Sorting arrays of integers is an integral part of computing. Therefore, over the years, a lot of different approaches to sorting were found, resulting in the wide range of sorting algorithms that exists today.
Of these, the most common algorithms include quicksort (maybe one of the best known algorithms by name), merge sort, natural merge sort (my personal favourite of this list) and the more trivial insertion, selection and bubble sort. In this post, I will look at the aforementioned algorithms alongside a C implementation of each. The full C source code can be seen below and can also be downloaded.
All sorting algorithm implementations take an integer pointer and an array length as input and sort the array in place, manipulating the pointed-to memory.

I) Quick­sort

A fundamental observation used by most sorting algorithms is that a singleton array (an array which consists of only one element) is sorted. Thus one can write an algorithm which recursively operates on already sorted arrays, as every array has a sorted base (all of its elements interpreted as singleton arrays).
Quicksort first chooses one of the input array’s elements as its pivot element (my implementation always chooses the first element) and splits the remaining array into two subarrays, one containing all elements less than the pivot, the other one containing those greater than or equal to it. The array is rearranged such that the first subarray is followed by the pivot element, which is followed by the second subarray. Both subarrays are then recursively sorted; singleton arrays mark the end of the recursion, as they are already sorted. Quicksort, as its name implies, is a rather efficient sorting algorithm, having an average time complexity of $\mathcal{O}(n\cdot\log{}n)$.
In the following I show my C implementation. It first creates a swap array for storing intermediate values while permuting the array and then calls its recursive part, which initiates the sorting.

#include <stdlib.h> // for malloc and free

// quick sort (recursive part)
void _qsort(int *Arr, int *Swp, int len) {
    // split array at pivot (first element) and store in swap
    int l = 0, r = len-1;
    for (int j = 1; j < len; j++)
        if (Arr[j] < Arr[0]) Swp[l++] = Arr[j];
        else                 Swp[r--] = Arr[j];

    // move swap to array
    Swp[l] = Arr[0];
    for (int j = 0; j < len; j++)
        Arr[j] = Swp[j];

    // recursively sort split arrays
    if (l > 1)       _qsort(Arr, Swp, l);
    if (len-r-1 > 1) _qsort(Arr+l+1, Swp+l+1, len-r-1);
}

// quick sort (initialization)
void QSort(int *Arr, int len) {
    if (len < 2) return;

    int *Swp = malloc(len*sizeof(int));
    _qsort(Arr, Swp, len);
    free(Swp);
}

II) Merge Sort

Like quicksort, merge sort fundamentally uses the inherently sorted nature of singleton arrays. However, in contrast to quicksort, merge sort does not split the input array into a correctly placed pivot and two arrays left to sort, but rather uses a merging algorithm to merge two already sorted arrays into one sorted array, conceptually moving from the bottom up instead of from the top down.
To merge two sorted subarrays, one simply repeatedly takes the smaller of the two subarrays’ first elements to build a new array, until both subarrays are empty. Once a merging function is implemented, one simply recursively splits the input array and merges all singleton arrays together to sort the entire array. Like quicksort, merge sort has an average time complexity of $\mathcal{O}(n\cdot\log{}n)$.

// merge two arrays located in memory next to each other
void merge(int *Arr, int *Swp, int llen, int rlen) {
    // build the merged array by repeatedly choosing the
    // smaller of both arrays' first elements
    int len = llen+rlen, l = 0, r = 0;
    for (int j = 0; j < len; j++) {
        if (l < llen && r < rlen)
            if (Arr[l] < Arr[llen+r]) Swp[j] = Arr[l++];
            else                      Swp[j] = Arr[llen+r++];
        else if (l < llen) Swp[j] = Arr[l++];
        else if (r < rlen) Swp[j] = Arr[llen+r++];
    }

    // move swap to array
    for (int j = 0; j < len; j++)
        Arr[j] = Swp[j];
}

// merge sort (recursive part)
void _msort(int *Arr, int *Swp, int len) {
    // split arrays' lengths, sort split arrays
    int llen = len/2, rlen = len-llen;
    if (llen > 1) _msort(Arr, Swp, llen);
    if (rlen > 1) _msort(Arr+llen, Swp, rlen);

    // merge arrays
    merge(Arr, Swp, llen, rlen);
}

// merge sort (initialization)
void MSort(int *Arr, int len) {
    if (len < 2) return;

    int *Swp = malloc(len*sizeof(int));
    _msort(Arr, Swp, len);
    free(Swp);
}

III) Natural Merge Sort

Instead of relying on splitting the input array into singleton lists, natural merge sort searches for subarrays which already appear sorted and merges them to form one sorted list. The same merging algorithm as in traditional merge sort is used; like merge sort, natural merge sort has an average time complexity of $\mathcal{O}(n\cdot\log{}n)$¹.

// natural merge sort
void NMSort(int *Arr, int len) {
    if (len < 2) return;

    int *Swp = malloc(len*sizeof(int));
    for (int k = 0, j = 0; j < len; k = j+1) {
        for (j = k; j < len-1 && Arr[j] <= Arr[j+1];) j++;
        if (j < len) merge(Arr, Swp, k, j-k+1);
    }
    free(Swp);
}

IV) Insertion Sort / Selection Sort

Selection sort is a rather trivial sorting algorithm: it builds up the sorted array by repeatedly selecting the input array’s smallest remaining element. Insertion sort is closely related; it instead inserts each element into its correct position within an already sorted prefix. I have only implemented the former, not using any swap memory but only swapping array elements with each other. This brute-force approach to sorting makes for a rather inefficient algorithm with an average time complexity of $\mathcal{O}(n^2)$.

// selection sort
void ISort(int *Arr, int len) {
    if (len < 2) return;

    // loop through array elements
    for (int j = 0; j < len; j++) {
        // find minimum element
        int m = j;
        for (int i = j+1; i < len; i++)
            if (Arr[i] < Arr[m]) m = i;

        // swap minimum element with current element
        if (j != m) {
            int swp = Arr[j];
            Arr[j] = Arr[m];
            Arr[m] = swp;
        }
    }
}

V) Bubble Sort

Bubble sort works by iteratively finding neighbouring elements which are out of order and swapping them until the entire list is sorted. Presumably, the rather amusing name comes from observing how elements behave while being sorted, bubbling to the top like the fizz of a soda pop. With an average time complexity of $\mathcal{O}(n^2)$, it is a fairly inefficient algorithm.

// bubble sort
void BSort(int *Arr, int len) {
    while (len-- > 1)
        for (int j = 0; j < len; j++)
            if (Arr[j] > Arr[j+1]) {
                int swp = Arr[j];
                Arr[j] = Arr[j+1];
                Arr[j+1] = swp;
            }
}

Conclusion

In conclusion, one will most likely not have to worry about implementing sorting algorithms, as most modern languages supply an essential tool belt to the user, complete with various sorting methods and predefined data structures that depend on sorting. Nevertheless, the theory and diversity of sorting algorithms fascinate me, as they show how widely different tactics can be used to solve the same seemingly mundane problem: sorting an array of integers.
All implementations shown in this post have been tested several thousand times on arrays of varying lengths; to see the test function, take a look at the full source code.


[1][2018-03-03] As I now know, the implementation seen above most likely has a considerably worse time complexity than stated, as it implements an inefficient natural merge sort flavour which merges small sublists step by step into an ever-growing sorted list instead of recursively merging all sublists into a sorted list (merge sort’s behaviour). It does, however, not rely on recursion.