Thursday, August 20, 2015

Sorting Algorithms - Merge Sort

Now merge sort is based on the divide and conquer concept. Idea is to divide the problem into smaller sub problems and solve those sub problems. Once you are able to solve those sub problems, join them together to get the solution for the bigger problem. Now in case of merge sort we divide the problem into exactly two halves. Take that halves and again divide them into two halves again. Like this we will go until exactly one element is left in each of the sub problems. Now we start merging them by comparing each elements. The one having smaller element will be placed first and bigger one will be placed after that. Below diagram will help you to understand it.

First image is of division work. It shows how the problem is divided into smaller sub problem.


Once we are successful in dividing the problem into smaller sub problem now its time to conquer it that is to merge them. Below image shows how it will be merged.

For division purpose in coding we are using recursive function. Recursive functions are those which will call itself again and again with modified parameters until a predefined goal is met. Below is the code snippet for this

int MergeSortAlgo(int* pIntArray, int nStart, int nSize)
{
int nReturnValue = 0;
int nMid = -1;

nMid = nSize / 2;
if (nMid > 0)
{
//This is divide part of the algorithm. That is we are dividing the array into smaller sub arrays.
MergeSortAlgo(pIntArray, nStart, nMid);
MergeSortAlgo(pIntArray, nStart + nMid, nSize - nMid);
//This is conquering part of the algorithm. Here once after division of the array we are trying to merge them into one.
MergeArray(pIntArray, nStart, nSize);
}

return nReturnValue;
}

As you can see MergeSortAlgo is called repeatedly by itself with different parameters. It is dividing the problem into two which is done in step nSize/2.

Once all division is done it is going further to Merge those divided sub problems. Below is the code which is doing exactly that

int MergeArray(int *pIntArray, int nStart, int nSize)
{
int nReturnValue = 0;
int nMid = nSize / 2;
int nFirstStart = nStart;
int nSecondStart = nStart + nMid;
int nFirstEnd = nStart + nMid - 1;
int nSecondEnd = nStart + nSize - 1;

int i = nFirstStart, j = nSecondStart, k = 0;

int* pTempArray = 0;
pTempArray = (int*)malloc(nSize*sizeof(int));
if (pTempArray == 0)
{
//Error Dynamic memory not allocated.
return ERR_DYNAMICMEMNOTALLOC;
}
else
{
while (k < nSize && (i <= nFirstEnd && j <= nSecondEnd))
{
if (pIntArray[i] < pIntArray[j])
{
pTempArray[k++] = pIntArray[i++];
}
else
{
pTempArray[k++] = pIntArray[j++];
}
}

while (i <= nFirstEnd)
pTempArray[k++] = pIntArray[i++];

while (j <= nSecondEnd)
pTempArray[k++] = pIntArray[j++];

i = 0, j = 0, k = 0;

for (k = 0; k < nSize; k++)
pIntArray[nStart++] = pTempArray[k];

free(pTempArray);

}

return nReturnValue;
}

Nothing special here. First I have allocated a temporory space whose size is sum of arrays of sub problems. Then I comparing one element of Sub problem1 and one element of Sub problem2. Whichever is smaller will be placed in the temparory array. This is repeated for every elements.

Okay the order of growth of merge sort is Theta(nlogn) log base is 2. How do we get this well wait for my post on Masters Theorem.

Code is present in location https://github.com/harsha-kadekar/LibraryOfAlgorithms.git
I am learning from "Introduction to Algorithms" book 3rd edition. Authors - Thomas H Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford  Stein.

No comments:

Post a Comment