Sunday, February 7, 2016

Heap Sort

This sorting algorithm has 3 main parts.

  1. Build Heap
  2. Swap 1st element and last element, Decrease Size by 1
  3. Heapify
You might be wondering: what exactly is a heap? A heap is a binary tree in which every parent holds either the highest or the lowest value among itself and its children - the left child and the right child. This binary tree is represented in an array. If every parent holds the highest value it is a Max heap; if every parent holds the smallest value it is a Min heap.

Below is an image of the heap, both in tree form and in array form.



As you can see in the tree, it is a Max heap: every parent holds the highest value among itself, its left child, and its right child. To convert it into an array, the first element is the root, followed by its immediate left child, then its immediate right child, then the left child's children, and then the right child's children.
So if you have a parent at index i, its left child is at 2*i + 1 and its right child is at 2*i + 2. Conversely, given an index j, its parent is at (j - 1)/2.

First step: given an array, we need to transform it into a heap. Below is the code for doing it.

int BuildMaxHeap(int* pIntArray, int nSize)
{
    int nReturnValue = 0;

    if (pIntArray == 0)
    {
        return ERR_NULLINPUTARRAY;
    }

    if (nSize <= 0)
    {
        return ERR_INVALIDARRAYSIZE;
    }

    /* nSize / 2 - 1 is the last parent; heapify bottom-up to the root. */
    for (int i = nSize / 2 - 1; i >= 0; i--)
    {
        MaxHeapify(pIntArray, nSize, i);
    }

    return nReturnValue;
}

We start from the last parents, that is, the parents just before the leaves, and check whether each satisfies the Max heap property. If not, we max-heapify it. Once it is heapified, we move to the next parent and repeat the procedure until we reach the root.

int MaxHeapify(int* pIntArray, int nSize, int nIndex)
{
    int nReturnValue = 0;
    int nLeftIndex = LeftChildHeap(nIndex);
    int nRightIndex = RightChildHeap(nIndex);
    int nLargestIndex = nIndex;
    int nTemp = 0;

    /* Find the largest among parent, left child, and right child. */
    if (nLeftIndex < nSize && pIntArray[nLeftIndex] > pIntArray[nLargestIndex])
    {
        nLargestIndex = nLeftIndex;
    }

    if (nRightIndex < nSize && pIntArray[nRightIndex] > pIntArray[nLargestIndex])
    {
        nLargestIndex = nRightIndex;
    }

    /* If a child was larger, swap it with the parent and sift down. */
    if (nLargestIndex != nIndex)
    {
        nTemp = pIntArray[nLargestIndex];
        pIntArray[nLargestIndex] = pIntArray[nIndex];
        pIntArray[nIndex] = nTemp;

        MaxHeapify(pIntArray, nSize, nLargestIndex);
    }

    return nReturnValue;
}

So what is this max-heapify? It checks which of the parent, left child, and right child holds the largest element, and swaps that element with the parent. Suppose the left child had the largest element: then the parent and left child are swapped. Once swapped, we max-heapify the subtree rooted at the exchanged position. That is, if the left child was exchanged with the parent, we now max-heapify the left child's subtree, treating it as the root.

Once you are done with this, you have the heapified array. By the max-heap property, the root (the 0th element of the array) holds the highest element in the array/heap. To get the sorted array, swap the 0th element with the last element of the array, then decrease the size of the heap by one and heapify again from the root. The next highest element then comes to the root. Repeat the process until the size becomes 1.

int HeapSortIntegers(int* pIntArray, int nSize)
{
    int nReturnValue = 0;
    int nTemp = 0;

    if (pIntArray == 0)
    {
        return ERR_NULLINPUTARRAY;
    }

    if (nSize <= 0)
    {
        return ERR_INVALIDARRAYSIZE;
    }

    BuildMaxHeap(pIntArray, nSize);

    for (int i = nSize - 1, j = 1; i >= 1; i--, j++)
    {
        /* Move the current maximum to the end of the shrinking heap. */
        nTemp = pIntArray[0];
        pIntArray[0] = pIntArray[i];
        pIntArray[i] = nTemp;

        MaxHeapify(pIntArray, nSize - j, 0);
    }

    return nReturnValue;
}

This looks complicated, right? Does it have any benefit? Well, yes: its runtime is O(nlogn), its average case equals its worst case, and it sorts in place.

Wow, then this should be better than quicksort, since quicksort takes O(n^2) in its worst case?
Well, yes and no. Yes, because heapsort is always O(nlogn). No, because on an input for which quicksort runs in O(nlogn), quicksort will typically outperform heapsort, since heapsort has larger constant factors in its runtime.

Implementation is based on Chapter 6 - HeapSort of Introduction to Algorithms 3rd Edition [Thomas H Cormen, Charles E Leiserson, Ronald L Rivest, Clifford Stein]

Saturday, February 6, 2016

Quick Sort

This algorithm is quite tricky in terms of flow. Again, it is based on the divide-and-conquer technique. In merge sort, the importance is in combining the divided lists of elements, whereas dividing is very simple. Quicksort is the opposite: the importance is in dividing, and there is no work in combining. In every iteration we pick an element and find its correct position, such that all elements before it are less than it and all elements after it are greater than it. You may say that is just the concept of ascending order; true, but in the first iteration only this one element is brought to its correct position. The elements to its right are still not in their correct positions, and similarly the elements to its left. The position of this element forms the partition point. Once I have this index, I rerun the quicksort algorithm from the lowest position to index - 1, and similarly from index + 1 to the highest position.

Now that we have some idea of quicksort, what is its complexity?
Hmmm, its worst case is O(n^2).

What?? Then why do we need this? We could have just used merge sort, right?
Well, first, this algorithm sorts in place, whereas merge sort needs extra space, remember... Also its average runtime is theta(nlogn), and in practice it is better than merge sort in most cases.

Oh! Then when does the worst case occur for quicksort?
When the array is already sorted (with this partition scheme, which always uses the last element as pivot). But we can pick the pivot randomly: before partitioning, choose a random index between low and high and swap that element with the element at the highest index. This makes the worst case very unlikely, so the algorithm runs in its average-case time with high probability.

Hmmm good now lets see the code.

The starting point is the recursive function, which keeps calling itself until all the elements are sorted.

int RandomizedQuickSort(int* pIntArray, int nlow, int nhigh)
{
    int nReturnValue = 0;
    int nKey = -1;

    if (nlow < nhigh)
    {
        /* Partition, then recurse on both sides of the pivot. */
        nKey = RandomizedPartition(pIntArray, nlow, nhigh);
        RandomizedQuickSort(pIntArray, nlow, nKey - 1);
        RandomizedQuickSort(pIntArray, nKey + 1, nhigh);
    }

    return nReturnValue;
}

To make the algorithm run in average time, we randomize the pivot selection. Here is the code of RandomizedPartition, which randomly selects an index and swaps its element with the highest-indexed element.

int RandomizedPartition(int* pIntArray, int nlow, int nhigh)
{
    int nReturnValue = 0;
    int nDiff = nhigh - nlow;
    int nTemp = 0;

    /* Assumes srand(time(0)) was called once at program start;
       reseeding here on every call would make rand() return the
       same value within the same second. */

    /* Pick a random offset in [0, nDiff] so any index in
       [nlow, nhigh] can become the pivot. */
    nDiff = rand() % (nDiff + 1);

    nTemp = pIntArray[nhigh];
    pIntArray[nhigh] = pIntArray[nlow + nDiff];
    pIntArray[nlow + nDiff] = nTemp;

    nReturnValue = PartitionQuickSort(pIntArray, nlow, nhigh);

    return nReturnValue;
}

Finally, here is the partition routine.

int PartitionQuickSort(int* pIntArray, int nlow, int nhigh)
{
    int nCompareElement = pIntArray[nhigh];
    int nTemp = 0;
    int i = nlow - 1;

    for (int j = nlow; j < nhigh; j++)
    {
        if (pIntArray[j] <= nCompareElement)
        {
            /* Grow the "less than or equal" region by swapping
               pIntArray[i] and pIntArray[j]. */
            i++;
            nTemp = pIntArray[i];
            pIntArray[i] = pIntArray[j];
            pIntArray[j] = nTemp;
        }
    }

    /* Place the pivot just after the "less than or equal" region. */
    nTemp = pIntArray[i + 1];
    pIntArray[i + 1] = pIntArray[nhigh];
    pIntArray[nhigh] = nTemp;

    return i + 1;
}

Basically, I am trying to find the correct position of the last element of the given range. The aim is that all elements less than it end up before that position, and all elements greater end up after it; the order of those elements is not a concern. So how do I find the correct position? I keep two indices, i and j. Index j is for iteration; i marks the boundary for swapping. Each element, starting from the lowest index, is compared with the highest-indexed element. If it is greater, just move on without doing anything. If it is smaller or equal, increase i and swap the elements at indices i and j. You are basically pushing the greater elements to the other side. Continue until j reaches highest - 1. Now i + 1 is the correct position for the highest-indexed element, so swap that element with the element at index i + 1. Finally, i + 1 is your pivot index.

The complete implementation is present here, but it can also be found in my github - https://github.com/harsha-kadekar/LibraryOfAlgorithms . The implementation is based on Chapter 7 - Quicksort of Introduction to Algorithms, 3rd Edition [Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein].