# CLEAN CODE: HEAP, GREAT FOR SOLVING INTERVIEW QUESTIONS INVOLVING: FIND TOP FIVE, TOP N, BOTTOM N ETC

By: Saurav

2017-12-25 03:45:00 UTC

I am working on coming up with a pattern or a process flow to follow in an interview. One of the haunting data structure for me, as well as many folks, is a heap. Max heap or Min heap. I have faced interview problems like

1. Find top n words in a sentence
2. Find k max elements in an array
3. Find k max elements according to repetitions

and yes, I wasn't able to solve it in the best way at that time and it hurts till today.

So I thought let's spend time with the beast, get a good refresher and try to tame it.

A good tutorial is on youtube by Gayle Laakmann McDowell herself. Its great if you are in the flow of solving problems but at first its a bit complex. Another great resource is at Sitepoint. That is really easy to understand and is a great work.

# Heap

A heap is essentially a binary tree, which follows a specific pattern. If the parent of two nodes is smaller than the nodes, and this property holds for the whole tree, it's a min heap. If the parent is greater than both, it's a max heap.

In a heap, the root will be the maximum or minimum among all the elements of the tree. However, if the elements are depicted as in array, it won't be sorted. So, keep that in mind.

It's great if we want to get the top n or bottom n elements with restricted space. The time complexity of converting an array into the heap is O(n) (Although it looks like O(nlogn), but for further reading view this article )).

We can implement a heap by using Node class but a better way from space efficiency is using an array. Elements aread from top to bottom, left to right.

For the heap binary tree to array conversion:

For an even and odd number of elements:

```def left_child_index(index)
if input_array.size%2 == 0
2*index
else
2*index +1
end
end
```

The function above gives the position of an element's left child in an array.

``` def right_child_index(index)
if input_array.size%2 == 0
2*index +1
else
2*index +2
end
end
```

As always. lets write some tests first:

```new_heap = Heap.new([2,4,1,7,0,3,8])
p new_heap.build_heap == [8, 7, 3, 4, 0, 2, 1]

new_heap1 = Heap.new([10,4,8,7,0,12,2,8])
p new_heap1.build_heap == [12, 10, 8, 8, 0, 4, 2, 7]

new_heap1 = Heap.new([])
p new_heap1.build_heap == []

```

Implementing the Heap class:

```class Heap
attr_accessor :input_array

def initialize(input_array)
@input_array = input_array
end

def left_child_index(index)
if input_array.size%2 == 0
2*index
else
2*index +1
end
end

def right_child_index(index)
if input_array.size%2 == 0
2*index +1
else
2*index +2
end
end

def is_greater_than_children(index)
input_array[index] >= input_array[left_child_index(index)] and input_array[index] >= input_array[right_child_index(index)] ? true : false
end

def is_a_leaf_node(index)
(index >= input_array.size/2) ? true : false
end

def max_heapify(index)

return if is_a_leaf_node(index) or is_greater_than_children(index)

largest_index = input_array[left_child_index(index)] > input_array[right_child_index(index)] ? left_child_index(index) : right_child_index(index)

if input_array[index] < input_array[largest_index]
input_array = swap(index,largest_index)
end

max_heapify(largest_index)

end

def swap(index,largest)
temp = input_array[index]
input_array[index] = input_array[largest]
input_array[largest] = temp
return input_array
end

def build_heap
(input_array.size/2 - 1).downto(0).each do |index|

max_heapify(index)
end
return input_array
end

def get_max_n_numbers(n)
return input_array if n >= input_array.size
top_n = Array.new

(0..n-1).each do
build_heap
top_n << input_array.shift
end

end
end
```

We need the build_heap function to run heapify function above. We only need to build from the parent of the leaf nodes so it starts from input.size/2-1.

I have also added a get_max_n_numbers function which can give n max elements in the heap.

This is always the core of the questions related to the heap.

Remember! When the interview question asks for:
Find top n elements / find winner / find max / find min / find most repetitive / find least repetitive / find top customer etc.
Only two things should come to your mind.
> Heap or sort. Heap will take O(n) while sort will take at least O(nlogn). Depending upon the heap size you can control the time complexity of the heap operations O(klogk).

In few questions where there is repetition (like finding most repetitive), Hash also comes in handy along with heap.

Owned & Maintained by Saurav Prakash

If you like what you see, you can help me cover server costs or buy me a cup of coffee though donation :)