Given an array of size n
and a window size of w
, how would you find the
maximum in every sub-array of size w?
I was asked this question in an interview and this is a postmortem of how I screwed it up.
It might be a good idea to pause here and work it out.
The naive way of doing it is pretty straight forward. The runtime complexity is
O(nw)
and you don’t need any extra memory.
def naive(inp, window):
for (index, _) in enumerate(inp[window - 1:]):
print max(inp[index: index + window])
>>> naive([12, 1, 78, 90, 57, 89, 56], 3)
78 90 90 90 89
This is definitely not any interesting. The problem should remind you of dynamic programming since the max of several sub arrays are computed repeatedly. Caching them should lead to a better complexity, but I couldn’t think of a particularly good way to do it.
Slightly better - O(n log w)
You can use an ad-hoc sorted list of size w
with a btree or heap to improve
the results. Whenever the window is moved right, insert the new element into
this list and remove the element that was pushed out of the window. After each
iteration, the head of the sorted list is the required value. Insertion and
removal can be done in log(w)
and hence the overall complexity would be O(n
log(w))
.
Not and impressive solution and this is where I got stuck. We moved onto other things instead of more time.
O(n) solution
A slight tweak to the last problem is all that is needed. Instead of maintaining a sorted list of size w, we just need the 2 largest values. The intuition is that an element is useful only if it is the biggest element in the window.
def max(inp, window):
# Compute the base case
stack = inp[:window]
stack.sort()
stack.pop(0)
for (index, num) in enumerate(inp[window - 1:]):
# Remove all elements that are smaller than the current element
while stack and num >= stack[0]:
stack = stack[1:] # tail should be O(1)
# Insert current element into the stack
stack.insert(0, num) # cons should be O(1)
print "%d" % stack[-1:][0] # last element
previous = inp[index]
# Remove previous value from stack. This is significant only if the
# previous value was the largest in the last sliding window, and if so
# it will be at the top of the stack.
if previous == stack[-1:][0]:
stack = stack[:-1]
>>> max([12, 1, 78, 90, 57, 89, 56], 3)
78 90 90 90 89
Not bad at all!