When branching makes the number of subproblems explode, the subproblems at the bottom of the recursion tree dominate: most of the work is at the bottom of the tree.
When the subproblems lower in the tree shrink fast enough, the root dominates: most of the work is at the top of the tree.
General Master Theorem
Let $a \geqslant 1$, $b > 1$ be constants, $f(n)$ be a function, and $T(n)$ be defined on the non-negative integers by the recurrence

$$T(n) = aT(n/b) + f(n)$$
Then T(n) has the following asymptotic bounds:
1. If $f(n) = O(n^{\log_b a - \epsilon})$ for some constant $\epsilon > 0$, then $T(n) = \Theta(n^{\log_b a})$.
2. If $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \log n)$.
3. If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some constant $\epsilon > 0$, and if $a f(n/b) \leqslant c f(n)$ for some constant $c < 1$ and all sufficiently large $n$, then $T(n) = \Theta(f(n))$.
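For a polynomial driving function $f(n) = n^d$, choosing the case reduces to comparing $d$ with $\log_b a$. A minimal sketch of that comparison (a hypothetical helper, not from any library; for polynomial $f$ the regularity condition of case 3 holds automatically):

```python
import math

def master_theorem(a, b, d):
    """Classify T(n) = a*T(n/b) + Theta(n^d) by the Master Theorem.

    Only handles polynomial driving functions f(n) = n^d.
    Returns a string describing the asymptotic bound for T(n).
    """
    c = math.log(a, b)  # critical exponent log_b(a)
    if d < c:
        return f"Theta(n^{c:.3g})"    # case 1: the leaves dominate
    if d == c:
        return f"Theta(n^{d} log n)"  # case 2: every level contributes equally
    return f"Theta(n^{d})"            # case 3: the root dominates

# MergeSort: T(n) = 2T(n/2) + Theta(n)
print(master_theorem(2, 2, 1))  # -> Theta(n^1 log n)
# T(n) = 4T(n/2) + Theta(n)
print(master_theorem(4, 2, 1))  # -> Theta(n^2)
```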
The Master Theorem does not cover all cases!
Note the meaning of $\epsilon$ here: $f(n) = n/\log n$ is not $O(n^{1-\epsilon})$ in the sense of case 1, because for every constant $\epsilon > 0$, $n/\log n$ is $\Omega(n^{1-\epsilon})$.
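To see the gap concretely, compare the growth rates directly:

```latex
\frac{n/\log n}{n^{1-\epsilon}} = \frac{n^{\epsilon}}{\log n}
\;\longrightarrow\; \infty
\quad \text{as } n \to \infty, \text{ for every fixed } \epsilon > 0 .
```

So $n/\log n$ grows faster than every $n^{1-\epsilon}$, yet slower than $n = n^{\log_2 2}$: the classic example $T(n) = 2T(n/2) + n/\log n$ falls into the gap between cases 1 and 2, and the Master Theorem does not apply to it.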
Ignoring Floors and Ceilings
The actual recurrence of MergeSort is $T(n) = T(\lceil n/2 \rceil) + T(\lfloor n/2 \rfloor) + \Theta(n)$.
How do we find the true time complexity of this recurrence?
We can transform it into a more familiar form by defining a new function in terms of the one we want to solve. This method is called domain transformation.
First, let's overestimate the time bound. The following relation eliminates the ceiling:

$$T(n) \leqslant 2T(\lceil n/2 \rceil) + \Theta(n) \leqslant 2T(n/2 + 1) + \Theta(n)$$

Define a new function $S(n) = T(n + \alpha)$, choosing the constant $\alpha$ so that $S(n)$ satisfies the simpler recurrence $S(n) \leqslant 2S(n/2) + \Theta(n)$.
Substituting $n + \alpha$ into the overestimate gives $S(n) \leqslant 2T(n/2 + \alpha/2 + 1) + \Theta(n)$, so we just need $\alpha/2 + 1 - \alpha = 0$, i.e. $\alpha = 2$, to simplify the recurrence.
$S(n) \leqslant 2S(n/2) + \Theta(n) \implies S(n) = O(n \log n)$. Then we have $T(n) = S(n-2) = O((n-2)\log(n-2)) = O(n \log n)$.
Similarly, we get the matching lower bound $T(n) = \Omega(n \log n)$. Therefore, $T(n) = \Theta(n \log n)$.
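This bound is easy to sanity-check numerically. A small sketch that evaluates the exact recurrence (treating the $\Theta(n)$ term as exactly $n$) and compares it with $n \log_2 n$:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """Exact MergeSort recurrence: T(n) = T(ceil(n/2)) + T(floor(n/2)) + n."""
    if n <= 1:
        return 0
    return T((n + 1) // 2) + T(n // 2) + n

# The ratio T(n) / (n * log2 n) stays bounded (it is exactly 1 at powers
# of two), consistent with T(n) = Theta(n log n).
for n in (1000, 10**5, 2**20):
    print(n, T(n) / (n * math.log2(n)))
```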
Similar domain transformations can be used to remove floors, ceilings, and even lower-order terms from any divide-and-conquer recurrence.
Now that we know this, we never need to grind through those details again.
Reduce and Conquer
We might not need to consider all subproblems.
In fact, sometimes we only need to consider one subproblem.
The "combine" step then becomes easier, or simply trivial.
This approach is also called decrease and conquer.
The Search Problem
Sequential scan: worst-case runtime is $\Theta(n)$.
Binary Search for a sorted array.
```
BinarySearch(A, x):
    left := 1, right := n
    while true
        middle := (left + right) / 2
        if A[middle] = x
            return middle
        elseif A[middle] < x
            left := middle + 1
        else
            right := middle - 1
```
This is not completely correct: the loop may not terminate, because it does not handle the case where x is not in the array.
```
BinarySearch(A, x):
    left := 1, right := n
    while left <= right
        middle := (left + right) / 2
        if A[middle] = x
            return middle
        elseif A[middle] < x
            left := middle + 1
        else
            right := middle - 1
```
$$T(n) \leqslant T(n/2) + \Theta(1) \implies T(n) = O(\log n)$$
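The corrected pseudocode translates directly into runnable code. A sketch with 0-based indexing that returns -1 when x is absent:

```python
def binary_search(A, x):
    """Return an index i with A[i] == x in the sorted list A, or -1 if absent.

    The loop condition left <= right guarantees termination even when
    x is not in the array.
    """
    left, right = 0, len(A) - 1  # 0-based, unlike the 1-based pseudocode
    while left <= right:
        middle = (left + right) // 2
        if A[middle] == x:
            return middle
        elif A[middle] < x:
            left = middle + 1
        else:
            right = middle - 1
    return -1

print(binary_search([1, 3, 5, 7, 9], 7))  # -> 3
print(binary_search([1, 3, 5, 7, 9], 4))  # -> -1
```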
Peak finding
Finding the global maximum takes more time: a sequential scan needs $\Theta(n)$ time, and that is unavoidable. Finding a peak costs much less.
An element A[i] is a peak (local maximum) if it is no smaller than its adjacent elements.
Every non-empty (finite) array A has at least one peak.
Find middle element.
Compare middle element to its adjacent elements.
Reduce the array to one of the two splits.
There must exist a peak in the part containing the large neighbor.
Recurse.
```
PeakFinding(A):
    left := 1, right := n
    while left <= right                # Current range is not empty.
        middle := (left + right) / 2   # Get the middle element.
        if middle > left and A[middle - 1] > A[middle]
            # The left neighbor exists and is bigger: recurse into left part.
            right := middle - 1
        elseif middle < right and A[middle + 1] > A[middle]
            # Same for the right neighbor.
            left := middle + 1
        else
            # middle is a peak, return it.
            return middle
```
Time complexity is $O(\log n)$.
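A direct Python port of the pseudocode above (0-based indices; assumes A is non-empty):

```python
def peak_finding(A):
    """Return an index of a peak: an element no smaller than its neighbors."""
    left, right = 0, len(A) - 1
    while left <= right:
        middle = (left + right) // 2
        if middle > left and A[middle - 1] > A[middle]:
            right = middle - 1   # a peak must exist in the left part
        elif middle < right and A[middle + 1] > A[middle]:
            left = middle + 1    # a peak must exist in the right part
        else:
            return middle        # middle is a peak

print(peak_finding([1, 3, 2, 5, 4]))  # -> 1 (A[1] = 3 is a peak)
```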
Peak finding in 2D
Every non-empty 2D array has at least one peak: starting from any element and repeatedly moving to a strictly larger neighbor, we must eventually reach a peak.
Algorithm 1
Compress each column into one element, resulting in a 1D array.
Use the maximum of each column to represent that column.
Run previous algorithm on the 1D array and return a peak.
Time complexity is $O(n^2) + O(\log n) = O(n^2)$.
Algorithm 2
Scan the middle column and find the max element m.
If m is a peak, then return it and we are done.
Otherwise, the left or right neighbor of m is bigger than m.
Recurse into that part.
Correctness
Max of middle column is a peak; or a peak exists in the part containing the large neighbor, and that peak is the max of its column.
A peak (found by the algorithm) in the part containing the large neighbor is also a peak in the original matrix.
The algorithm eventually returns a peak of some (sub)matrix.
One might write $T(n) \leqslant T(n/2) + \Theta(n) \implies T(n) = O(n)$.

The mistake lies in the term $T(n/2)$: even though the number of columns halves, each step still needs an $O(n)$ scan to find the maximum of the current middle column, because the number of rows does not shrink. This is a 2D problem, requiring two variables to describe: the correct recurrence is $T(n, m) \leqslant T(n, m/2) + \Theta(n)$, which solves to $O(n \log m) = O(n \log n)$ for an $n \times n$ matrix.
In contrast, one variable suffices in the Matrix Multiplication problem, because there the matrix is reduced in proportion in both dimensions.
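Algorithm 2 can be sketched as follows (a sketch assuming a non-empty rectangular matrix; the middle-column scan is the $\Theta(n)$ cost that does not shrink):

```python
def peak_2d(M):
    """Return (i, j) such that M[i][j] is no smaller than its 4 neighbors."""
    n = len(M)
    lo, hi = 0, len(M[0]) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        # Theta(n) scan: row index of the maximum in the middle column.
        i = max(range(n), key=lambda r: M[r][mid])
        if mid > lo and M[i][mid - 1] > M[i][mid]:
            hi = mid - 1       # a peak exists among the columns to the left
        elif mid < hi and M[i][mid + 1] > M[i][mid]:
            lo = mid + 1       # a peak exists among the columns to the right
        else:
            return i, mid      # a column maximum with no larger neighbor

M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
print(peak_2d(M))  # -> (2, 2), since 9 is a peak
```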
Algorithm 3
Scan the cross (the middle row and the middle column) and find the max element m.
If m is a peak, then return it and we are done.
Otherwise, some neighbor of m is bigger than m.
Recurse into that quadrant.
However, this is wrong. False claim: a peak (found by the algorithm) in the quadrant containing the large neighbor is also a peak in the original matrix.
To fix it, scan the cross and the boundary (a "window frame") instead of just the cross.