Solutions to Algorithms Illustrated (Part 1 - Chapter 1)

#DSA

Problem 1.1

The question asks about the state of two subarrays after the two outermost recursive calls in a merge-sort.

The original array is: [5, 3, 8, 9, 1, 7, 0, 2, 6, 4]

Solution

The two outhermost recursive calls are made on the two halves of the original array. Merge sort is a divide-and-conquer algorithm hence it relies on the fact that at any given recursion level, the two subarrays returned by the recursive calls are individually sorted.

  • Before the two outermost recursive calls:

    Left subarray: [5, 3, 8, 9, 1]

    Right subarray: [7, 0, 2, 6, 4]

    In this case both the subarrays are of same length otherwise we need to make assumption on which subarray can be odd. That assumption won't change the result.

  • After the two outermost recursive calls:

    Left subarray: [1, 3, 5, 8, 9]

    Right subarray: [0, 2, 4, 6, 7]

Hence, the answer to the original question: number at the 7th position after gluing both the individually sorted subarrays is 2.


Problem 1.2

The question asks about the running time of a slightly modified merge sort algorithm. Divide the input array into thirds (rather than halves), recursively sort each third, and finally combine the results using a three-way Merge subroutine.

Solution

  • In the recursion tree with the root as the input array, then at a given level ll (l=0l=0 for root level) if size of the subarray is mm,
m3l=n    m=n3l\begin{equation} m \cdot 3^l = n \implies m = \frac{n}{3^l} \end{equation}
  • At level (l1)(l-1), for each three-way merge, total comparisons required, in worst case, is 23n3l2 \cdot 3 \cdot \frac{n}{3^l}. Hence, total no. comparisons at level (l1)(l-1) is:
23n3l3l1=2n\begin{equation} 2 \cdot 3 \cdot \frac{n}{3^l} \cdot 3^{l-1} = 2 \cdot n \end{equation}
  • If total no. of levels is L:
3L=n    L=log2n/log23\begin{equation} 3 ^ L = n \implies L = \log_2{n}/\log_2{3} \end{equation}
  • Hence, if we assume sequential execution, total running time, approximetely, is:
T(n)=L2n=2nlog23log2n=O(nlogn)\begin{equation} T(n) = L \cdot 2 \cdot n = \frac{2n}{\log_2{3}} \cdot \log_2{n} = O(n \log{n}) \end{equation}

Problem 1.3

In this problem, we are given kk sorted arrays, each containing nn elements. We need to merge them into a single sorted array of knk \cdot n elements. For each array of size nn, merge it with previously merged array of size ini \cdot n where ii is the index of current array (0-indexed).

Solution

  • Using the merge procedure from merge sort to merge two sorted arrays of sizes aa and bb has time complexity O(a+b)O(a+b). Hence total time taken to merge kk arrays is:
T(n,k)=n+2n+3n+...+kn=(1+2+3+...+k)n=k(k+1)2n=O(k2n)\begin{equation} \begin{aligned} T(n, k) & = n + 2 \cdot n + 3 \cdot n + ... + k \cdot n \\ & = (1 + 2 + 3 + ... + k) \cdot n \\ & = \frac{k(k+1)}{2} \cdot n = O(k^2 \cdot n) \end{aligned} \end{equation}

Problem 1.4

This is same problem as Problem 1.3 but we need to use different merging procedure. It divides the kk arrays into k/2k / 2 pairs, merges each pair, resulting in k/2k / 2 sorted arrays of length 2n2n, and repeats this process until only one sorted array remains.

Solution

  • Before each merging step, we have kk arrays of size nn. After first merging step, we have k/2k/2 arrays of size 2n2n. After second merging step, we have k/4k/4 arrays of size 4n4n. Continuing this way, after ithi^{th} merging step, we have k/2ik / 2^i arrays of size 2in2^i \cdot n.

  • Hence time taken at each merging step ii is O(kn)O(k \cdot n).

  • Total merging steps required to get a single sorted array is log2k\log_2{k}.

  • Hence total time taken to merge kk arrays is:

T(n,k)=O(knlog2k)\begin{equation} T(n, k) = O(k \cdot n \cdot \log_2{k}) \end{equation}

Problem 1.5

In this problem, we are given an unsorted array of nn distinct numbers, and we need to give an algorithm to find the 2nd2nd largest number in the array using at most n+log2n2n + \log_2{n} - 2 comparisons. For simplicity, we can assume that nn is a power of 22.

Solution

Idea: Using tournament method. Pair up the elements and compare them. The larger element moves to the next round. Continue this until we find the largest element. The second largest element must be one of the elements that lost to the largest element in each round.

  • Use merge sort to create a tournament tree. This will take n1n - 1 comparisons to find the largest element.
  • To find the second largest element, we need to look at the elements that lost to the largest element. The largest element will have competed in log2n\log_2{n} rounds, so there are log2n\log_2{n} candidates for the second largest element.
  • We can find the largest among these candidates with log2n1\log_2{n} - 1 comparisons.
  • Hence, total comparisons made is:
T(n)=(n1)+(log2n1)=n+log2n2\begin{equation} T(n) = (n - 1) + (\log_2{n} - 1) = n + \log_2{n} - 2 \end{equation}

Python Code Implementation:

def find_second_largest(arr):
    comparisons = {}

    def tournament(arr):
        if len(arr) == 1:
            return arr[0]
        mid = len(arr) // 2
        left_winner = tournament(arr[:mid])
        right_winner = tournament(arr[mid:])
        if left_winner > right_winner:
            comparisons[left_winner] = comparisons.get(left_winner, []) + [right_winner]
            if right_winner in comparisons:
                del comparisons[right_winner]
            return left_winner
        # distinct elements assumption
        else:
            comparisons[right_winner] = comparisons.get(right_winner, []) + [left_winner]
            if left_winner in comparisons:
                del comparisons[left_winner]
            return right_winner

    largest = tournament(arr)
    candidates = comparisons[largest]
    second_largest = max(candidates)

    return second_largest

Proof of correctness

  • In tournament recursion tree at the last recursion level, we have pairs of elements compared. The larger element from each pair moves up the tree. These moved up elements are the only candidates for the largest element. Is this is not the case then one of the pairs returned the smaller element which contradicts the if-else case. Using induction, we can say that at each level of recursion tree, the larger elements move up and smaller elements are discarded. Hence, at the root of the recursion tree, we have the largest element.
  • Proof that second largest element is among the elements that lost to the largest element: At each recursion level,
    • Case 1: 2nd largest element is paired with largest element. In this case, it loses to the largest element and is among the candidates.
    • Case 2: 2nd largest element is paired with some other element. In this case, it wins and moves up the tree. At some later level, or ultimately after the two outermost recursions, it will be paired with the largest element and lose to it. Hence, it will be among the candidates.
  • At each level of recursion, largest element competes with one element and loses to none. Hence, total no. of candidates for 2nd largest element is log2n\log_2{n}. Finding the largest among these candidates will give us the 2nd largest element in the original array.