考虑到《算法设计与分析(双语)》使用英文教学,本笔记采用全英文进行记录。

Considering that in the course “Algorithm Design and Analysis”, array indices start from 1, while Python uses 0-based indexing, there may be minor discrepancies between the code and pseudocode. However, this does not affect the correctness of either approach.

Binary Search

Let A[1..n]A[1..n] be a sequence of nn elements. Consider the problem of determining whether a given element xx is in AA. This problem can be rephrased as follows. Find an index jj, 1jn1\leq j\leq n, such that x=A[j]x=A[j] if xx is in AA, and j=0j=0 otherwise.A straightforward approach is to scan the entries in AA and compare each entry with xx. If after jj comparisons, 1jn1\leq j\leq n, the search is successful, i.e., x=A[j]x=A[j], jj is returned; otherwise, a value of 0 is returned indicating an unsuccessful search.This method is referred to as sequential search. It is also called linear search, as the maximum number of element comparisons grows linearly with the size of the sequence. This is shown as Algorithm LINEARSEARCH.

\begin{algorithm}
    \caption{LINEARSEARCH}
    \begin{algorithmic}
            \Input An array $A[1..n]$ of n elements and an element $x$.
            \Output $j$ if $x = A[j]$, $1 \leq j \leq n$, and $0$ otherwise.
            \State $j \gets 1$
            \While{$j < n$ \And $x \neq A[j]$}
                \State $j \gets j+1$
        \EndWhile
        \If{$x = A[j]$}
        \Return $j$
        \Else \Return 0
        \EndIf
      \end{algorithmic}
    \end{algorithm}
    

To improve search efficiency, we use an effective strategy known as binary search. Algorithm BINARYSEARCH gives a more formal description of this method.

\begin{algorithm}
\caption{BINARYSEARCH}
\begin{algorithmic}
\Input An array $A[1..n]$ of $n$ elements sorted in nondecreasing order and an element $x$.
\Output $j$ if $x = A[j]$, $1 \leq j \leq n$, and $0$ otherwise.
\State $low \gets 1$; $high \gets n$; $j \gets 0$
\While{$low \leq high$ and $j = 0$}
    \State $mid \gets \lfloor (low + high) / 2 \rfloor$
    \If{$x = A[mid]$}
        \State $j \gets mid$
    \ElsIf{$x < A[mid]$}
        \State $high \gets mid - 1$
    \Else
        \State $low \gets mid + 1$
    \EndIf
\EndWhile
\Return $j$
\end{algorithmic}
\end{algorithm}

Below is the implementation of the pseudocode in Python:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
def linear_search(arr, target):
for i in range(len(arr)):
if arr[i] == target:
return i
return 0

def binear_search(arr, target):
left, right = 0, len(arr) - 1
while left <= right:
mid = (left + right) // 2

if arr[mid] == target:
return mid
elif arr[mid] < target:
left = mid + 1
else:
right = mid - 1
return 0

We will assume that each three-way comparison (if-then-else) counts as one comparison.
Theorem 1.1 The number of comparisons performed by Algorithm BINARYSEARCH on a sorted array of size nn is at most logn+1\lfloor \log n \rfloor + 1.

Merging Two Sorted Lists

\begin{algorithm}
\caption{MERGE}
\begin{algorithmic}
\Input An array $A[1..m]$ of elements and three indices $p,q$, and $r$, with $1 \leq p \leq q < r \leq m$, such that both the subarrays $A[p..q]$ and $A[q+1..r]$ are sorted individually in nondecreasing order.
\Output $A[p..r]$ contains the result of merging the two subarrays $A[p..q]$ and $A[q+1..r]$.
\State \textbf{comment:} $B[p..r]$ is an auxiliary array.
\State $s \gets p$; $t \gets q+1$; $k \gets p$
\While{$s \leq q$ and $t \leq r$}
    \If{$A[s] \leq A[t]$}
        \State $B[k] \gets A[s]$
        \State $s \gets s + 1$
    \Else
        \State $B[k] \gets A[t]$
        \State $t \gets t + 1$
    \EndIf
    \State $k \gets k + 1$
\EndWhile
\If{$s = q + 1$}
    \State $B[k..r] \gets A[t..r]$
\Else
    \State $B[k..r] \gets A[s..q]$
\EndIf
\State $A[p..r] \gets B[p..r]$
\end{algorithmic}
\end{algorithm}

Below is the implementation of the pseudocode in Python:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
def merge(arr,p,q,r):
B = [0] * (r - p + 1)
s = p
t = q + 1
k = p
while s <= q and t <= r:
if arr[s] <= arr[t]:
B[k - p] = arr[s]
s += 1
else:
B[k - p] = arr[t]
t += 1
k += 1

if s == q+1:
while t <= r:
B[k - p] = arr[t]
t += 1
k += 1
else:
while s <= q:
B[k - p] = arr[s]
s += 1
k += 1

for i in range(r - p + 1):
arr[p + i] = B[i]
return arr

Sort#1

The function to verify the correctness of the sorting is as follows:

1
2
3
4
5
6
def test(sorting_function:callable):
import numpy as np
for _ in range(30):
arr = np.random.randint(0, 8000, size=1000).tolist()
sorted_arr = sorted(arr)
assert sorting_function(arr) == sorted_arr

Selection Sort

\begin{algorithm}
\caption{SELECTIONSORT}
        \begin{algorithmic}
        \Input  An array $A[1..n]$ of $n$ elements.
        \Output $A[1..n]$ sorted in nondecreasing order.
        \For{$i \gets 1$ \To $n-1$}
            \State $k \gets i$
            \For{$j \gets i+1$ \To $n$}
                \Comment{Find the $i$th smallest element.}
                \If{$A[j] < A[k]$}
                    \State $k \gets j$
                \EndIf
            \EndFor
            \If{$k \neq i$}
                \State \textit{interchange} $A[i]$ and $A[k]$
            \EndIf
        \EndFor
    \end{algorithmic}
\end{algorithm}

Below is the implementation of the pseudocode in Python:

1
2
3
4
5
6
7
8
9
10
def selection_sort(arr):
n = len(arr)
for i in range(n):
k = i
for j in range(i + 1, n):
if arr[j] < arr[k]:
k = j
if k != i:
arr[i], arr[k] = arr[k], arr[i]
return arr

Insertion Sort

\begin{algorithm}
\caption{INSERTIONSORT}
        \begin{algorithmic}
        \Input  An array $A[1..n]$ of $n$ elements.
        \Output $A[1..n]$ sorted in nondecreasing order.
        \For{$i \gets 2$ \To $n$}
                \State $x \gets A[i]$
                \State $j \gets i-1$
                \While{$j > 0$ \And ($A[j] > x$)}
                        \State $A[j+1] \gets A[j]$
                        \State $j \gets j-1$
        \EndWhile
        \State $A[j+1] \gets x$
        \EndFor
    \end{algorithmic}
\end{algorithm}

Below is the implementation of the pseudocode in Python:

1
2
3
4
5
6
7
8
9
10
def insertion_sort(arr):
n = len(arr)
for i in range(1,n):
x = arr[i]
j = i - 1
while j >= 0 and arr[j] > x:
arr[j+1] = arr[j]
j = j - 1
arr[j+1] = x
return arr

Bottom-up Merge Sorting

\begin{algorithm}
\caption{BOTTOMUPSORT}
        \begin{algorithmic}
        \Input  An array $A[1..n]$ of $n$ elements.
        \Output $A[1..n]$ sorted in nondecreasing order.
        \State $t \gets 1$
        \While{$t < n$}
                \State $s \gets t; \quad t \gets 2s; \quad i \gets 0$
                \While{$i+t \leq n$}
                        \State MERGE($A$, $i + 1$, $i + s$, $i + t$)
                        \State $i \gets i+t$
            \EndWhile
            \If{$i+s < n$}
            \State MERGE($A$, $i + 1$, $i + s$, n)
            \EndIf
    \EndWhile
    \end{algorithmic}
\end{algorithm}

Below is the implementation of the pseudocode in Python:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
def bottom_up_sort(arr):
n = len(arr)
t = 1

while t < n:
s = t
t = 2 * s
i = 0
while i + t <= n:
merge(arr, i, i + s - 1, i + t - 1)
i += t
if i + s < n:
merge(arr, i, i + s - 1, n - 1)

return arr

def merge(arr,p,q,r):
B = [0] * (r - p + 1)
s = p
t = q + 1
k = p
while s <= q and t <= r:
if arr[s] <= arr[t]:
B[k - p] = arr[s]
s += 1
else:
B[k - p] = arr[t]
t += 1
k += 1

if s == q+1:
while t <= r:
B[k - p] = arr[t]
t += 1
k += 1
else:
while s <= q:
B[k - p] = arr[s]
s += 1
k += 1

for i in range(r - p + 1):
arr[p + i] = B[i]
return arr

Complexity

O-notation Ω\Omega-notation Θ\Theta-notation

In order to formalize the notions of time complexity, special mathematical notations have been widely used.

  • O-notation: An upper bound on the running time. The running time of an algorithm is O(g(n))O(g(n)), if whenever the input size is equal to or exceeds some threshold n0n_0, its running time can be bounded above by some positive constant cc times g(n)g(n).
  • Ω\Omega-notation: A lower bound on the running time. The running time of an algorithm is Ω(g(n))\Omega(g(n)), if whenever the input size is equal to or exceeds some threshold n0n_0, its running time can be bounded below by some positive constant cc times g(n)g(n).
  • Θ\Theta-notation: An exact bound. The running time of an algorithm is of order Θ(g(n))\Theta(g(n)) if whenever the input size is equal to or exceeds some threshold n0n_0, its running time can be bounded below by c1g(n)c_1g(n) and above by c2g(n)c_2g(n), where 0<c1c20<c_1\leq c_2.

(1) f(n)f(n) is Ω(g(n))\Omega(g(n)) if and only if g(n)g(n) is O(f(n))O(f(n))
(2) f(n)=Θ(g(n))f(n) = \Theta(g(n)) if and only if f(n)=O(g(n))f(n) = O(g(n)) and f(n)=Ω(g(n))f(n) = \Omega(g(n))

Time Complexity

When analyzing the running time,

  • We usually compare its behavior with another algorithm that solves the same problem.
  • It is desirable for an algorithm to be not only machine independent, but also capable of being expressed in any language.
  • It should be technology independent, that is, we want our measure of the running time of an algorithm to survive technological advances.
  • Our main concern is not in small input sizes; we are mostly concerned with the behavior of the algorithm on large input instances.
    Therefore, usually only Elementary Operations are used to evaluate the time complexity.
    Elementary Operation: Any computational step whose cost is always upperbounded by a constant amount of time regardless of the input data or the algorithm used.

Space Complexity

We define the space used by an algorithm to be the number of memory cells (or words) needed to carry out the computational steps required to solve an instance of the problem excluding the space allocated to hold the input. That is, it is only the work space required by the algorithm.
The work space cannot exceed the running time of an algorithm, as writing into each memory cell requires at least a constant amount of time.
Let T(n)T(n) and S(n)S(n) denote, respectively, the time and space complexities of an algorithm, then S(n)=O(T(n))S(n) = O(T(n)).

Worst Case and Average Case Analysis

Worst case analysis: Select the maximum cost among all possible inputs.
Average case analysis: It is necessary to know the probabilities of all input occurrences, i.e., it requires prior knowledge of the input distribution.

Amortized Analysis

Consider an algorithm in which an operation is executed repeatedly with the property that its running time fluctuates throughout the execution of the algorithm. If this operation takes a large amount of time occasionally and runs much faster most of the time, then this is an indication that amortized analysis should be employed, assuming that an exact bound is too hard, if not impossible.
In amortized analysis, we average out the time taken by the operation throughout the execution of the algorithm and refer to this average as the amortized running time of that operation. No assumptions about the probability distribution of the input are needed.