problem statement.

Typically, we assume that basic operations on natural numbers (e.g., adding or multiplying two natural numbers together) are performed in constant time. In practice, this assumption is correct whenever we restrict ourselves to natural numbers with some maximum size (e.g., 64 bit natural numbers, for which basic operations are supported directly by modern processors). Applications such as cryptography often work with huge natural numbers, however (e.g., 4048 bit values, which can hold a maximum of 3.7101218\approx 3.7 \cdot 10^{1218}). Hence, for these applications we can no longer assume that operations on natural numbers are in constant time: these applications require the development of efficient algorithms even for basic operations on natural numbers.

Consider two nn-digit natural numbers A=a1anA = a_{1} \dots a_{n} and B=b1bnB = b_{1} \dots b_{n} written in base 10: the digits a1ana_{1} \dots a_{n} and b1bnb_{1} \dots b_{n} each have a value in 090 \dots 9. For example, if n=4n=4, then we could have A=3456,B=9870A=3456, B=9870, in which case a1=3,a2=4,a3=5,a6=6,b1=9,b2=8,b3=7,b4=0a_{1}=3, a_{2}=4, a_{3}=5, a_{6}=6, b_{1}=9, b_{2}=8, b_{3}=7, b_{4}=0.

1.1

Write an algorithm ADD(A, B) that computes A+BA + B in Θ(n)\Theta(n). Explain why your algorithm is correct and the runtime complexity is Θ(n)\Theta(n).

Assumption: one converts AA and BB into two arrays of nn integers, A=[a1an]A = \lbrack a_{1} \dots a_{n} \rbrack and B=[b1bn]B = \lbrack b_{1} \dots b_{n} \rbrack.

"\\begin{algorithm}\n\\caption{ADD(A, B)}\n\\begin{algorithmic}\n\\INPUT $A \\coloneqq \\lbrack a_{1} \\dots a_{n} \\rbrack$\n\\INPUT $B \\coloneqq \\lbrack b_{1} \\dots b_{n} \\rbrack$\n\\STATE $C \\gets \\lbrack \\space \\rbrack \\text{ where } |C| = n + 1$\n\\STATE $carry \\gets 0$\n\\STATE $i \\gets n-1$\n\\WHILE{$i \\geq 0$}\n \\STATE $C[i+1] \\gets (a_{i} + b_{i} + carry) \\mod 10$\n \\STATE $carry \\gets \\lfloor (a_{i} + b_{i} + carry) / 10 \\rfloor$\n \\STATE $i \\gets i - 1$\n\\ENDWHILE\n\\STATE $C[0] \\gets carry$\n\\IF{$C[0] == 0$}\n \\STATE $C \\gets C[1 \\dots n]$\n\\ENDIF\n\\OUTPUT $C$\n\\end{algorithmic}\n\\end{algorithm}"

Algorithm 20 ADD(A, B)

Input: A[a1an]A \coloneqq \lbrack a_{1} \dots a_{n} \rbrack

Input: B[b1bn]B \coloneqq \lbrack b_{1} \dots b_{n} \rbrack

1:C[ ] where C=n+1C \gets \lbrack \space \rbrack \text{ where } |C| = n + 1

2:carry0carry \gets 0

3:in1i \gets n-1

4:while i0i \geq 0 do

5:C[i+1](ai+bi+carry)mod10C[i+1] \gets (a_{i} + b_{i} + carry) \mod 10

6:carry(ai+bi+carry)/10carry \gets \lfloor (a_{i} + b_{i} + carry) / 10 \rfloor

7:ii1i \gets i - 1

8:end while

9:C[0]carryC[0] \gets carry

10:if C[0]==0C[0] == 0 then

11:CC[1n]C \gets C[1 \dots n]

12:end if

Output: CC

Runtime complexity: Θ(n)\Theta(n)

  • L1 takes Θ(n)\Theta(n) time to initialise.
  • while loop iterates nn times, each iteration perform constant time operations (additions, modulo, division) in Θ(1)\Theta(1) time.
  • Finally, the adjustment of the output array CC takes Θ(1)\Theta(1) time.

Thus, total runtime complexity is Θ(n)\Theta(n).

Correctness:

Invariants:

0in1, i+2jncn1=0 c=k=i+1n1(ak+bk+ck)10nk1mod10 C[i+1]=(ai+bi+c)mod10 C[j]=((aj1+bj1+cj1)mod10)\begin{align} 0 \leq i \leq n-1, & \space i+2 \leq j \leq n \land c_{n-1} = 0 \\\ \quad c &= \lfloor \frac{\sum_{k=i+1}^{n-1}(a_k + b_k + c_k)}{10^{n-k-1}} \rfloor \mod 10 \\\ \quad C[i+1] &= (a_i + b_i + c) \mod 10 \\\ \quad C[j] &= ((a_{j-1} + b_{j-1} + c_{j-1}) \mod 10) \end{align}

where cc defines as the carry value resulting from the addition.

bound function f(i)=Aif(i) = |A| - i starts at A,A0|A|, |A| \geq 0

Proof

Base case: i=n1i = n-1 (L2,3)

Invariant for carry holds, as ci=cn1=0c_{i} = c_{n-1} = 0

Now we will prove these invariants still hold til reach the end of m-th loop:

Assuming the invariants hold at the start of m-th loop, or:

0mn1 cm=k=mn1(ak+bk+ck)10nk1mod10 C[m+1]=(am+bm+cm)mod10 C[j]=((aj1+bj1+cj1)mod10)\begin{align*} 0 \leq &m \leq n-1 \\\ c_m &= \lfloor \frac{\sum_{k=m}^{n-1}(a_k + b_k + c_k)}{10^{n-k-1}} \rfloor \mod 10 \\\ \quad C[m+1] &= (a_m + b_m + c_m) \mod 10 \\\ \quad C[j] &= ((a_{j-1} + b_{j-1} + c_{j-1}) \mod 10) \end{align*}

L4-7: The while loop.

  • Carry forward invariants holds cm1=cnew=(am+bm+cm)10mod10c_{m-1} = c_{\text{new}} = \lfloor \frac{(a_m + b_m + c_m)}{10} \rfloor \mod 10
  • C[m+1]=(am+bm+cm)mod10C[m+1] = (a_m + b_m + c_m) \mod 10, or C[m+1]C[m+1] holds correct digits after addition of am,bma_m, b_m and carry cmc_m
  • f(i)f(i) strictly decreases after each iteration, inew:=i+1i_{\text{new}} := i + 1

Therefore the invariants holds.

1.2

What is the runtime complexity of this algorithm in terms of the number of digits in A and B?

Runtime complexity is Θ(n2)\Theta(n^2), where nn is the number of digits in AA and BB.

For each digits of BB, it multiply every digits of AA, which results in n2n^2 operations.

Each addition operation takes at most 2n2n digit additions, and we perform nn of these additions, therefore resulting in O(n2)O(n^2) time.

Overall, pen-and-paper addition of two nn-digit numbers takes Θ(n2)\Theta(n^2) time.

1.3

Let CC be an nn-digit number with n=2mn=2m. Hence, C=Chigh10m+ClowC = C_{\text{high}} \cdot 10^m + C_{\text{low}} where ChighC_{\text{high}} the first mm digits of C and ClowC_{\text{low}} is the remaining mm digits of C. For example, if n=4,A=3456,B=9870n=4, A=3456, B=9870, then m=2m=2 and

A=Ahigh10m+Alow,Ahigh=34,Alow=56 B=Bhigh10m+Blow,Bhigh=98,Blow=70\begin{aligned} &A=A_{\text{high}} \cdot 10^m + A_{\text{low}}, &A_{\text{high}} = 34,\quad &A_{\text{low}} = 56 \\\ &B=B_{\text{high}} \cdot 10^m + B_{\text{low}}, &B_{\text{high}} = 98,\quad &B_{\text{low}} = 70 \end{aligned}

Using the breakdown of a number into their high and low part, one notices the following

A×B=(Ahigh10m+Alow)(Bhigh10m+Blow) =Ahigh×Bhigh102m+(Ahigh×Blow+Alow×Bhigh)10m+Alow×Blow\begin{aligned} A \times B &= (A_{\text{high}} \cdot 10^m + A_{\text{low}}) \cdot (B_{\text{high}} \cdot 10^m + B_{\text{low}}) \\\ & = A_{\text{high}} \times B_{\text{high}} \cdot 10^{2m} + (A_{\text{high}} \times B_{\text{low}} + A_{\text{low}} \times B_{\text{high}}) \cdot 10^m + A_{\text{low}} \times B_{\text{low}} \end{aligned}

Here is the following recursive algorithm BREAKSDOWNMULTIPLY(A, B) that computes A×BA \times B:

"\\begin{algorithm}\n\\caption{BREAKSDOWNMULTIPLY(A, B)}\n\\begin{algorithmic}\n\\INPUT $A \\text{ and } B \\text{ have } n=2m \\text{ digits}$\n\\IF{$n = 1$}\n \\RETURN $a_{1} \\times b_{1}$\n\\ELSE\n \\STATE $hh \\coloneqq \\text{BREAKSDOWNMULTIPLY}(A_{\\text{high}}, B_{\\text{high}})$\n \\STATE $hl \\coloneqq \\text{BREAKSDOWNMULTIPLY}(A_{\\text{high}}, B_{\\text{low}})$\n \\STATE $lh \\coloneqq \\text{BREAKSDOWNMULTIPLY}(A_{\\text{low}}, B_{\\text{high}})$\n \\STATE $ll \\coloneqq \\text{BREAKSDOWNMULTIPLY}(A_{\\text{low}}, B_{\\text{low}})$\n \\RETURN $hh \\cdot 10^{2m} + (hl + lh) \\cdot 10^m + ll$\n\\ENDIF\n\\RETURN $A \\times B$\n\\end{algorithmic}\n\\end{algorithm}"

Algorithm 21 BREAKSDOWNMULTIPLY(A, B)

Input: A and B have n=2m digitsA \text{ and } B \text{ have } n=2m \text{ digits}

1:if n=1n = 1 then

2:return a1×b1a_{1} \times b_{1}

3:else

4:hhBREAKSDOWNMULTIPLY(Ahigh,Bhigh)hh \coloneqq \text{BREAKSDOWNMULTIPLY}(A_{\text{high}}, B_{\text{high}})

5:hlBREAKSDOWNMULTIPLY(Ahigh,Blow)hl \coloneqq \text{BREAKSDOWNMULTIPLY}(A_{\text{high}}, B_{\text{low}})

6:lhBREAKSDOWNMULTIPLY(Alow,Bhigh)lh \coloneqq \text{BREAKSDOWNMULTIPLY}(A_{\text{low}}, B_{\text{high}})

7:llBREAKSDOWNMULTIPLY(Alow,Blow)ll \coloneqq \text{BREAKSDOWNMULTIPLY}(A_{\text{low}}, B_{\text{low}})

8:return hh102m+(hl+lh)10m+llhh \cdot 10^{2m} + (hl + lh) \cdot 10^m + ll

9:end if

10:return A×BA \times B

Prove that algorithm BREAKSDOWNMULTIPLY(A, B) is correct.

The proposed BREAKSDOWNMULTIPLY(A, B) is a variant of Karatsuba’s algorithm.

Base case: m=1    n=2m=1 \implies n=2, which implies A×BA \times B are correct (multiplication of two two-digits number).

Through recursions, at any level k,k=log2n,nk=2kmk, k = \log_2 n, n_k = 2^k \cdot m, one would observe:

  • Ak=Ahighk10mk+AlowkA_k = A_{\text{high}_k} \cdot 10^{m_k} + A_{\text{low}_k}
  • Bk=Bhighk10mk+BlowkB_k = B_{\text{high}_k} \cdot 10^{m_k} + B_{\text{low}_k}

The recursive call hhk,hlk,lhk,llkhh_k, hl_k, lh_k, ll_k correctly computes the product of Ak×BkA_k \times B_k til the base case.

The combination of the products is proven through previous math steps, therefore, the algorithm is correct.

1.4

Give a recurrence T(n)T(n) for the runtime complexity of BREAKSDOWNMULTIPLY(A, B) Explain each term in the recurrence.

Draw a recurrence tree for T(n)T(n) and use this recurrence tree to solve the recurrence T(n)T(n) by proving that T(n)=Θ(f(n))T(n) = \Theta (f(n)) for some function f(n)f(n)

What is the runtime complexity of BREAKSDOWNMULTIPLY(A, B)? Do you expect this algorithm to be faster than the pen-and-paper multiplication algorithm? Hint: Feel free to assume that n=2k,kNn = 2^k, k \in \mathbb{N}. Feel free to assume that we can add two vv-digit number in Θ(v)\Theta(v) (e.g., using ADD) and that we can multiply a vv-digit number with 10w10^w in Θ(v+w)\Theta (v+w).

For two nn digits number AA and BB, the recurrent T(n)T(n) is:

T(n)={Θ(1)if n=1 4T(n/2)+Θ(n)if n>1T(n) = \begin{cases} \Theta(1) & \text{if } n = 1 \\\ 4T(n/2) + \Theta(n) & \text{if } n > 1 \end{cases}
  • The base case when n=1n=1 is Θ(1)\Theta(1), as it only performs a single digit multiplication, without no recursive calls.
  • The recursive case when n>1n>1 performs 4 recursive calls, each with n/2n/2 digits, each on number half the size of original input (since n=2mn=2m), hence 4T(n/2)4T(n/2).
  • Θ(n)\Theta(n) is the linear time complexity adding the products of the recursive calls, per our assumption that we can multiply a vv-digit number with 10w10^w in Θ(v+w)\Theta(v+w).

The recurrence tree for T(n)T(n) is:

T(n)
├── T(n/2)
   ├── T(n/4)
   ├── T(n/8)
   ├── ...
   ...
   ├── ...
   └── ...
  ...
   └── T(n/8)
   ├── T(n/4)
   ├── ...
   ├── ...
   ├── ...
   └── ...
   ├── T(n/4)
   ├── ...
   ├── ...
   ├── ...
   └── ...
   └── T(n/4)
       ├── ...
       ├── ...
       ├── ...
       └── ...
├── T(n/2)
   ├── ...
   ├── ...
   ├── ...
   └── ...
├── T(n/2)
   ├── ...
   ├── ...
   ├── ...
   └── ...
└── T(n/2)
    ├── ...
    ├── ...
    ├── ...
    └── ...
  • The total number of nodes at depth kk is 4k4^k, since each level of recursion calls the function four times.
  • Work done at level kk is 4kn/2k=2kn4^k \cdot n/2^k = 2^k \cdot n, since work done per depth is nn times the number of nodes add that depth.
  • Depth of the tree is log2n\log_2 n, since the input size is halved at each level.

Therefore, one can solve for T(n)T(n):

T(n)=k=0log2(n)2kn =nk=0log2(n)2k =n2log2(n)+1121 =n(2n1) =2n2n =Θ(n2)\begin{aligned} T(n) &= \sum_{k=0}^{\log_2(n)} 2^k \cdot n \\\ &= n \cdot \sum_{k=0}^{\log_2(n)} 2^k \\\ &= n \cdot \frac{2^{\log_2(n) + 1} - 1}{2 - 1} \\\ &= n \cdot (2n - 1) \\\ &= 2n^2 - n \\\ &= \Theta(n^2) \end{aligned}

Thus the runtime complexity of BREAKSDOWNMULTIPLY(A, B) is quadratic, Θ(n2)\Theta(n^2).

From here, the algorithm is the same as the pen-and-paper multiplication algorithm, which also takes Θ(n2)\Theta(n^2) time.

1.5

One can observe

(Ahigh+Alow)×(Bhigh+Blow)=Ahigh×Bhigh+Ahigh×Blow+Alow×Bhigh+Alow×Blow(A_{\text{high}} + A_{\text{low}}) \times (B_{\text{high}} + B_{\text{low}}) = A_{\text{high}} \times B_{\text{high}} + A_{\text{high}} \times B_{\text{low}} + A_{\text{low}} \times B_{\text{high}} + A_{\text{low}} \times B_{\text{low}}

Hence by rearranging terms, one can conclude that

Ahigh×Blow+Alow×Bhigh=(Ahigh+Alow)×(Bhigh+Blow)Ahigh×BhighAlow×BlowA_{\text{high}} \times B_{\text{low}} + A_{\text{low}} \times B_{\text{high}} = (A_{\text{high}} + A_{\text{low}}) \times (B_{\text{high}} + B_{\text{low}}) - A_{\text{high}} \times B_{\text{high}} - A_{\text{low}} \times B_{\text{low}}

Based on conclusion above, A×BA \times B can be seen as:

A×B=(Ahigh10m+Alow)×(Bhigh10m+Blow)=Ahigh×Bhigh102m+Ahigh×Blow10m+Alow×Bhigh10m+Alow×Blow=Ahigh×Bhigh102m+(Ahigh×Blow+Alow×Bhigh)10m+Alow×Blow=Ahigh×Bhigh102m+(((Ahigh+Alow)×(Bhigh+Blow))(Ahigh×Bhigh)(Alow×Blow))10m+Alow×Blow.\begin{aligned} A \times B &= (A_{\text{high}} \cdot 10^m + A_{\text{low}}) \times (B_{\text{high}} \cdot 10^m + B_{\text{low}}) \\ &= A_{\text{high}} \times B_{\text{high}} \cdot 10^{2m} + A_{\text{high}} \times B_{\text{low}} \cdot 10^m + A_{\text{low}} \times B_{\text{high}} \cdot 10^m + A_{\text{low}} \times B_{\text{low}} \\ &= A_{\text{high}} \times B_{\text{high}} \cdot 10^{2m} + (A_{\text{high}} \times B_{\text{low}} + A_{\text{low}} \times B_{\text{high}}) \cdot 10^m + A_{\text{low}} \times B_{\text{low}} \\ &= A_{\text{high}} \times B_{\text{high}} \cdot 10^{2m} + \left(\left((A_{\text{high}} + A_{\text{low}}) \times (B_{\text{high}} + B_{\text{low}})\right) - \left(A_{\text{high}} \times B_{\text{high}}\right) - \left(A_{\text{low}} \times B_{\text{low}}\right)\right) \cdot 10^m + A_{\text{low}} \times B_{\text{low}}. \end{aligned}

The final rewritten form of A×BA \times B only requires three multiplication terms, namely Ahigh×Bhigh,Alow×Blow,(Ahigh+Alow)×(Bhigh+Blow)A_{\text{high}} \times B_{\text{high}}, A_{\text{low}} \times B_{\text{low}}, (A_{\text{high}} + A_{\text{low}}) \times (B_{\text{high}} + B_{\text{low}})

Use the observation to construct a recursive multiplication SMARTMATHSMULTIPLY(A, B) that only perform three recursive multiplications. Argue why SMARTMATHSMULTIPLY(A, B) is correct.

"\\begin{algorithm}\n\\caption{SMARTMATHSMULTIPLY(A, B)}\n\\begin{algorithmic}\n\\INPUT $A \\text{ and } B \\text{ have } n=2m \\text{ digits}$\n\\IF{$n = 1$}\n \\RETURN $a_{1} \\times b_{1}$\n\\ELSE\n \\STATE $hh \\coloneqq \\text{SMARTMATHSMULTIPLY}(A_{\\text{high}}, B_{\\text{high}})$\n \\STATE $ll \\coloneqq \\text{SMARTMATHSMULTIPLY}(A_{\\text{low}}, B_{\\text{low}})$\n \\STATE $mid \\coloneqq \\text{SMARTMATHSMULTIPLY}(A_{\\text{high}} + A_{\\text{low}}, B_{\\text{high}} + B_{\\text{low}})$\n \\RETURN $hh \\cdot 10^{2m} + (mid - hh - ll) \\cdot 10^m + ll$\n\\ENDIF\n\\RETURN $A \\times B$\n\\end{algorithmic}\n\\end{algorithm}"

Algorithm 22 SMARTMATHSMULTIPLY(A, B)

Input: A and B have n=2m digitsA \text{ and } B \text{ have } n=2m \text{ digits}

1:if n=1n = 1 then

2:return a1×b1a_{1} \times b_{1}

3:else

4:hhSMARTMATHSMULTIPLY(Ahigh,Bhigh)hh \coloneqq \text{SMARTMATHSMULTIPLY}(A_{\text{high}}, B_{\text{high}})

5:llSMARTMATHSMULTIPLY(Alow,Blow)ll \coloneqq \text{SMARTMATHSMULTIPLY}(A_{\text{low}}, B_{\text{low}})

6:midSMARTMATHSMULTIPLY(Ahigh+Alow,Bhigh+Blow)mid \coloneqq \text{SMARTMATHSMULTIPLY}(A_{\text{high}} + A_{\text{low}}, B_{\text{high}} + B_{\text{low}})

7:return hh102m+(midhhll)10m+llhh \cdot 10^{2m} + (mid - hh - ll) \cdot 10^m + ll

8:end if

9:return A×BA \times B

The proposed SMARTMATHSMULTIPLY(A, B) is the basis of Karatsuba’s algorithm.

Base case: n=1n=1, which implies A×BA \times B are correct (multiplication of two single digit number).

Assume that SMARTMATHSMULTIPLY(A, B) correctly computes the product of A×BA \times B for A,BA, B with lest than nn digits.

The following invariants hold per recursive call:

  • A=Ahigh10m+AlowB=Bhigh10m+BlowA = A_{\text{high}} \cdot 10^m + A_{\text{low}} \land B = B_{\text{high} \cdot 10^m + B_{\text{low}}} where m=n2m = \frac{n}{2} (true from problem statement and n=2kn=2^k)
  • recursive call computes P1,P2,P3P_{1}, P_{2}, P_{3} correctly, where P1=Ahigh×Bhigh,P2=Alow×Blow,P3=(Ahigh+Alow)×(Bhigh+Blow)P_{1} = A_{\text{high}} \times B_{\text{high}}, P_{2} = A_{\text{low}} \times B_{\text{low}}, P_{3} = (A_{\text{high}} + A_{\text{low}}) \times (B_{\text{high}} + B_{\text{low}}) for numbers fewer than nn digits (from induction hypothesis)
  • combination invariants: P4=P3P2P1A×B=P1102m+P410m+P2P_{4} = P_{3}-P_{2}-P_{1} \land A \times B = P_{1} \cdot 10^{2m} + P_{4} \cdot 10^m + P_{2} (true from previous statement)

Thus, the algorithm is correct.

1.6

Give a recurrence T(n)T(n) for the runtime complexity of SMARTMATHSMULTIPLY(A, B) Explain each term in the recurrence.

Solve the recurrence T(n)T(n) by proving that T(n)=Θ(f(n))T(n) = \Theta (f(n)) for some function f(n)f(n). Use any methods that you find comfortable with.

What is the runtime complexity of SMARTMATHSMULTIPLY(A, B)? Do you expect this algorithm to be faster than the pen-and-paper multiplication algorithm? Hint: Feel free to assume that n=2k,kNn = 2^k, k \in \mathbb{N}. Feel free to assume that we can add two vv-digit number in Θ(v)\Theta(v) (e.g., using ADD) and that we can multiply a vv-digit number with 10w10^w in Θ(v+w)\Theta (v+w).

For two nn digits number AA and BB, the recurrent T(n)T(n) is:

T(n)={Θ(1)if n=1 3T(n/2)+Θ(n)if n>1T(n) = \begin{cases} \Theta(1) & \text{if } n = 1 \\\ 3T(n/2) + \Theta(n) & \text{if } n > 1 \end{cases}
  • The base case when n=1n=1 is Θ(1)\Theta(1), as it only performs a single digit multiplication, without no recursive calls.
  • The recursive case when n>1n>1 performs 3 recursive calls, each with n/2n/2 digits, each on number half the size of original input (since n=2mn=2m), hence 3T(n/2)3T(n/2).
  • Θ(n)\Theta(n) is the linear time complexity adding the products of the recursive calls, per our assumption that we can multiply a vv-digit number with 10w10^w in Θ(v+w)\Theta(v+w).

Using Master Theorem, we can solve for T(n)T(n), with a=3,b=2,f(n)=Θ(n)=nlog23a=3, b=2, f(n)=\Theta (n) = n^{\log_2 3}.

The master theorem states that if f(N)=Θ(Nlogbalogk(N))f(N) = \Theta (N^{\log_b a} \log^{k}(N)), with k>0k>0, then T(N)=Θ(Nlogbalogk+1N)T(N) = \Theta (N^{\log_b a} \log^{k+1} N).

Thus T(n)=Θ(nlog23log(n))=Θ(nlog23)T(n) = \Theta(n^{\log_2 3} \log(n)) = \Theta (n^{\log_2 3})

Runtime complexity of SMARTMATHSMULTIPLY(A, B) > Θ(nlog23)Θ(n1.585)\Theta(n^{\log_2 3}) \approx \Theta (n^1.585)

This algorithm is expected to be faster than the pen-and-paper multiplication algorithm, which also takes Θ(n2)\Theta(n^2) time.