Problem 1
P1.1
Consider an initially-empty stack S and the following sequence of operations: PUSH(S, 3), POP(S), PUSH(S, 17), PUSH(S, 5), PUSH(S, 15), POP(S), POP(S), POP(S).

Illustrate the result of each operation (clearly indicate the content of the stack after the operation and, in case of a POP, the value returned by the operation).
Solution
The stack S is initially empty: S = [].

The sequence of operations is as follows (top of the stack on the right):

- PUSH(S, 3): S = [3]
- POP(S): S = [], returned value is 3
- PUSH(S, 17): S = [17]
- PUSH(S, 5): S = [17, 5]
- PUSH(S, 15): S = [17, 5, 15]
- POP(S): S = [17, 5], returned value is 15
- POP(S): S = [17], returned value is 5
- POP(S): S = [], returned value is 17
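The trace above can be replayed in code; the following is a minimal sketch using std::stack (the function name replayStackTrace is illustrative, not part of the assignment):

```cpp
#include <stack>
#include <vector>

// Replays the operation sequence from P1.1 and records each POP's
// returned value; std::stack is LIFO, so top() is the most recent push.
std::vector<int> replayStackTrace() {
    std::stack<int> S;                 // initially empty: []
    std::vector<int> popped;
    auto pop = [&]() { popped.push_back(S.top()); S.pop(); };

    S.push(3);   // PUSH(S, 3)  -> [3]
    pop();       // POP(S)      -> [],          returns 3
    S.push(17);  // PUSH(S, 17) -> [17]
    S.push(5);   // PUSH(S, 5)  -> [17, 5]
    S.push(15);  // PUSH(S, 15) -> [17, 5, 15]
    pop();       // POP(S)      -> [17, 5],     returns 15
    pop();       // POP(S)      -> [17],        returns 5
    pop();       // POP(S)      -> [],          returns 17
    return popped;
}
```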
P1.2
Consider an initially-empty queue Q and the following sequence of operations: ENQUEUE(Q, 3), DEQUEUE(Q), ENQUEUE(Q, 17), ENQUEUE(Q, 5), ENQUEUE(Q, 15), DEQUEUE(Q), DEQUEUE(Q), DEQUEUE(Q).

Illustrate the result of each operation (clearly indicate the content of the queue after the operation and, in case of a DEQUEUE, the value returned by the operation).
Solution
The queue Q is initially empty: Q = [].

The sequence of operations is as follows (front of the queue on the left):

- ENQUEUE(Q, 3): Q = [3]
- DEQUEUE(Q): Q = [], returned value is 3
- ENQUEUE(Q, 17): Q = [17]
- ENQUEUE(Q, 5): Q = [17, 5]
- ENQUEUE(Q, 15): Q = [17, 5, 15]
- DEQUEUE(Q): Q = [5, 15], returned value is 17
- DEQUEUE(Q): Q = [15], returned value is 5
- DEQUEUE(Q): Q = [], returned value is 15
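As with the stack, the trace can be replayed in code; a minimal sketch using std::queue (the function name replayQueueTrace is illustrative):

```cpp
#include <queue>
#include <vector>

// Replays the operation sequence from P1.2 and records each DEQUEUE's
// returned value; std::queue is FIFO, so front() is the oldest element.
std::vector<int> replayQueueTrace() {
    std::queue<int> Q;                 // initially empty: []
    std::vector<int> dequeued;
    auto dequeue = [&]() { dequeued.push_back(Q.front()); Q.pop(); };

    Q.push(3);   // ENQUEUE(Q, 3)  -> [3]
    dequeue();   // DEQUEUE(Q)     -> [],         returns 3
    Q.push(17);  // ENQUEUE(Q, 17) -> [17]
    Q.push(5);   // ENQUEUE(Q, 5)  -> [17, 5]
    Q.push(15);  // ENQUEUE(Q, 15) -> [17, 5, 15]
    dequeue();   // DEQUEUE(Q)     -> [5, 15],    returns 17
    dequeue();   // DEQUEUE(Q)     -> [15],       returns 5
    dequeue();   // DEQUEUE(Q)     -> [],         returns 15
    return dequeued;
}
```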
P1.3
Assume we have a stack implementation MyDynArrayStack using dynamic arrays: the implementation supports $N$ PUSH operations with an amortized runtime complexity of $O(1)$ per operation, and the implementation supports POP, EMPTY, and SIZE operations with a worst-case runtime complexity of $O(1)$.

Provide a queue implementation that uses MyDynArrayStack and supports any valid sequence of $N$ ENQUEUE and DEQUEUE operations with an amortized runtime complexity of $O(1)$ per operation. Explain why your implementation has the stated amortized runtime complexity for ENQUEUE and DEQUEUE operations.
Solution
Queue implementation MyDynArrayQueue uses two MyDynArrayStack instances, sIn and sOut, where sIn is used to store newly enqueued elements and sOut is used to store elements in reverse (i.e., dequeue) order.

Amortized runtime complexity explanation:

- ENQUEUE: a PUSH onto sIn has an amortized runtime complexity of $O(1)$ per operation, as stated in the problem.
- DEQUEUE:
  - When sOut is empty, all elements are first transferred from sIn to sOut (repeatedly popping sIn and pushing onto sOut), which has a worst-case runtime complexity of $O(N)$. However, each element is moved from sIn to sOut at most once during its enqueue-dequeue lifetime, so the total time spent on transfers over any sequence of $N$ operations is proportional to $N$, making the amortized runtime complexity $O(1)$ per operation.
  - When sOut is not empty, DEQUEUE simply pops the top of sOut, which takes $O(1)$ time.
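The two-stack construction can be sketched as follows; this is a minimal sketch in which std::vector stands in for MyDynArrayStack, since push_back and pop_back provide the same amortized-$O(1)$ dynamic-array stack operations (the class and method names are illustrative):

```cpp
#include <stdexcept>
#include <vector>

// Sketch of MyDynArrayQueue built from two dynamic-array stacks;
// std::vector's push_back/pop_back stand in for PUSH/POP.
class MyDynArrayQueue {
    std::vector<int> sIn;   // receives newly enqueued elements
    std::vector<int> sOut;  // holds elements in dequeue order (top = front)

public:
    void enqueue(int x) { sIn.push_back(x); }  // amortized O(1)

    int dequeue() {
        if (sOut.empty()) {
            // Transfer: each element moves from sIn to sOut at most once
            // in its lifetime, so transfers cost O(1) amortized overall.
            while (!sIn.empty()) {
                sOut.push_back(sIn.back());
                sIn.pop_back();
            }
        }
        if (sOut.empty()) throw std::runtime_error("dequeue on empty queue");
        int front = sOut.back();
        sOut.pop_back();
        return front;
    }

    bool empty() const { return sIn.empty() && sOut.empty(); }
};
```

Replaying the sequence from P1.2 against this class yields the same trace as std::queue.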
Problem 2
Consider the relations courses(prog, code, name) (that models courses named name and identified by the program prog the course is part of, e.g., SFWRENG, and the course code code, e.g., 2C03) and enrolled(prog, code, sid) (that models students with identifier sid enrolling for a course identified by program prog and course code code).

We want to compute the list of all pairs (sid, name) in which sid is a student identifier and name is the name of a course the student with identifier sid is enrolled in. To compute this list, we developed the two nested-loop algorithms CEJOIN and ECJOIN.

Assume we have significantly more students enrolled for courses than courses (|enrolled| > |courses|).
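The two algorithms are not reproduced here; the following is a sketch of the shape they presumably take, assuming illustrative in-memory row structs (Course, Enrollment) and vectors rather than the actual impl_22.cpp code. CEJOIN scans courses in its outer loop; ECJOIN scans enrolled in its outer loop:

```cpp
#include <string>
#include <utility>
#include <vector>

// Illustrative row types; the original impl_22.cpp may differ.
struct Course { std::string prog, code, name; };
struct Enrollment { std::string prog, code; int sid; };

// CEJOIN: courses in the outer loop, enrolled in the inner loop.
std::vector<std::pair<int, std::string>>
cejoin(const std::vector<Course>& courses,
       const std::vector<Enrollment>& enrolled) {
    std::vector<std::pair<int, std::string>> out;
    for (const auto& c : courses)
        for (const auto& e : enrolled)
            if (c.prog == e.prog && c.code == e.code)
                out.emplace_back(e.sid, c.name);
    return out;
}

// ECJOIN: enrolled in the outer loop, courses in the inner loop.
std::vector<std::pair<int, std::string>>
ecjoin(const std::vector<Course>& courses,
       const std::vector<Enrollment>& enrolled) {
    std::vector<std::pair<int, std::string>> out;
    for (const auto& e : enrolled)
        for (const auto& c : courses)
            if (c.prog == e.prog && c.code == e.code)
                out.emplace_back(e.sid, c.name);
    return out;
}
```

Both produce the same set of (sid, name) pairs; only the loop nesting differs.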
P2.1
Assume we are running algorithms CEJOIN and ECJOIN on a computer $\mathbb{C}$ where every instruction takes exactly the same amount of time to execute. Argue why CEJOIN must be faster than ECJOIN when running on computer $\mathbb{C}$.
Solution
Given |enrolled| > |courses|, the difference between the two algorithms lies in the loop overhead: the relation in the outer loop is scanned only once, while the inner loop is restarted for every tuple of the outer relation. Both algorithms execute the inner-loop body exactly |courses| × |enrolled| times, so on computer $\mathbb{C}$ the difference comes from the per-outer-tuple overhead of managing the loops (initializing the inner loop, testing and advancing the outer loop):

- CEJOIN iterates over courses in its outer loop and over enrolled in its inner loop, so it pays this loop-management overhead |courses| times.
- ECJOIN iterates over enrolled in its outer loop and over courses in its inner loop, so it pays this loop-management overhead |enrolled| times.

Since |enrolled| > |courses|, ECJOIN executes strictly more of these overhead instructions than CEJOIN, while the remaining work is identical. Thus, we can conclude that CEJOIN must be faster than ECJOIN when running on computer $\mathbb{C}$.
P2.2
Implementations of CEJOIN and ECJOIN in impl_22.cpp show that ECJOIN is actually faster than CEJOIN. Explain why this is the case in a real-world system.
Solution
Given that the processor uses fast caches, the inner loop of ECJOIN will be faster than that of CEJOIN due to cache locality.

Since |enrolled| > |courses|, the inner loop of ECJOIN, which repeatedly scans the small courses relation, has better cache locality: it accesses the same memory locations frequently, so after the first scan, courses largely stays in cache. Additionally, ECJOIN's outer loop scans enrolled only once, sequentially. CEJOIN, by contrast, has to scan enrolled in each iteration of its outer loop, which is slower than repeatedly scanning courses: enrolled is too large to stay cached, so every rescan incurs many cache misses.
P2.3
The measurements in the above figure have a few sudden jumps, e.g., at 1 500 000, 2 500 000, 4 500 000, 8 500 000, and 17 000 000. Explain what causes these jumps.
Solution
These jumps are most likely caused by the cache sizes and cache associativity. As the data size increases, the working set no longer fits into the processor's L1, L2, and L3 caches, causing more frequent cache misses. At these points, the dataset has grown so large that the processor has to fall back on slower cache levels and, eventually, main memory, which induces the jumps we observe in the graph.

We could also suspect context-switching overhead if the code were running in multiple processes or threads. However, the implementation implies that this code runs in a single thread, so we can rule out context-switching overhead.
P2.4
Write an algorithm that efficiently computes the same result as CEJOIN and ECJOIN with an all-case runtime complexity of $O((|courses| + |enrolled|) \log |courses|)$. In the design of your algorithm, you may require that either enrolled or courses is ordered. Argue why your algorithm is correct and why it has the runtime complexity we specified.
Solution
Assuming courses is sorted by the (prog, code) pair, the following algorithm can be used: for each tuple of enrolled, binary-search the sorted courses for the matching (prog, code) key and, on a hit, output the pair (sid, name).

In all cases this yields a runtime complexity of $O((|courses| + |enrolled|) \log |courses|)$.
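A sketch of this sorted join in C++ (the row structs and function names are illustrative assumptions, not the original code; std::sort performs the one-time sort, and findCourse is an explicit binary search over the (prog, code) key):

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Illustrative row types; field names are assumptions.
struct Course { std::string prog, code, name; };
struct Enrollment { std::string prog, code; int sid; };

// Binary search over courses sorted by (prog, code); returns the index
// of the matching course, or -1 if no such course exists.
int findCourse(const std::vector<Course>& courses,
               const std::string& prog, const std::string& code) {
    int lo = 0, hi = static_cast<int>(courses.size()) - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        auto ck = std::tie(courses[mid].prog, courses[mid].code);
        auto ek = std::tie(prog, code);
        if (ck == ek) return mid;
        if (ck < ek) lo = mid + 1; else hi = mid - 1;
    }
    return -1;
}

// Sort-based join: sort courses once, then binary-search per enrolled
// tuple. Total: O((|courses| + |enrolled|) log |courses|).
std::vector<std::pair<int, std::string>>
sortedJoin(std::vector<Course> courses,
           const std::vector<Enrollment>& enrolled) {
    // Sort once by (prog, code): O(|courses| log |courses|).
    std::sort(courses.begin(), courses.end(),
              [](const Course& a, const Course& b) {
                  return std::tie(a.prog, a.code) < std::tie(b.prog, b.code);
              });
    std::vector<std::pair<int, std::string>> out;
    for (const auto& e : enrolled) {                   // O(|enrolled|) iterations
        int i = findCourse(courses, e.prog, e.code);   // O(log |courses|) each
        if (i >= 0) out.emplace_back(e.sid, courses[i].name);
    }
    return out;
}
```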
Correctness

- Sorting courses by (prog, code) has a time complexity of $O(|courses| \log |courses|)$ and is executed once (it can be skipped entirely if courses is already ordered).
- The binary search has a time complexity of $O(\log |courses|)$ and is executed once for each tuple of enrolled.
- The for loop iterates over each element of enrolled exactly once, which has a time complexity of $O(|enrolled|)$.

Since (prog, code) identifies a course, every tuple of enrolled matches exactly one course, and binary search over the sorted courses finds exactly that course; hence the algorithm outputs exactly the pairs (sid, name) produced by CEJOIN and ECJOIN.

Thus, the total time complexity is $O(|courses| \log |courses| + |enrolled| \log |courses|) = O((|courses| + |enrolled|) \log |courses|)$.