Problème 1.

A regional government wants to improve their existing infrastructure between a collection of towns TT. In specific, the government want to build a minimum number of roads such that there is a route from each town to each other town. The government has been advised by a dubious consultant that in the resulting road network, the number of users of a given road is independent of the presence of alternative routes. The regional government wants to minimise the number of roads it has to built to ensure that one can travel from one town to the other. Furthermore, the government wants to maximize the benefits of the road network by maximizing the number of users of the roads built. Hence, the government wants to only build roads that are expected to be used often. To help the construction plans, the government has asked the dubious consultant to estimate, for each pair of cities, the number of road users that would use the road between these two cities (if that road was built). Now the regional government is looking for a construction plan for a minimum number of roads connecting all towns that see the highest total usage among them.

P1.1

Model the above problem as a graph problem: What are the nodes and edges in your graph, do the edges have weights, and what problems are you trying got answer on your graph?

  • Nodes: Each town in the set of town TT is represented as a node in the graph. Denote the set of nodes as VV
  • Edges: Each potential road can be built between two pair of town. Denote it as EE. Each edge weights represents the estimated number of users who would use the road should it is constructed. Denote the weight of an edge between nodes ii and jj as w(i,j)w(i, j).

The problem can then be modelled as given a weighted undirected graph G=(V,E)\mathcal{G} = (V, E) representing towns and potential roads, find a maximum weight minimum spanning tree of GG.

Explanation:

  • We need to ensure there is a route from each town to every other town while minimising the number of roads being built, thus the minimum spanning tree of the graph.
  • Among all possible MSTs, we want to find the one with the maximum total edge weight, thus maximum weight MSTs.

P1.2

Provide an algorithm ConstructionPlan\text{ConstructionPlan} to find the minimum number of roads to build. Explain why your algorithm is correct.

Let w(x,y)w(x, y) be the weight of the edge between nodes xx and yy.

"\\begin{algorithm}\n\\caption{$\\text{ConstructionPlan}(G, T)$}\n\\begin{algorithmic}\n\\REQUIRE Graph $G = (T, E)$ with nodes $T$ representing towns and weighted edges $E$ representing potential roads with weights as estimated road usage\n\\ENSURE Set of edges $E'$ representing roads to build for maximum weight minimum spanning tree\n\\STATE $\\text{Adj} \\gets \\text{new list}[|T|]$ \\COMMENT{Adjacency list for graph}\n\\FOR{$(u, v, w) \\in E$}\n\\STATE $\\text{Adj}[u].\\text{append}((v, w))$\n\\STATE $\\text{Adj}[v].\\text{append}((u, w))$\n\\ENDFOR\n\\STATE $E' \\gets \\emptyset$\n\\STATE Pick an arbitrary node $s \\in T$\n\\STATE $V \\gets \\{s\\}$\n\\STATE $Q \\gets \\text{new min-priority queue}$ \\COMMENT{Priority queue of edges}\n\\FOR{$(v, w) \\in \\text{Adj}[s]$}\n\\STATE $Q.\\text{insert}((s, v, w))$\n\\ENDFOR\n\\WHILE{$|V| < |T|$}\n\\STATE $(u, v, w) \\gets Q.\\text{extract\\_max}()$ \\COMMENT{Get max weight edge}\n\\IF{$v \\notin V$}\n\\STATE $E' \\gets E' \\cup {(u, v)}$\n\\STATE $V \\gets V \\cup {v}$ \\COMMENT{Mark $v$ as visited}\n\\FOR{$(x, w) \\in \\text{Adj}[v]$}\n\\IF{$x \\notin V$}\n\\STATE $Q.\\text{insert}((v, x, w))$\n\\ENDIF\n\\ENDFOR\n\\ENDIF\n\\ENDWHILE\n\\RETURN $E'$\n\\end{algorithmic}\n\\end{algorithm}"

Algorithm 35 ConstructionPlan(G,T)\text{ConstructionPlan}(G, T)

Require: Graph G=(T,E)G = (T, E) with nodes TT representing towns and weighted edges EE representing potential roads with weights as estimated road usage

Ensure: Set of edges EE' representing roads to build for maximum weight minimum spanning tree

1:Adjnew list[T]\text{Adj} \gets \text{new list}[|T|] ▷Adjacency list for graph

2:for (u,v,w)E(u, v, w) \in E do

3:Adj[u].append((v,w))\text{Adj}[u].\text{append}((v, w))

4:Adj[v].append((u,w))\text{Adj}[v].\text{append}((u, w))

5:end for

6:EE' \gets \emptyset

7:Pick an arbitrary node sTs \in T

8:V{s}V \gets \{s\}

9:Qnew min-priority queueQ \gets \text{new min-priority queue} ▷Priority queue of edges

10:for (v,w)Adj[s](v, w) \in \text{Adj}[s] do

11:Q.insert((s,v,w))Q.\text{insert}((s, v, w))

12:end for

13:while V<T|V| < |T| do

14:(u,v,w)Q.extract_max()(u, v, w) \gets Q.\text{extract\_max}() ▷Get max weight edge

15:if vVv \notin V then

16:EE(u,v)E' \gets E' \cup {(u, v)}

17:VVvV \gets V \cup {v} ▷Mark vv as visited

18:for (x,w)Adj[v](x, w) \in \text{Adj}[v] do

19:if xVx \notin V then

20:Q.insert((v,x,w))Q.\text{insert}((v, x, w))

21:end if

22:end for

23:end if

24:end while

25:return EE'

This is the Prim’s algorithm for finding MSTs. We instead replace the minimum weight with maximum weights to fit with problem description in P1.1

==Correctness:==

Invariant: V contains all visited nodes,E contains edges of a maximum weight spanning tree over nodes in VV \text{ contains all visited nodes}, E' \text{ contains edges of a maximum weight spanning tree over nodes in } V

  • At L4, invariant holds since V={s}V = \{s\} and E=E' = \emptyset
  • per iteration, we add EE' with the maximum weight edge (u,v)(u, v) from any visited nodes uVu \in V to any unvisited node vTVv \in T \setminus V (L6-14). Then add vv to VV (L15), maintaining the invariant because:
    • Vnew=V{v}V_{\text{new}} = V \cup \{v\}
    • Enew=E{(u,v)}E'_{\text{new}} = E' \cup \{(u, v)\} is a maximum weight spanning tree over VnewV_{\text{new}} due to (u,v)(u, v) is the maximum weight edge connecting VV to TVT \setminus V.

The algorithm never create cycle in EE' since we only add edge from visited nodes to unvisited nodes, which cannot create a cycle.

==Bound function:==

Let w(E)w(E') be the total weight of edges in EE'. Per iteration, w(E)w(E') is the maximum possible weight of any spanning tree over the nodes in VV.

This holds initially when V={s}V=\{s\} and w(E)=0w(E') = 0. Per iteration, we add the maximum weight edge (u,v)(u, v) to EE' from VV to TVT \setminus V. Bound function is maintained because:

  • w(Enew)=w(E)+w(u,v)w(E)w(E'_{\text{new}}) = w(E') + w(u, v) \geq w(E')
  • Any other spanning tree TT' over V{v}V \cup \{v\} must contain an edge (x,y)(x,y) from VV to {v}\{v\}, and w(x,y)w(u,v)w(x,y) \leq w(u,v). Therefore, w(T)w(E{(u,v)})w(T') \leq w(E' \cup \{(u,v)\} )

The loop terminates when V=T|V| = |T|, i.e all nodes are visited.

Thus the algorithm is correct. \square

P1.3

Explain which graph representation you used for your algorithm and what the complexity of your algorithm is using this graph representation.

Uses adjacency list representation of the graph G=(T,E)G=(T,E). Since we store a list of edges for each nodes, or for each node uTu \in T, we maintain a list Adj[u]\text{Adj}[u] containing the node vv and weight w(u,v)w(u,v) for every edge (u,v)E(u,v) \in E.

The time complexity of the algorithm is O(T2logT+E)\mathcal{O}(|T|^2\log|T| + |E|):

  1. Initialising adjacency list: O(E)\mathcal{O}(|E|), where the adjacency list takes O(E)O(|E|) to populate
  2. L4: Run for T|T| iterations
  3. Inside the loop, find maximum weight edge takes from QQ takes O(logT)O(\log|T|). Therefore each iteration the while loop takes O(TlogT)O(|T|\log|T|)
  4. Adding extracted edge to EE' takes O(1)O(1)

Thus, the overall time complexity is O(T2logT+E)\mathcal{O}(|T|^2\log|T| + |E|). If the graph is complete, then E=T(T1)2=O(T2)|E| = \frac{|T|(|T|-1)}{2} = O(|T|^2), and the time complexity is O(T2logT)\mathcal{O}(|T|^2\log|T|)

The space complexity is O(T+E)O(|T| + |E|)

P1.4

What is the worst-case complexity of your solution if you use the other graph representation? Explain your answer

Worst case complexity of the algorithm using adjacency matrix representation is O(T2logT)\mathcal{O}(|T|^2\log|T|).

If adjacency matrix representation is used, or we use a T×T|T| \times |T| matrix Adj\text{Adj} to represent the graph, where Adj[u][v]=w\text{Adj}[u][v]=w for an edge (u,v)(u,v) with weight ww.

Initialising the matrix will take O(T2)O(|T|^2) time.

Adding edges from starting node ss to priority queue QQ will take O(TlogT)O(|T|\log|T|) time, as we need to scan row Adj[s]\text{Adj}[s] which has T|T| elements.

Inside the while loop, for loop (L18-23) that iterates over the edges of the newly visited node vv now takes O(TlogT)O(|T|\log|T|) time, as we need to scane the row that has T|T| entries, and for each unvisited node, we insert the edge into QQ which takes O(logT)O(\log|T|) time. Therefore, the while loop will take O(T(TlogT))=O(T2logT)O(|T| \cdot (|T|\log |T|)) = O(|T|^2\log|T|)

Thus, the total time complexity would be O(T2+T2logT)=O(T2logT)O(|T|^2 + |T|^2\log|T|) = O(|T|^2\log|T|)

Whereas the space complexity is O(T2)O(|T|^2). Could be more efficient for sparse graph where ET2|E| \ll |T|^2.


Problème 2.

A directed graph G=(N,E,s,t)\mathcal{G} = (\mathcal{N}, \mathcal{E}, s, t) with sNs \in \mathcal{N} the source and tNt \in \mathcal{N} the target is a series-parallel graph if it can be constructed inductively using the following rules:

  1. An elementary series-parallel graph is a single edge from ss to tt

  2. The series-construction. Let G1=(N1,E1,s1,t1)\mathcal{G}_1=(\mathcal{N}_1, \mathcal{E}_1, s_1, t_1) and G2=(N2,E2,s2,t2)\mathcal{G}_2=(\mathcal{N}_2, \mathcal{E}_2, s_2, t_2) be two series-parallel graphs with the node cc in common (N1N2={c}\mathcal{N}_1 \cap \mathcal{N}_2 = \{c\}). The graph G=(N1N2,E1E2)\mathcal{G}=(\mathcal{N}_1 \cup \mathcal{N}_2, \mathcal{E}_1 \cup \mathcal{E}_2) is a series-parallel graph

  3. The parallel-construction. Let G1=(N1,E1,s1,t1)\mathcal{G}_1=(\mathcal{N}_1, \mathcal{E}_1, s_1, t_1) and G2=(N2,E2,s2,t2)\mathcal{G}_2=(\mathcal{N}_2, \mathcal{E}_2, s_2, t_2) be two series-parallel graphs without nodes in common (N1N2=\mathcal{N}_1 \cap \mathcal{N}_2 = \emptyset) and let s,t(N1N2)s,t \notin (\mathcal{N}_1 \cup \mathcal{N}_2) be two fresh nodes. The graph G=(N1N2{s,t},E1E2{(s,s1),(s,s2),(t1,t),(t2,t)})\mathcal{G}=(\mathcal{N}_1 \cup \mathcal{N}_2 \cup \{s,t\},\mathcal{E}_1 \cup \mathcal{E}_2 \cup \{(s,s_{1}), (s,s_{2}), (t_{1},t), (t_{2},t)\}) is a series-parallel graph.

Now assume we have a series-parallel graph G=(N,E,s,t)\mathcal{G} = (\mathcal{N}, \mathcal{E}, s, t) with an edge-weight function weight:EZweight: \mathcal{E} \to \mathbb{Z} (here, Z\mathbb{Z} are the integers, which includes negative numbers). We note that series-parallel graphs are relatively simple structures.

P2.1

Write an algorithm to compute the single-source shortest paths from the source ss to all nodes nNn \in \mathcal{N} in O(N+E)\mathcal{O}(|\mathcal{N}|+|\mathcal{E}|) time

"\\begin{algorithm}\n\\caption{ShortestPathsSP($\\mathcal{G} = (\\mathcal{N}, \\mathcal{E}, s, t)$, $weight$)}\n\\begin{algorithmic}\n\\State $dist \\gets {}$ \\Comment{Initialize empty dictionary for distances}\n\\If{$\\mathcal{G}$ is an elementary graph (single edge $(s,t)$)}\n\\State $dist[s] \\gets 0$\n\\State $dist[t] \\gets weight(s,t)$\n\\ElsIf{$\\mathcal{G}$ is a series composition of $\\mathcal{G}_1$ and $\\mathcal{G}_2$}\n\\State $dist_1 \\gets ShortestPathsSP(\\mathcal{G}_1, weight)$\n\\State $dist_2 \\gets ShortestPathsSP(\\mathcal{G}_2, weight)$\n\\State $dist \\gets dist_1 \\cup dist_2$ \\Comment{Combine distances}\n\\For{each node $n$ in $\\mathcal{G}_2$}\n\\State $dist[n] \\gets dist[n] + dist_1[t_1]$\n\\EndFor\n\\ElsIf{$\\mathcal{G}$ is a parallel composition of $\\mathcal{G}_1$ and $\\mathcal{G}_2$}\n\\State $dist_1 \\gets ShortestPathsSP(\\mathcal{G}_1, weight)$\n\\State $dist_2 \\gets ShortestPathsSP(\\mathcal{G}_2, weight)$\n\\State $dist \\gets dist_1 \\cup dist_2$ \\Comment{Combine distances}\n\\State $dist[s] \\gets 0$\n\\State $dist[t] \\gets \\min(dist_1[t_1], dist_2[t_2])$\n\\EndIf\n\\RETURN $dist$\n\\end{algorithmic}\n\\end{algorithm}"

Algorithm 36 ShortestPathsSP(G=(N,E,s,t)\mathcal{G} = (\mathcal{N}, \mathcal{E}, s, t), weightweight)

1:distdist \gets {} ▷Initialize empty dictionary for distances

2:if G\mathcal{G} is an elementary graph (single edge (s,t)(s,t)) then

3:dist[s]0dist[s] \gets 0

4:dist[t]weight(s,t)dist[t] \gets weight(s,t)

5:else if G\mathcal{G} is a series composition of G1\mathcal{G}_1 and G2\mathcal{G}_2 then

6:dist1ShortestPathsSP(G1,weight)dist_1 \gets ShortestPathsSP(\mathcal{G}_1, weight)

7:dist2ShortestPathsSP(G2,weight)dist_2 \gets ShortestPathsSP(\mathcal{G}_2, weight)

8:distdist1dist2dist \gets dist_1 \cup dist_2 ▷Combine distances

9:for each node nn in G2\mathcal{G}_2 do

10:dist[n]dist[n]+dist1[t1]dist[n] \gets dist[n] + dist_1[t_1]

11:end for

12:else if G\mathcal{G} is a parallel composition of G1\mathcal{G}_1 and G2\mathcal{G}_2 then

13:dist1ShortestPathsSP(G1,weight)dist_1 \gets ShortestPathsSP(\mathcal{G}_1, weight)

14:dist2ShortestPathsSP(G2,weight)dist_2 \gets ShortestPathsSP(\mathcal{G}_2, weight)

15:distdist1dist2dist \gets dist_1 \cup dist_2 ▷Combine distances

16:dist[s]0dist[s] \gets 0

17:dist[t]min(dist1[t1],dist2[t2])dist[t] \gets \min(dist_1[t_1], dist_2[t_2])

18:end if

19:return distdist

P2.2

Explain why your algorithm is correct

Base case: For an elementary graph with one edge (s,t)(s,t), it sets dist[s]=0dist[s]=0 and dist[t]=weight(s,t)dist[t]=weight(s,t), which is the shortest path from ss to tt.

If G\mathcal{G} is a series-composition: Let G1\mathcal{G}_1 and G2\mathcal{G}_2 be the two series-parallel graphs with common node cc.

  • It recursively computes the shortest paths from ss to all nodes in G1\mathcal{G}_1 and G2\mathcal{G}_2, store them in dist1dist_1 and dist2dist_2 respectively.
  • For any node nG2n \in \mathcal{G}_2, the shortest path from ss \mu st go through target t1t_1 of G1\mathcal{G}_1. Thus, we update dist[n]=dist[n]+dist1[t1]dist[n] = dist[n] + dist_1[t_1].

If G\mathcal{G} is a parallel-composition: Let G1\mathcal{G}_1 and G2\mathcal{G}_2 be the two series-parallel graphs with new source ss and target tt

  • It recursively computes the shortest paths from ss to all nodes in G1\mathcal{G}_1 and G2\mathcal{G}_2, store them in dist1dist_1 and dist2dist_2 respectively.
  • The new source ss has distance 0, correctly set by dist[s]=0dist[s]=0
  • The new target tt has distance min(dist1[t1],dist2[t2])\min(dist_1[t_1], dist_2[t_2]), which is the shortest path from ss to tt.

In both cases, it combines the distance using union operation, and the algorithm is correct. \square

P2.3

Explain which graph representation you used for your algorithm and why your algorithm has the stated complexity

It uses adjacency list representation of the graph G=(N,E,s,t)\mathcal{G} = (\mathcal{N}, \mathcal{E}, s, t). Each node maintains a list of its outgoing edges. Since in series-parallel graphs, each node has at most 2 outgoing edges, the adjacency list representation is efficient. Additionally, the algorithm recursively traverse the graph, and finding outgoing edges takes constant time.

For base case, the algorithm takes O(1)O(1) time to set the distance for the elementary graph.

In both cases, it calls itself on G1\mathcal{G}_1 and G2\mathcal{G}_2, taking O(N1+E1)O(|\mathcal{N}_1| + |\mathcal{E}_1|) and O(N2+E2)O(|\mathcal{N}_2| + |\mathcal{E}_2|) time respectively. Then it updates the distance for each node in G2\mathcal{G}_2 in O(N2)O(|\mathcal{N}_2|) time. Thus, the time complexity is O(N1+E1+N2+E2)O(|\mathcal{N}_1| + |\mathcal{E}_1| + |\mathcal{N}_2| + |\mathcal{E}_2|)

Since each node is visited exactly once during the traversal, the time complexity is O(N+E)O(|\mathcal{N}| + |\mathcal{E}|) (since N1+N2=N|\mathcal{N}_1| + |\mathcal{N}_2| = |\mathcal{N}| and E1+E2=E|\mathcal{E}_1| + |\mathcal{E}_2| = |\mathcal{E}|)

P2.4

What is the worst-case complexity of your solution if you use the other graph representation? Explain your answer

If using adjacency matrix representation, the worst-case complexity of the algorithm is O(N2)O(|\mathcal{N}|^2).

It will take O(N2)O(|\mathcal{N}|^2) time to initialise the adjacency matrix.

The analysis of the algorithm remains the same. The difference here is that finding outgoing edges for a node will take O(N)O(|\mathcal{N}|) time in worst-case, as it will have to traverse the entire row of the matrix corresponding to that node. Since the ops is performed on each node during traversal, total complexity would be O(N2)O(|\mathcal{N}|^2).