Overview
Going Office is definitely one of the tougher problems on InterviewStreet, but the main idea* behind the problem is simple: given a weighted undirected graph G=(V,E) and two nodes s,t∈V, we wish to answer Q queries of the following form:
What is the shortest path from s to t given that edge e∈E is removed from the graph?(This type of problem is also known as the "most vital arc" problem -- see the first footnote below.)
Let N=|V| and M=|E| as usual. In this case, we are guaranteed that N,M,Q≤200000.
Prerequisites
You should definitely know Dijkstra's algorithm before attempting to solve this problem. Also, knowing about segment trees with lazy propagation would be good, but is not required to understand the solution.
Building Bridges
First of all, it should be clear that naively running Dijkstra's isn't going to run in time, because it requires O(QNlogN) operations. In fact, the size of the input suggests that we should look for at worst an O(NlogN) solution. What can we do then? We may first try to solve an easier problem: can we determine whether an edge e, when removed, will increase the shortest path length between s and t?
Definitions:
- An edge is optimal if it is on a shortest path from s to t.
- An edge is a bridge* if it is on all shortest paths from s to t. (Here we refer only to shortest paths in G, without having removed any edges.)
Let ds(u) (resp. dt(u)) be the shortest-path distance from s (resp. t) to u, and let OPT be the shortest-path distance from s to t. Then we can show the following properties:
Lemmata:(Proofs are left to the reader.) These properties should give you some idea of how to find all the bridges in G. If you still need some hints, an algorithm for computing bridges is included at the end of this post.
- e=(u,v) is optimal if and only ds(u)+length(e)+dt(v)=OPT.
- e=(u,v) is a bridge if and only if it is optimal and for all other optimal e′=(u′,v′), we have either ds(v′)≤ds(u) or ds(v)≤ds(u′). In other words, when we travel along an optimal path from s to t, then between lengths ds(u) and ds(v), we must be traveling along e. By convention we denote a bridge with the letter b.
- As a corollary of the above statement, e is a bridge if and only if, when removed from G, the shortest path distance from s to t increases.
Islands
Now we have a way of identifying which edges, when removed, affect the shortest path length. This still leaves one major question unanswered: given a bridge b, how exactly do we determine the shortest path length from s to t once b is removed? Running Dijkstra's repeatedly is too slow, since we can easily construct examples in which every edge is a bridge. (Consider a graph consisting of only a path from s to t.)
Let K denote the number of bridges. Note that Lemma 2 above implies that we can uniquely order the bridges b0,b1,…,bK−1 based on their relative distances from s. Could we use this to help us?
Definition: Let the i-th island be defined as the set of all vertices v, such that there exists a shortest path from s to v using no more than i bridges. (If v is not connected to either s or t, then we can ignore it -- it lies on some strange continent somewhere far away.)The intuition behind islands and bridges is this: bridges are the fastest way to "travel" between the islands when going from s to t. If we remove an edge that is not a bridge, our shortest path is not affected. However, when we take out a bridge, we are forced to travel along another (possibly much longer) path.
Proposition: Consider the bridge bi, connecting the i-th and i+1-st islands. Let Ei be the set of all edges e connecting an island with index ≤i to an island with index >i. Then the shortest path from s to t that bypasses bi must have length min
Proof: Any path P from s to t bypassing b_i must include some edge in E_i . Now consider some edge e = ( u, v ) \in E_i -- the shortest path from s to t that goes through e must have length d_s( u ) + length( e ) + d_t( v ) . Lastly, by the way we have constructed the islands, we know that b_i is not on the shortest paths from s to u or from v to t .
A Data Structure for Bypass Length
Let bypass( i ) denote the length of the shortest path that bypasses b_i . The above observations suggest that we can run the following algorithm to compute bypass( i ) :
For each non-bridge edge e = ( u, v ) connecting islands i and j , where i < j :However, there is a slight problem with this algorithm: it's too slow. Through some tricky analysis, you can show that in the worst case, we can construct a graph that will cause the above subroutine to take O( M^{3/2} ) operations.For each k \in \{ i, i + 1, \ldots, j - 1 \} :bypass( k ) \leftarrow \min \{ bypass( k ), d_s( u ) + length( e ) + d_t( v ) \}
What we need is a data structure that supports the following two operations efficiently:
- update( i, j, val ) , which runs bypass( k ) \leftarrow \min \{ bypass( k ), val \} for all k \in \{ i, i+1, \ldots, j-1 \}
- query( i ) , which returns bypass( i )
One possible data structure that can support both operations in O( \log K ) time is called a segment tree with lazy propagation or a segment tree with range update. There are several tutorials online about this type of tree, and it's a good data structure to be familiar with. (The main problem right now is that the structure is rather complicated and the online tutorials are not very good. I may do an article on it some time, if there is sufficient demand.)
However, if you are lazy, there is another data structure called the Kempe tree, which is a bit simpler and can support the same sort of queries, also using O( \log K ) operations. Check the link for more information.
Putting It All Together
The previous observations suggest the following algorithm for solving the problem:
- Compute d_s and d_t using Dijkstra's algorithm.
-
Find the bridges:
- First find all the optimal edges by iterating over all edges e = ( u, v ) , and storing the edges such that d_s( u ) + length( e ) + d_t( v ) = OPT .
- Sort all the optimal edges by d_s( u ) and d_s( v ) . Use the criterion from above to find the bridges.
-
Find the islands.
- One way (the hard way) is to run a modified version of Dijkstra.
- A smarter and shorter alternative is to just run multiple depth-first searches from s and from the far end of each bridge. It's up to the reader to figure out how to do so.
-
Find the bypassing paths.
- Initialise your favourite range-update data structure.
- Iterate over all the non-bridge edges. If e = ( u, v ) is an edge crossing from island i to island j > i , then range-update bypass(i), bypass(i+1), \ldots, bypass(j-1) such that they are no more than d_s( u ) + length( e ) + d_t( v ) .
- Process all the queries. If the edge in question is not a bridge, then we simply return the optimal length. Otherwise, we query the range-update data structure to find the best bypass length.
The solution took me around 150 lines to implement with a decent amount of spacing and a relatively clean coding style. I'm somewhat suspicious, though, that I've overcomplicated the solution, and that a nicer solution exists. Message me or comment here if you have any other alternatives.
Footnotes
- See this paper for an example of previous work on the subject.
- See also this page.