Saturday 16 January 2016

Minimum Spanning tree, its Applications and some questions

Minimum Spanning Tree


I'll start off with the same cliched example that's given everywhere. 

Consider a city with N power houses in a city with M buildings. Given the distance between each building, your job is to connect all of the power houses with the minimum amount of wire. The wire between two power houses goes via other residential buildings.

Here's where MST kicks in. In a generic form, consider V (a set of v pins) connected with E edges. Then connect all the pins with the minimum amount of total distance.



MST problem can be solved using greedy approach i.e. at every step, we greedily choose an edge and at the end, we get the global optimal distance. So its like a "growing" tree!

The basic Algorithm for getting MST is :

A= empty
while A is not a MST
      Find a Safe edge for A
      A = A U { (u,v) }
return A

A is a growing MST of the given graph A. Safe edge can be defined as :

Consider a cut S of the graph G, the edge between S and G-S that has the minimum cost is called the safe edge i.e. edge with one vertex in the cut and the other in the rest of the graph with min cost.

At each step, we find the safe edge for the cut S and add it to the growing MST.

There are various Algorithms for finding the MST in a graph. The earliest one is Borůvka's algorithm
and takes mlogn time. The most common are Prim's and Kruskal who's algorithm and complexity and proof of correctness we'll discuss now. 

All these three are greedy algorithms. Since they run in polynomial time, the problem of finding such trees is in FP ( set of function problems which can be solved by a deterministic Turing machine in polynomial time), and related decision problems such as determining whether a particular edge is in the MST or determining if the minimum total weight exceeds a certain value are in P.

KRUSKAL ALGORITHM

Algorithm

A forest in a graph is a collection of multiple trees in a single graph. 
Consider the given graph as a forest with V trees (each vertex being a tree). Now take the edge with minimum weight, say (u,v). Check if u and v belong to the same tree. If not, join the two trees with this edge. Now take the next min edge and repeat till a MST connecting all the vertices is obtained. 

A = empty
FOR all the edges i
     MAKE-SET(i)
sort all the edges in ascending order
while A is not a MST
      take the min edge (u,v)
      if ( FIND-SET(u) != FIND-SET(v) )  
           A = A U { (u,v) }
           Union (u,v)
return A 

Complexity

The complexity depends on the data structure used to implement Set functions (Make, Find and Union). If Disjoint set data structure is used for Set functions, then log v for union and find.

Making set takes E*O(1) time. Sorting takes ElogE time. 2*E times find set is run and each takes logV time, so 2*E*logV. Union is done at most E times, so 2*E*logV

E + ElogE + 2*E*logV + 2*E*logV 

As |E| < |V|^2, Complexity is approximately ElogV.   

If we use union-by-rank and path compression heuristics, we get ElogV complexity.

PRIM'S ALGORITM

Algorithm

In Prim's, we maintain a set A and add safe edges to it till we obtain a MST.
To implement this, we use a parameter "key" and "parent" with every node.
Initially, the key of all the nodes is infinty and parent is NIL. Then we take a source vertex, make the key as 0 and keep parent as NIL. Now iteratively, add the vertex u with minimum key in A and update the keys of all the adjacent vertices (v) as min of distance between u and v and v.key. Then update the parent of v as u. This stops when we have considered all the vertices.
  
FOREACH u of G.V
       u.key = INF
       u.parent = NIL
root.key=0
Q = G.V
while Q != empty
     u = EXTRACT-MIN (Q)
     FOREACH v adj to u
           if( v.key >dist(u,v) & v belongs to Q)
           v.key = dist(u,v)
           v.parent = u

At the end, the MST is : A = {(v,v.parent), v belongs to {V-{R}}

Complexity

If we use a heap to implement the Min Queue Q, extraction takes log(V) and building it takes O(V) time.

Initialisation takes O(V), building Q takes O(V), the while loop runs V times, each taking approx O(logV) time, the forloop runs E times and time taken to check if v belong to V takes logV time, so forloop takes ElogV time.

The total time is approx O(ElogV + VlogV)which is approx ElogV as |E| <= |V|^2


No comments:

Post a Comment