Wednesday 27 January 2016

Basic Question of Dynamic Programming : Matrix Chain Multiplication

Matrix Multiplication

A quick refresher on what matrix multiplication is (although you should know it already :P ) :






For two 3x3 matrices R and T, each entry of the product M is a row of R dotted with a column of T, e.g. :

M11 = r11×t11  +  r12×t21  +  r13×t31
M21 = r21×t11  +  r22×t21  +  r23×t31



So basically, given two matrices of dimensions p x q and r x s, they can be multiplied iff q = r. The resultant matrix C then has dimension p x s.

The time to compute the matrix C is O(p*q*s) : each entry of C takes q scalar multiplications (q elements of a row of A against q elements of a column of B). One row of A multiplied against all s columns of B takes q*s multiplications, and as A has p rows, this happens p times. Hence the total is p*q*s.
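To make the p*q*s count concrete, here's a small sketch (my own, not part of the original post) that multiplies two matrices the naive way and counts the scalar multiplications as it goes:

```cpp
#include <vector>
#include <cstddef>

using Matrix = std::vector<std::vector<long long>>;

// Multiply a (p x q) matrix A by a (q x s) matrix B and count the
// scalar multiplications performed; the count always comes out to p*q*s.
Matrix multiplyCounting(const Matrix& A, const Matrix& B, long long& mulCount) {
    std::size_t p = A.size(), q = B.size(), s = B[0].size();
    Matrix C(p, std::vector<long long>(s, 0));
    mulCount = 0;
    for (std::size_t i = 0; i < p; ++i)
        for (std::size_t j = 0; j < s; ++j)
            for (std::size_t k = 0; k < q; ++k) {
                C[i][j] += A[i][k] * B[k][j];
                ++mulCount;   // one scalar multiplication per innermost step
            }
    return C;
}
```

For two 2x2 matrices this performs exactly 2*2*2 = 8 multiplications.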


MATRIX CHAIN MULTIPLICATION :


What the hell is this "chain"?


Let's say there are n matrices, where the dimension of the ith matrix is p(i-1) x p(i). It's something like this :

1st :  p0 x p1
2nd : p1 x p2
3rd : p2 x p3
4th : p3 x p4
...

You get the basic idea.

So observe that you can multiply any ith matrix with the (i+1)th matrix, because the number of columns of the ith ( p(i) ) is the same as the number of rows of the (i+1)th.
Matrix chain multiplication is multiplying these matrices to give the final matrix of dimension p0 x pn.

M1 x M2 x M3 x M4... 

As only two matrices can be multiplied at one time, there can be many combinations in which these can be multiplied. Like if we have 4 matrices, there are these 5 combinations :

( M1( M2( M3M4 ) ) )
( ( M1( M2M3) ) M4 )
( M1 ( (M2M3) M4) )
( (M1M2) (M3M4) )
( ( (M1M2) M3 ) M4 )

The cost of computation of each of these combinations will be different and obviously we want the minimum cost.

The number of combinations can be computed as such :

P(n)    = 1                                            , if n = 1
        = sum over k in [1, n-1] of P(k)P(n-k)         , if n > 1

If we compute all the parenthesizations by brute force, the complexity is exponential (approx. O(2^n)), which would take billions of years to run for large values of n.
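The recurrence itself can be evaluated directly in a few lines (these counts are in fact the Catalan numbers); a quick sketch of my own:

```cpp
#include <vector>

// Number of ways to fully parenthesize a chain of n matrices:
// P(1) = 1, P(n) = sum over k in [1, n-1] of P(k) * P(n-k).
long long countParenthesizations(int n) {
    std::vector<long long> P(n + 1, 0);
    P[1] = 1;
    for (int m = 2; m <= n; ++m)
        for (int k = 1; k < m; ++k)
            P[m] += P[k] * P[m - k];
    return P[n];
}
```

For 4 matrices this gives 5, matching the list above.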


So Here's where Dynamic Programming kicks in like a boss!


The 2 conditions for DP are optimal substructure and overlapping subproblems.
Now keeping this in mind, we approach the problem using DP.

0. To calculate A i..j optimally, we divide it into 2 subproblems and recursively find the solution.

1. Consider a range from i to j, say A i..j. We break it at some index k into 2 products : first compute A i..k and A k+1..j, then multiply them to get A i..j.

A i..j = (A i..k) * (A k+1..j)

2. For some value of k between i and j-1, the cost of (A i..k) * (A k+1..j) is minimum. To find that minimum, we calculate the cost for each k from i to j-1.


3. Finding A i..k is itself a subproblem and can be solved by the same recursion we used for A i..j : break it into A i..p and A p+1..k and take the minimum over the possibilities.

Recursify it !

 
So if we say m[i..j] stores the minimum cost of computing A i..j, then we can say :

m[i..j] = min over k in [i, j-1] of ( m[i..k] + m[k+1..j] + p(i-1)*p(k)*p(j) )

Store the best k for each pair i..j in s[i..j] (so the actual parenthesization can be reconstructed later).

So simple! isn't it? :P

 
 
 Simple code of the recursive function :

 int func(int p[], int i, int j ) {

               if (i == j) return 0;    // a single matrix needs no multiplication
               int mn = INF;

               for ( int c = i; c < j; c++ ) {
                      mn = min ( func (p, i, c) + func (p, c+1, j) + p[i-1] * p[c] * p[j] ,
                                 mn ) ;
               }
               return mn;
        }


BUT the same subproblems are being computed over and over again, and recomputing them blows up the complexity. We can cache each subproblem the first time we solve it (memoization) so it is never recomputed, or use bottom-up DP like this :

int MatrixChainOrder(int p[], int n)
{
 
    /* For simplicity of the program, one extra row and one extra column are
       allocated in m[][].  0th row and 0th column of m[][] are not used */
    int m[n][n];
 
    int i, j, k, L, q;
 
    /* m[i,j] = Minimum number of scalar multiplications needed to compute
       the matrix A[i]A[i+1]...A[j] = A[i..j] where the dimension of A[i] is
       p[i-1] x p[i] */
 
    // cost is zero when multiplying one matrix.
    for (i = 1; i < n; i++)
        m[i][i] = 0;
 
    // L is chain length. 
    for (L=2; L<n; L++)  
    {
        for (i=1; i<=n-L+1; i++)
        {
            j = i+L-1;
            m[i][j] = INT_MAX;
            for (k=i; k<=j-1; k++)
            {
                // q = cost/scalar multiplications
                q = m[i][k] + m[k+1][j] + p[i-1]*p[k]*p[j];
                if (q < m[i][j])
                    m[i][j] = q;
            }
        }
    }
 
    return m[1][n-1];
}

* GeeksForGeeks * ( Too lazy to write again :P )
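For the "cache the subproblems" route mentioned above, here's a memoized top-down sketch of the same recursion (my own code, with hypothetical names `matrixChainMemo` / `chainRec`), so each m[i][j] is computed only once:

```cpp
#include <vector>
#include <climits>
#include <algorithm>

// Memoized top-down matrix chain order. p holds the dimensions:
// matrix i (1-based) is p[i-1] x p[i], so there are p.size()-1 matrices.
long long chainRec(const std::vector<int>& p, int i, int j,
                   std::vector<std::vector<long long>>& memo) {
    if (i == j) return 0;                    // a single matrix costs nothing
    if (memo[i][j] != -1) return memo[i][j]; // cached subproblem
    long long best = LLONG_MAX;
    for (int k = i; k < j; ++k)              // try every split point
        best = std::min(best, chainRec(p, i, k, memo)
                            + chainRec(p, k + 1, j, memo)
                            + (long long)p[i-1] * p[k] * p[j]);
    return memo[i][j] = best;
}

long long matrixChainMemo(const std::vector<int>& p) {
    int n = (int)p.size() - 1;               // number of matrices
    std::vector<std::vector<long long>> memo(n + 1,
        std::vector<long long>(n + 1, -1));
    return chainRec(p, 1, n, memo);
}
```

For p = {1, 2, 3, 4} (matrices 1x2, 2x3, 3x4) the answer is 18: (M1M2)M3 costs 1*2*3 + 1*3*4 = 18, while M1(M2M3) costs 32.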



Tuesday 26 January 2016

P vs NP : The million dollar question

Before getting to the million dollar question of this millennium (Seriously, Clay Mathematics Institute is offering $1 million to whoever solves it ), lets make ourselves familiar with the fancy terms P, NP, NP-Hard, NP-Complete.

First, let's understand what NP is.

Suppose you stumble upon a Sudoku puzzle that you just can't seem to solve. Then, a friend walks up to you and boasts, "Haha! I know the answer to the problem!" Amazed, you tell your friend, "Prove it!" Your friend fills out the empty spaces in your Sudoku puzzle, and behold, upon a quick glance, you've verified that the answer is correct. "How did you do that?" you ask your friend. He responds, "It doesn't matter ;)"

Sudoku is an NP problem because you can "quickly" verify a solution to a Sudoku problem. (By "quickly", I really mean "in polynomial time.") It doesn't matter if you can actually figure out the solution quickly or not. All that matters for a problem to be in NP is that a solution can be verified quickly.
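Just to show how cheap that verification really is, here's a sketch (my own illustration, not a known library routine) of a polynomial-time, in fact O(81), checker for a completed Sudoku grid:

```cpp
#include <array>

// Verify a completed 9x9 Sudoku grid in a single pass: every row,
// column and 3x3 box must contain each digit 1..9 exactly once.
bool isValidSudoku(const std::array<std::array<int, 9>, 9>& g) {
    for (int i = 0; i < 9; ++i) {
        bool row[10] = {}, col[10] = {}, box[10] = {};
        for (int j = 0; j < 9; ++j) {
            int r = g[i][j];                          // i-th row
            int c = g[j][i];                          // i-th column
            int b = g[3*(i/3) + j/3][3*(i%3) + j%3];  // i-th 3x3 box
            if (r < 1 || r > 9 || c < 1 || c > 9 || b < 1 || b > 9)
                return false;                         // not a digit 1..9
            if (row[r] || col[c] || box[b])
                return false;                         // duplicate digit
            row[r] = col[c] = box[b] = true;
        }
    }
    return true;
}
```

Finding a solution may take your friend ages; checking his filled grid takes you 81 cell reads.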


Now keep this little story in mind while reading these definitions :

P (Polynomial Time Running ) 


Consider linear search problem (searching a number in a list of numbers). You can find this number in linear time or O(n). Or consider binary search, that takes O(logn) time or a problem that can be solved in quadratic time O(n^2) time. 
All these problems have one thing in common, they can be "SOLVED" in polynomial time. 
if the algorithm runs in linear time or quadratic time or cubic time, then we say the algorithm runs in polynomial time and the problem it solves is in class P.



NP ( Non deterministic Polynomial Time) 


So in the story above, SOLVING the problem may not have taken polynomial time; who knows what algorithm your friend used to solve the Sudoku. However, we can VERIFY that he solved it correctly in polynomial time!
This means there is not necessarily a polynomial-time way to find a solution, but once you have a solution, it only takes polynomial time to verify that it is correct. So if the solution to a problem can be checked in polynomial time, it is called an NP problem.

NP is called "Non deterministic" because of the non-deterministic Turing machine. So this guy, Turing, described a machine that can solve decision problems; the non-deterministic version of it can "guess" a candidate solution and then check in polynomial time whether that solution is correct (YES/NO). The problems such a machine can solve in polynomial time are exactly the class NP.

P.S. : 
I used to confuse it with Non Polynomial Time ( which would mean it can't be solved in polynomial time), a big blunder!


NP-HARD 


These are the rogue problems, like James Bond, who don't like to play by the rules. NP-Hard problems need not be in NP, i.e. their solutions may or may not be verifiable in polynomial time. Unlike P and NP, which contain decision problems, NP-Hard problems can be decision, search or optimisation problems. These are called "Hard" because they are at least as hard as the hardest problems in NP, the NP-Complete problems ( discussed next ).


NP Complete


-> NP
-> NP-Hard

As we know, an NP-Hard problem may or may not be verifiable in PTIME; those NP-Hard problems that can be verified in PTIME (i.e. that are also in NP) are called NP-Complete.
It has been proven that any NP-Complete problem can be converted to any other in PTIME using many-one reductions!!! This leads to a very interesting theory...
| Problem Type | Verifiable in P time | Solvable in P time |
|--------------|----------------------|--------------------|
| P            | Yes                  | Yes                |
| NP           | Yes                  | Yes or No *        |
| NP-Complete  | Yes                  | Unknown            |
| NP-Hard      | Yes or No **         | Unknown ***        |


 

" ONE PROBLEM TO RULE THEM ALL ! "

 



Solve one, just one, any one NP-Complete problem in PTIME, and the rest can be solved automatically in PTIME. Sounds like magic, doesn't it? :D

Science behind the magic : by definition, every problem in NP can be reduced to any NP-hard problem (and hence to any NP-complete problem) in polynomial time. So if there is a polynomial-time solution to one NP-complete problem, there is a polynomial-time solution to every NP problem : reduce, then solve.

This leads to the ultimate debate, if NP problems could be solved in PTIME, is it true that...
P = NP ?

It's clear that P is a subset of NP. The thing that isn't clear is whether NP problems can be solved in PTIME, because if they can be, P would indeed be equal to NP.


Examples :


The Travelling Salesman Problem (TSP) is a very famous real-world problem that asks simply : if there are n cities all connected to each other and a salesman starts in city 1, then given the travel time between each pair of cities, what's the minimum time in which he can visit each city exactly once and return to city 1?

No efficient exact solution is known, but good approximate solutions exist.

TSP (in its decision version : "is there a tour of cost at most X?") is an NP-complete problem because of two properties :

1. No solution in PTIME is known, but a proposed tour can be verified in PTIME.
2. Every instance of the Hamiltonian cycle problem is reducible to a TSP problem, and since the Hamiltonian cycle problem has been proven NP-complete, TSP is NP-complete too.
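That reduction can be sketched concretely, using the Hamiltonian cycle variant of the argument (my own illustrative helper with the hypothetical name `hamiltonianToTsp`): give each existing edge weight 1 and each missing edge weight 2; then the graph has a Hamiltonian cycle iff the optimal TSP tour costs exactly n.

```cpp
#include <vector>

// Build a TSP cost matrix from an undirected graph's adjacency matrix:
// existing edges cost 1, missing edges cost 2. The original graph has a
// Hamiltonian cycle iff the optimal TSP tour costs exactly n.
std::vector<std::vector<int>> hamiltonianToTsp(
        const std::vector<std::vector<bool>>& adj) {
    int n = (int)adj.size();
    std::vector<std::vector<int>> cost(n, std::vector<int>(n, 2));
    for (int i = 0; i < n; ++i) {
        cost[i][i] = 0;
        for (int j = 0; j < n; ++j)
            if (i != j && adj[i][j]) cost[i][j] = 1;  // real edge is cheap
    }
    return cost;
}
```

The transformation itself runs in O(n^2), i.e. polynomial time, which is all a many-one reduction needs.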


References :


1. Stackoverflow
2. Quora

Saturday 16 January 2016

Minimum Spanning tree, its Applications and some questions

Minimum Spanning Tree


I'll start off with the same cliched example that's given everywhere. 

Consider a city with N power houses among M buildings. Given the distance between each pair of buildings, your job is to connect all of the power houses with the minimum total amount of wire. The wire between two power houses may run via other residential buildings.

Here's where MST kicks in. In its generic form : given a set V of pins connected by a set E of weighted edges, connect all the pins using the minimum total distance.



The MST problem can be solved using a greedy approach, i.e. at every step we greedily choose an edge, and at the end we get the global optimum. So it's like a "growing" tree!

The basic Algorithm for getting MST is :

A= empty
while A is not a MST
      Find a Safe edge for A
      A = A U { (u,v) }
return A

A is the growing MST of the given graph G. A safe edge can be defined as :

Consider a cut (S, G-S) of the graph G. The edge of minimum cost crossing the cut, i.e. with one vertex in S and the other in G-S, is called the safe edge.

At each step, we find the safe edge for the cut S and add it to the growing MST.

There are various algorithms for finding the MST of a graph. The earliest one is Borůvka's algorithm, which takes O(E log V) time. The most common are Prim's and Kruskal's, whose algorithms, complexities and proofs of correctness we'll discuss now.

All these three are greedy algorithms. Since they run in polynomial time, the problem of finding such trees is in FP ( set of function problems which can be solved by a deterministic Turing machine in polynomial time), and related decision problems such as determining whether a particular edge is in the MST or determining if the minimum total weight exceeds a certain value are in P.

KRUSKAL ALGORITHM

Algorithm

A forest in a graph is a collection of disjoint trees in a single graph.
Consider the given graph as a forest with V trees (each vertex being its own tree). Now take the edge with minimum weight, say (u,v). Check if u and v belong to the same tree. If not, join the two trees with this edge. Then take the next minimum edge and repeat till an MST connecting all the vertices is obtained.

A = empty
FOR each vertex v
     MAKE-SET(v)
sort all the edges in ascending order of weight
FOR each edge (u,v) in sorted order
      if ( FIND-SET(u) != FIND-SET(v) )
           A = A U { (u,v) }
           UNION(u,v)
return A
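The pseudocode above maps almost directly onto a disjoint-set implementation; a minimal sketch of my own (path compression only, union-by-rank omitted for brevity):

```cpp
#include <vector>
#include <algorithm>
#include <tuple>

// Disjoint-set (union-find) with path compression.
struct DSU {
    std::vector<int> parent;
    DSU(int n) : parent(n) { for (int i = 0; i < n; ++i) parent[i] = i; }
    int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }
    bool unite(int a, int b) {            // returns false if already joined
        a = find(a); b = find(b);
        if (a == b) return false;
        parent[a] = b;
        return true;
    }
};

// Kruskal's MST: sort edges (weight, u, v) ascending, add an edge when
// its endpoints lie in different trees of the growing forest.
long long kruskal(int n, std::vector<std::tuple<int, int, int>> edges) {
    std::sort(edges.begin(), edges.end());     // tuples sort by weight first
    DSU dsu(n);
    long long total = 0;
    for (auto& [w, u, v] : edges)
        if (dsu.unite(u, v)) total += w;       // safe edge: joins two trees
    return total;
}
```

For the 4-vertex graph with edges (0,1)=1, (1,2)=2, (2,3)=3, (0,3)=4, (0,2)=5, the MST picks the first three edges for a total weight of 6.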

Complexity

The complexity depends on the data structure used to implement the set operations (MAKE-SET, FIND-SET and UNION). With the disjoint-set data structure, FIND-SET and UNION take O(log V) each.

Making the sets takes V*O(1) time. Sorting takes O(E log E) time. FIND-SET is run 2*E times at O(log V) each, so O(E log V). UNION is done at most V-1 times, so O(V log V).

V + E log E + E log V + V log V

As |E| < |V|^2, log E = O(log V), so the complexity is approximately O(E log V).

If we use union-by-rank and path compression heuristics, we get ElogV complexity.

PRIM'S ALGORITHM

Algorithm

In Prim's, we maintain a set A and add safe edges to it till we obtain an MST.
To implement this, we keep a "key" and a "parent" with every node.
Initially, the key of every node is infinity and its parent is NIL. Then we take a source vertex (the root) and set its key to 0. Iteratively, we extract the vertex u with the minimum key, add it to A, and for every vertex v adjacent to u that is still outside the tree, update v.key to min(v.key, dist(u,v)) and, on an update, set v.parent = u. This stops when all the vertices have been considered.
  
FOREACH u of G.V
       u.key = INF
       u.parent = NIL
root.key = 0
Q = G.V
while Q != empty
     u = EXTRACT-MIN (Q)
     FOREACH v adj to u
           if ( v belongs to Q and v.key > dist(u,v) )
                 v.key = dist(u,v)        // DECREASE-KEY
                 v.parent = u

At the end, the MST is : A = { (v, v.parent) : v belongs to V - {root} }
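A binary heap has no cheap DECREASE-KEY, so a common way to code the pseudocode above is the "lazy" variant sketched below (my own code): push every candidate key into the priority queue and simply skip stale entries when they are popped.

```cpp
#include <vector>
#include <queue>
#include <utility>
#include <functional>

// Lazy Prim's MST using a min-priority queue of (key, vertex) pairs.
// adj[u] holds (v, weight) pairs; returns the MST's total weight.
long long primMst(const std::vector<std::vector<std::pair<int,int>>>& adj) {
    int n = (int)adj.size();
    std::vector<bool> inMst(n, false);
    std::priority_queue<std::pair<int,int>,
                        std::vector<std::pair<int,int>>,
                        std::greater<>> pq;     // min-heap on key
    pq.push({0, 0});                            // root has key 0
    long long total = 0;
    while (!pq.empty()) {
        auto [key, u] = pq.top(); pq.pop();     // EXTRACT-MIN
        if (inMst[u]) continue;                 // stale entry, skip
        inMst[u] = true;
        total += key;
        for (auto [v, w] : adj[u])
            if (!inMst[v]) pq.push({w, v});     // candidate key for v
    }
    return total;
}
```

On the same 4-vertex example as before (edges (0,1)=1, (1,2)=2, (2,3)=3, (0,3)=4, (0,2)=5), this also returns 6.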

Complexity

If we use a binary min-heap to implement the min queue Q, building it takes O(V) time and each extraction takes O(log V).

Initialisation takes O(V) and building Q takes O(V). The while loop runs V times, each extraction taking O(log V). Across all its iterations, the inner for loop runs E times in total, and each key update (DECREASE-KEY) takes O(log V), so the for loop contributes O(E log V).

The total time is approximately O(E log V + V log V), which is O(E log V), since |E| >= |V| - 1 for a connected graph.


Tuesday 5 January 2016

Reachability #SCC #Kosaraju SPOJ : CAPCITY And TOUR

Basic Concept :

Find the nodes reachable from all other nodes or similarly if a node can reach all other nodes. We'll take the former as the standard problem. 

TOUR

So what does this question actually want?

Find all the players that can win if there is a little corruption in the committee that decides the schedule.

Consider a directed graph where the nodes are the players and the edge denotes the player that can beat the other. 

Now a node that can reach all the other nodes is a player that can beat all the other players (directly or through a chain of wins).

So the question boils down to finding the nodes that can reach all other nodes.

FINDING NODES THAT CAN REACH ALL OTHERS

Observations:

1. If a node can reach all other nodes, then all the nodes in that Strongly Connected Component (CC) can reach every node.

2. Now after finding all the CC's, find the in-degree of each CC, counting only edges coming from other CC's. Based on the number of CC's with 0 in-degree, there are 3 cases :

    2.1 No CC with 0 in-degree

           Ok, think like this : there are two CC's, A and B, and say A can reach every other CC, so A can reach B. BUT there cannot be a path from B back to A, because if there were a path from A to B and from B to A, they would be part of the same CC. So nothing can reach A, i.e. the in-degree of A has to be 0.

         By this logic, 0 in-degree is a requirement for reachability, so if no CC has 0 in-degree, no CC can reach all the others.

   2.2 More than 1 CC with 0 in-degree

         By the above logic, if there is a CC that can reach all, then that CC has 0 in-degree. Now take A and B, where A is the reacher and B is another CC with 0 in-degree.
         Now here's the problem. If A can reach all, then it should also reach B ! But B has 0 in-degree, i.e. no one can reach the nodes in that CC, not even A ! Ergo, A can't reach all.

        Hence, there can't be a CC that reaches all if there is more than 1 CC with 0 in-degree.

  2.3 Exactly 1 CC with 0 in-degree

         Assume there is exactly one CC with 0 in-degree. The rest of the CC's have in-degree greater than 0, i.e. each of them is reachable from some other CC, and following those edges backwards must eventually end at the one CC with 0 in-degree (the condensation of the graph is a DAG, so the walk can't cycle).

         What I want to say is : say there are 4 CC's and D is the one with 0 in-degree, with D reaching C, C reaching B and B reaching A. < D -> C -> B -> A >.

      Hence D can reach all. In the general case, the unique CC with 0 in-degree can reach all the others. Hence that CC is the one we are looking for !

 Summary 

 In this question, look for the CC's with 0 in-degree. If there are 0 of them or more than 1, then no such CC exists. If there is exactly one, then all the nodes of that CC can reach all the other CC's of the graph.

 CAPCITY

CAPCITY is almost exactly the same as TOUR.

ALGORITHM :

1. Apply Kosaraju's algorithm to find the strongly connected components. ( I've written a well detailed post on this algo before)

2. Calculate the indegree of each Connected Component.
        REP(i,1,n) {                        // for each node i from 1 to n
            for (auto it : graph[i]) {
                // edge i -> it crosses between two different SCCs
                if (comp[i] != comp[it]) indeg[comp[it]]++;
            }
        }

3. Find the number of CC's with 0 in-degree.

4. If more than 1 or 0, then none exist.

5. If exactly one CC is there, then print the number of nodes of that CC.
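Steps 3-5 can be sketched as one small function (my own, assuming the same hypothetical `comp[]` and `indeg[]` arrays as in step 2, but 0-indexed):

```cpp
#include <vector>

// Given each node's SCC id in comp (0-based) and the in-degree of each
// SCC in indeg, return the number of nodes in the unique 0-in-degree
// SCC, or 0 if there isn't exactly one such SCC.
int nodesInReachingScc(const std::vector<int>& comp,
                       const std::vector<int>& indeg) {
    int zeroId = -1, zeroCount = 0;
    for (int c = 0; c < (int)indeg.size(); ++c)
        if (indeg[c] == 0) { zeroId = c; ++zeroCount; }
    if (zeroCount != 1) return 0;       // no SCC can reach all the others
    int answer = 0;
    for (int id : comp)
        if (id == zeroId) ++answer;     // count the nodes in that SCC
    return answer;
}
```

For CAPCITY the returned count is the answer directly; for TOUR, the nodes of that SCC are the players who can win.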