Month: November 2013

The Bipartite Matching Problem (Part II)

Now that we are acquainted with the necessary background, lets try and formulate the problem with enough mathematical precision so that we can ask concrete questions and start thinking about algorithms to solve it. We set up the Bipartite Matching Problem as follows:

Let X {=} \{x_{1}, x_{2},...,x_{n}\} and Y {=} \{y_{1}, y_{2},...,y_{n}\} be two sets and let their Cartesian product X \times Y , represent the set of all possible ordered pairs of the form (x,y) , where x\in X and y\in Y . Then, a matching S can be defined as a set of ordered pairs, each from X \times Y with the property that each member of X and each member of Y appears in at most one pair in S. A perfect matching is a set S’ such that each member of X and each member of Y appear in exactly one pair in S’.

Recall the definition of a bipartite graph  G = (V, E). It is an undirected graph whose vertex set can be partitioned as V = X \cup Y with the property that every edge e \in E has one vertex in X and one vertex in Y. Consequently, in the case of graph, the edges can be viewed as pairs of nodes and we say that a matching in a graph G = (V, E), is a set of edges M \subseteq E with the property that each vertex appears in at most one edge of M. A set of edges M is called a perfect matching if each vertex of the graph appears in exactly one edge of M.
The Bipartite Matching Problem is that of finding a matching in G of the largest possible of size.

As a side note, the Stable Matching Problem and the Gale-Shapley algorithm proposed to solve it, is particularly interesting too. It is in fact the original version of the current problem.


The Bipartite Matching Problem (Part I)

I was reading about Covering-Packing Dualities that have been found in graphs and I came across the Minimum Vertex Cover and its packing dual, Maximum Matching. I thought I’d write about the Bipartite Matching Problem here. That way I’d remember it more clearly too. This was one of the first problems I read that allowed me understand the richness of the process of algorithm design. That’s why this post is in parts each of which represents the significant steps in the design of the algorithm. I’ll start (in this one) with a short description of the background required. Part II will be a description of the problem itself. Part III will describe the algorithm used to solve it and I will conclude in the last part with a slightly detailed analysis of the algorithm itself.

So lets begin…
Often we use graphs wherein the edges are “pipes” that carry “liquid” and the nodes are junctures where the pipes are joined together. Network graphs of this type have components such as capacities on the edges signifying the maximum liquid that they can carry at any given moment, source nodes that generate the traffic, target nodes that act as the destination points and of course, the liquid or traffic itself.  Formally, a Flow Network is a directed graph  G {=} (V, E) where

  • Each edge ‘e’ is associated with a capacity which is a non-negative number denoted by c_{e} .
  • There are k pairs of source-target nodes \{(s_{i}, t_{i})\}_{i} \in V .
  • There is a single target node t \in V.

An s-t flow is defined as a function f that maps each edge e to a non-negative real number, f : E \rightarrow \textbf{R}^{+} and the value f(e) represents the amount of flow carried by edge e. Any flow f must also satisfy the following two conditions:

  • (Capacity Constraints) For each edge e\in E we have, 0\leq f(e) \leq c_{e} .
  • (Conservation constraints) For each node v, we have $latex \sum_{\text{e into v}} f(e)\text{ } {=} \sum_{\text{e out of v}} f(e) $

The value of the flow is denoted by v(f) and is defined to be the amount of flow generated at the source: v(f)\text{ } {=} \sum_{\text{e out of s}} f(e) .  The Maximum Flow problem, given a flow network, is to find a flow of maximum possible value by arranging the traffic so as to make as efficient use as possible of the available capacities.

Now there exist a two major algorithms to solve the max-flow problem namely, Ford-Fulkerson and PreFlow Push. We need these to solve our original problem. If you are already familiar with them, you can proceed directly to the next part. Otherwise, I suggest you go over them.