# The Bipartite Matching Problem (Part I)

I was reading about covering-packing dualities in graphs and came across Minimum Vertex Cover and its packing dual, Maximum Matching. I thought I’d write about the Bipartite Matching Problem here; that way I’d remember it more clearly too. This was one of the first problems that helped me understand the richness of the process of algorithm design. That’s why this post is split into parts, each of which represents a significant step in the design of the algorithm. I’ll start (in this one) with a short description of the background required. Part II will describe the problem itself. Part III will describe the algorithm used to solve it, and I will conclude in the last part with a somewhat more detailed analysis of the algorithm.

So let’s begin…
Often we use graphs in which the edges are “pipes” that carry “liquid” and the nodes are junctions where the pipes are joined together. Network graphs of this type have capacities on the edges, signifying the maximum amount of liquid each edge can carry at any given moment; source nodes that generate the traffic; target nodes that act as destination points; and, of course, the liquid or traffic itself. Formally, a Flow Network is a directed graph $G = (V, E)$ where

• Each edge $e$ is associated with a capacity, a non-negative number denoted by $c_{e}$.
• There is a single source node $s \in V$.
• There is a single target node $t \in V$.

An s-t flow is a function $f$ that maps each edge $e$ to a non-negative real number, $f : E \rightarrow \mathbf{R}^{+}$, where the value $f(e)$ represents the amount of flow carried by edge $e$. Any flow $f$ must also satisfy the following two conditions:

• (Capacity Constraints) For each edge $e\in E$ we have, $0\leq f(e) \leq c_{e}$.
• (Conservation constraints) For each node $v$ other than $s$ and $t$, we have $\sum_{e \text{ into } v} f(e) = \sum_{e \text{ out of } v} f(e)$.

The value of the flow is denoted by $v(f)$ and is defined to be the amount of flow generated at the source: $v(f) = \sum_{e \text{ out of } s} f(e)$. The Maximum Flow problem, given a flow network, is to find a flow of maximum possible value by arranging the traffic so as to make as efficient use as possible of the available capacities.
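As a small illustration of this definition, the sketch below computes $v(f)$ by summing the flow on edges leaving the source. The function name `flow_value`, the edge-dictionary representation, and the example flow are assumptions of mine, not part of the formal definition.

```python
def flow_value(flow, s):
    """Return v(f): the total flow on edges leaving the source s."""
    return sum(f for (u, _), f in flow.items() if u == s)

# Hypothetical flow on a five-edge network with source "s" and target "t".
flow = {("s", "a"): 2, ("s", "b"): 1, ("a", "b"): 1,
        ("a", "t"): 1, ("b", "t"): 2}

value = flow_value(flow, "s")
# Because this flow conserves flow at "a" and "b", the same amount
# arrives at the target.
into_t = sum(f for (_, v), f in flow.items() if v == "t")
print(value, into_t)  # 3 3
```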

Now, there are two major algorithms for solving the max-flow problem, namely Ford-Fulkerson and Preflow-Push. We need these to solve our original problem. If you are already familiar with them, you can proceed directly to the next part. Otherwise, I suggest you go over them.
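If you want something to experiment with while reading up on them, here is a rough sketch of the augmenting-path idea behind Ford-Fulkerson. It is only an illustration under my own assumptions: it picks augmenting paths with breadth-first search (the variant usually called Edmonds-Karp), it assumes `s` and `t` are distinct, and the function name `max_flow` and the example network are hypothetical.

```python
from collections import deque

def max_flow(capacity, s, t):
    """Max-flow value via shortest augmenting paths (assumes s != t).

    capacity is a dict keyed by directed edge (u, v).
    """
    # Residual capacities, with a reverse entry for every edge.
    residual = {s: {}, t: {}}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for an s-t path with spare capacity.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in residual[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total  # no augmenting path left

        # Find the bottleneck capacity along the path found.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u

        # Push the bottleneck amount along the path.
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck  # use up forward capacity
            residual[v][u] += bottleneck  # allow the flow to be undone
            v = u
        total += bottleneck

# Hypothetical example network with source "s" and target "t".
capacity = {("s", "a"): 2, ("s", "b"): 1, ("a", "b"): 1,
            ("a", "t"): 1, ("b", "t"): 2}
print(max_flow(capacity, "s", "t"))  # 3
```

The reverse entries in `residual` are what let a later path cancel flow pushed by an earlier one, which is the key idea to look for when you read about the residual graph.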