Maximum Flow Algorithms
Source file:  mf.c


Consider a directed network, G, consisting of m edges and n vertices, where
each edge e in the network has a capacity c(e).  A capacity c(e) specifies the
maximum amount of flow f(e) that an edge e allows.  This can be viewed as a
network of pipes through which fluid flows.  For the maximum flow
problem, the flow originates from a source vertex, s, and flows through the
network to a sink vertex, t.  An assignment of flows to edges is legal if it
obeys edge capacities and conserves the flow through the network.  For a legal
assignment of edge flows we have the following restrictions:

  1.  For every edge, e, in G, 0 <= f(e) <= c(e).
  2.  The flow through the network is conserved, so the flow going out vertex
      s must equal the flow going into vertex t.
  3.  For any vertex, v, not equal to s or t, the flow going into v equals the
      flow going out of v.  

The total flow, F, is the total amount of flow between s and t, going through
the network.  In other words, F is equal to the total amount of flow out of s,
or equivalently, the total amount of flow into t.  The maximum flow problem
involves computing a legal assignment of flows to all edges in the network such
that the total flow, F, is maximum.


The Ford and Fulkerson Algorithm
--------------------------------

The Ford and Fulkerson algorithm requires that the edge flows on the path be
limited to integers, which is suitable for most applications.  In Ford and
Fulkerson's maximum flow algorithm, the concept of an augmenting path is used
to increase the flow.  An augmenting path consists of a series of `useful'
edges which define a path through G that connects s and t.  A `useful' edge e
connecting vertex v to vertex u is defined as follows:

      1.  A forward edge v --e--> u, where f(e) < c(e).
          That is, the flow through e can be increased by d(e) = c(e) - f(e).

  or  2.  A backward edge v <--e-- u, where f(e) > 0.
          That is, the flow through e can be decreased by d(e) = f(e).
          In directed terms the flow from u to v can be decreased, but the flow
          from v to u (which is negative) can be increased.

Note that the augmenting path allows edges of either direction to be used, as
long as the edges are useful.  For example, an augmenting path of useful edges
may look like this:

   s ---> v1 ---> v2 <--- v3 ---> v4 <--- v5 <--- v6 ---> t


The algorithm begins with some legal assignment of edge flows to the network.
For example, all edge flows can initially be zero.  The algorithm is described
below:

   1. Perform a search starting at s to find an augmenting path to t.
      Typically, this is a depth first-search using the useful edges in the
      network.
   2. If no augmenting path was found then the current flow is maximum. stop.
   3. Otherwise the flow through edges on the path is increased as follows:
       3.1.  Compute l = min(d(e1), d(e2), ..., d(ek)) where the augmenting
             path consists of the edges e1, e2, ..., ek.
       3.2.  For each edge e on the augmenting path
                - if e is forward then f(e) = f(e) + l.
                - if e is backward then f(e) = f(e) - l.
   4.  Goto step 1.


The algorithm first finds an augmenting path.  If no augmenting path can be
found, then it is not possible to increase the flow between s and t, which
means that the flow must be maximum, and the algorithm stops.  In step 3.1 the
maximum amount of increase in flow that the augmenting path allows is computed.
Then, in step 3.2 the flow through edges on the augmenting path is increased.
The algorithm then tries to search for another augmenting path.

If the edge flows are always integers, Ford and Fulkerson's algorithm adds and
subtracts but never divides.  On each cycle, the total flow, F, will be
increased by at least 1.  Since there is an upper bound on the total flow, the
algorithm will terminate, but it is difficult to express the time complexity of
the algorithm.  Consider the network below:


         (a)          
        ^ | \
      M/  |  \M
      /   |   >
    (s)  1|   (t)
      \   |   ^
      M\  |  /M
        > v /
         (b)

Assume edge cost M is a very large integer.  The algorithm can alternate
between the augmenting paths s-a-b-t and s-b-a-t, requiring 2M augmentations
before the maximum flow is computed.  This problem can be overcome by using
breath first search to find the augmenting paths, and always using a shortest
augmenting path.  This allows the algorithm to terminate in O(mn^3) time.


The Dinic Algorithm
-------------------

Like the Ford and Fulkerson algorithm, the Dinic algorithm uses augmenting
paths to improve the flow in the network.  However, the number of augmentations
required by the Dinic algorithm is limited by the number of vertices and edges
in the graph.  This allows the Dinic algorithm to have a worst case time
complexity of O(mn^2).

The basic process of the Dinic algorithm is outlined below.  The Dinic
algorithm uses the same concept of `useful' edges that was discussed in the
description of the Ford and Fulkerson algorithm:

 1.  The Dinic algorithm constructs a `layered network' representing the
     `useful' edges between s and t.  The `layered network' concept is
     described in detail below.
 2.  If a layered network of `useful' edges between s and t could not be
     constructed then the total flow is maximum; stop.
 3.  Edge flows in the layered network are updated using augmenting paths to
     give a `maximal' flow in the layered network.
 4.  The flows in the layered network are then used to update the flows in the
     main network.
 5.  Repeat from step 1.


The construction of the `layered network' can be described simply as follows:

    - Initially, set V(0) contains vertex s.  Layer 1 uses `useful edges'
      connected to vertex s, and set V(1) contains the vertices reached.

    - In general:  Layer i+1 uses all useful edges connecting vertices in set
                   V(i) to vertices which are not in sets V(0), V(1), ...,
                   V(i).  That is set V(i+1) consists of unvisited vertices
                   that are reached using useful edges off vertices in V(i).

    - For the final layer, layer k, set V(k) contains vertex t.

The source file mf.c represents the layered network using a simple directed
graph structure, with the same vertex numbering as used in the main network.
The directed edges in the layered network all point from set V(i) to set
V(i+1).  This simplifies the code of the augmenting path search algorithm.  The
layered network is then built up as follows.

  1.  Initially, set V(0) contains vertex s, and i = 0.  Mark s as "visited"
      and all other vertices as "unvisited".

  2.  Directed edges (u,v) corresponding to useful edges between vertices u in
      V(i) and unvisited vertices v are added to the layered network.  All such
      vertices v are added to set V(i+1).

      Recall that a useful edge, e, may point in either direction;  that is,
      u--e-->v or u<--e--v.  As in the Ford and Fulkerson algorithm, a useful
      edge e has an associated value d(e) which is the maximum amount of
      increase in flow that the edge allows.

      For a useful edge, e, the directed edge (u,v) added to layered network is
      assigned a capacity c'(e)=d(e), and a flow f'(e)=0.  Edge (u,v) also has
      a pointer field, copy, which points to edge e in the main network, and a
      boolean field, reverse, which records the direction of edge e. 

  3.  Increment i.

  4.  If set V(i) is empty, then the current flow is maximum, halt.

  5.  If set V(i) contains t, then the final layer was constructed, halt.

  6.  Return to step 2.


Note that some edges in the layered network may be useless since they lead to
dead ends and won't be involved in an augmenting path to t.  However, this will
not affect the correctness or time complexity of the search for augmenting
paths in the layered network.  Such edges could be removed, but this would take
more time than it does to just traverse them during the search for augmenting
paths.

The search for augmenting paths in the layered network is simpler, and much
more efficient than the search for augmenting paths in the Ford and Fulkerson
algorithm.  Edges in the layered network have a boolean field, blocked, which
marks the edge as either "blocked" or "unblocked".  Initially all edges are
marked as "unblocked".  Unblocked edges are then traversed in a depth first
search to locate an augmenting path.  When a the search reaches a dead-end and
back-tracks over an edge, that edge is marked as "blocked".  When an augmenting
path is found, edge flows f'(e) are updated.  If a flow f'(e) reaches the edge
capacity c'(e), then that edge is marked as "blocked".  After updating flows,
the depth-first search is restarted from vertex s.

The augmenting path search, and updating of edge flows in the layered network
is described by the pseudo-code below.  In this description OUT(v) denotes the
set of outgoing edges of vertex v.  All edges in the layered network are
forward edges, and the value d'(e) in the description below is equal to
c'(e) - f'(e) and denotes the maximum amount of increase in edge flow allowed
through edge e.

   1.  Initially all edges are marked as "unblocked", and stack S is empty.
   2.  v = s.
   3.  Search for the next unblocked edge in OUT(v).
   4.  If no unblocked edge exists then
         4.1.  If S is empty, then the present flow is maximal, halt.
         4.2.  Pop the edge u--e-->v of S, and mark e as "blocked".
         4.3.  Resume searching from Step 3 with v = u.
   5.  Else
         5.1.  Put the unblocked edge v--e-->u onto S, and let v = u.
         5.2.  If v is not equal to t then goto Step 3.
   6.  Since v=t, the edges on s form an augmenting path.  The flow through
       edges on the path is increased as follows:
         6.1.  Compute l = min(d'(e1), d'(e2), ..., d'(ek)) where the
               augmenting path consists of the edges e1, e2, ..., ek.
         6.2.  For each edge e on the augmenting path f'(e) = f'(e) + l and if
               f'(e) == c'(e) mark e as "blocked".
         6.3.  Goto Step 2.


Once the maximal flow in the layered network has been computed, the flows of
corresponding edges in the main network are updated as follows:

    For each edge e in the layered network:
        If e->reversed == true then
             f(e->copy) = f(e->copy) - f'(e)
        Else
             f(e->copy) = f(e->copy) + f'(e)


The process
  - construct layered network (if flow is not maximum).
  - compute maximal flow.
  - update flows in the main network.
is called a `phase'.  It can be proved that the number of phases is bounded by
n since the length of the layered network must increase with each phase.  Since
the algorithm for finding a maximal flow in the layered network is O(mn), the
total time complexity of the algorithm is O(mn^2).


The MPM Algorithm
-----------------

This algorithm by Malhotra, Pramodh Kumar, and Maheshwari (MPM) has a time
complexity of O(n^3), which is better than the O(mn^2) time complexity of
Dinic's algorithm.  Like Dinic's algorithm, the MPM algorithm computes the
maximum flow by using a layered network approach:

  - construct layered network (if flow is not maximum).
  - compute maximal flow in the layered network.
  - update flows in the main network.

The layered network is constructed in exactly the same way as for the Dinic
algorithm; refer to the description for the Dinic algorithm.  The MPM algorithm
differs from Dinic in the way that the maximal flow through the layered network
is computed.

The MPM algorithm computes maximum flows by defining flow potentials for each
vertex in the layered network.  First, the concept of potential flows will be
described.  We will use the same notation for edge flow and edge capacity as
was used for Dinic's algorithm.

  - The potential of an edge, e, in the layered network is defined as
    c'(e) - f'(e); that is, the maximum amount of increase in flow that
    edge e can take.  Here c'(e) is the capacity of edge e, and f'(e) is the
    flow assigned to edge e.

  - The in-potential IP(v) of a vertex, v, in the layered network is defined as
    the, sum of the potential of all incoming edges w--e-->v.
    Expressed mathematically:
        IP(v) = Sum (over all edges w--e-->v) { c'(e) - f'(e) }

  - The out-potential OP(v) of a vertex, v, in the layered network is defined
    as the, sum of the potential of all outgoing edges v--e-->w.
    Expressed mathematically:
        IP(v) = Sum (over all edges v--e-->w) { c'(e) - f'(e) }

  - The potential P(v) of a vertex, v, in the layered network is defined as
    the smaller of IP(v) and OP(v).
    Expressed mathematically:
        P(v) = Min(IP(v), OP(v))
    The source vertex, s, and sink vertex, t, are exceptions, for which the
    potential is defined as P(s) = OP(s) and P(t) = IP(t).

In other words, a potential P(v) is the maximum increase in flow that can pass
through vertex v; based upon the maximum increase in flow allowed by the edges
connected to v.

Suppose that a vertex, v_min, is selected such that P(v_min) is minimum among
all vertices in the layered network.  Then an increase of P(v_min) in the total
flow is possible, through some flow path passing through v_min.  This increase
is possible because P(v) >= P(v_min) for all other vertices v in the layered
network, meaning that all other vertices can allow a flow increase of at least
P(v_min).  The edge flows in the network need to be updated to reflect the
increase of P(v_min) in the total flow.

Edge flows in the network are updated by pushing the additional flow from
v_min to t, and then pulling the flow to v_min from s.  If v_min happens to be
s or t, then only the pushing or the pulling of flow will occur respectively.
To push and pull flows, we define an out-flow, OF(v), and in-flow, IF(v),
for each vertex v in the layered network.  Initially OF(v) and IF(v) are zero
for all vertices except v_min, for which OF(v_min) = IF(v_min) = P(v_min).

*** Pushing Edge Flows

The flow is pushed toward t layer by layer.  A list, S, holds vertices in the
current layer that contain flow being pushed toward t, and a list S2 holds
vertices in the next layer that the flow reaches.  Initially S = {v_min} and
S2 = {}.    The algorithm proceeds by pushing flow through edges from vertices
in S to vertices in the next layer.  Vertices that the flow hits in the next
layer are added to S2.  After the flow has been pushed from all vertices in S,
the algorithm assigns S = S2 and S2 = {}.  This process is repeated until
S = {t}.

Consider a vertex, u in S.  The code below shows how the flow is pushed through
edges outgoing from u, and the in-potential and out-potential of affected
vertices updated.  

Note:  This code excludes the manipulation of S2.

1.  Take the next "unused" edge u--e-->w;
2.  p = c'(e) - f'(e);   /* Edge potential. */
3.  if OF(u) >= p then {
      /* The available out-flow will saturate this edge's capacity. */
4.      OF(u) = OF(u) - p;
5.      OF(w) = OF(w) + p;
6.      OP(u) = OP(u) - p;
7.      IP(w) = IP(w) - p;
8.      f'(e) = c'(e);
9.      Mark e as "used";       /* This edge has been saturated. */
10.     Goto 1;             /* There may still be out-flow available from u. */
11.  }
     /* else */
     /* Any available out-flow will be used up by the edge. */
13.  OF(w) = OF(w) + OF(u);
14.  OP(u) = OP(u) - OF(u);
15.  IP(w) = IP(w) - OF(u);
16.  f'(e) = f'(e) + OF(u);
     /* There is no need to update OF[u] as it is no longer needed. */ 

***

*** Pulling Edge Flows

The flow is pulled from s layer by layer.  A list S holds of vertices in the
current layer containing flow being pulled from s, and a list S2 holds vertices
in the back layer that flow was pulled from.  Initially S = {v_min} and
S2 = {}.  The algorithm proceeds by pulling flow from edges of vertices in the
back layer to vertices in S.  Vertices that the flow was pulled from are added
to S2.  After the flow has been pulled from all vertices in S, the algorithm
assigns S = S2 and S2 = {}.  This process repeats until S = {s}.

Consider a vertex, u in S.  The code below shows how the flow is pulled through
edges incoming to u, and the in-potential and out-potential of affected
vertices updated.  

Note:  This code excludes the manipulation of S2.

1.  Take the next "unused" edge w--e-->u;
2.  p = c'(e) - f'(e);   /* Edge potential. */
3.  if IF(u) >= p then {
      /* The available out-flow will saturate this edge's capacity. */
4.      IF(u) = IF(u) - p;
5.      IF(w) = IF(w) + p;
6.      IP(u) = OP(u) - p;
7.      OP(w) = IP(w) - p;
8.      f'(e) = c'(e);
9.      Mark e as "used";       /* This edge has been saturated. */
10.     Goto 1;             /* There may still be out-flow available from u. */
11.  }
     /* else */
     /* Any available out-flow will be used up by the edge. */
13.  IF(w) = OF(w) + OF(u);
14.  IP(u) = OP(u) - OF(u);
15.  OP(w) = IP(w) - OF(u);
16.  f'(e) = f'(e) + IF(u);
     /* There is no need to update OF[u] as it is no longer needed. */ 

***

Note:

  It is important that the next unused outgoing/incoming edge of any vertex can
  be selected efficiently (i.e. O(1) time).  Suppose that incoming and outgoing
  edges of vertices in the layered network are stored in linked lists.  For
  each vertex in the layered network the algorithm maintains pointers to the
  next unused incoming edge and the next unused outgoing edge.  After an edge
  becomes used, the pointer is updated in O(1) time to point to the next edge
  in the linked list.

The basic process of the MPM algorithm is summarised below.  Any vertex that
has, or ends up with, a potential of zero, can be removed from the layered
network.  There may be dead-end paths in the layered network, which were there
at the start, or have resulted after the removal of a vertex.  Vertices at the
end of a dead-end path have a potential of zero.  This will cause any dead-end
paths that cannot reach s or t to be removed from the layered network.  When a
vertex with a non-zero potential is selected, it is guaranteed that there are
no dead-ends in the network.  Thus, all the pushed/pulled flow eventually
arrives at the sink/source vertices.

1.  calculate P(v)=Min(IP(v),OP(v));
2.  select the minimum vertex, v_min;
3.  if P(v_min) == 0 then {
4.      if v_min == s or v_min == t then halt.  /* Flow is maximal. */
5.      remove(v_min);
6.      goto 1.
7.  }
8.  Push/Pull flow P(v_min) from/to v_min.  /* Update edge flows. */
9.  remove(v_min);
10. goto 1;

The function remove(v) removes a vertex from the layered network, and updates
the potential of neighbouring vertices to reflect this.  This may result in
some neighbouring vertices potential going to zero, which will in-turn cause
them to be removed.

1.  remove(v) {
2.      while there exists a next unused edge u--e-->w do {
3.          IP(w) = IP(w) - (c'(e) - f'(e));
4.          mark e as used;
5.      }
6.      while there exists a next unused edge w--e-->u do {
7.          OP(w) = OP(w) - (c'(e) - f'(e));
8.          mark e as used;
9.      }
10.     Remove v from the layered network.
11. }

Since edges are marked as blocked, it is sufficient to just mark removed
vertices as used, and ignore used vertices when selecting the minimum vertex.
This keeps the implementation of edge lists simple.  In order to actually
remove vertices and edges, double-linked lists would be need to be used.

Eventually the potential of the source vertex, s, or the sink vertex, t,
becomes zero.  In this situation, the flow is maximal because the source or
sink is not able to accommodate an increase in flow.  In the MPM algorithm,
s or t will eventually be selected as v_min with P(v_min) = 0.  Suppose that
P(v_min) was non-zero when v_min was equal to s or t.  Then the removal of
either s or t, eventually causes the potential of the other to become zero
after removing every vertex in the graph.

The time complexity when computing the maximal flow can be broken down as
follows:

 - n x [Time for locate-min]
     
 - n x [At most n vertices involved in push/pull]
        Note:  For each vertex involved in push/pull (excluding the minimum
               vertex) there is at most one edge access without saturating the
               edge.

               That is, only the last edge visited may be left unsaturated.

 - m edge saturations
        Note each edge is saturated only once.

So we have m + n^2 + n x [time for locate-min] = O(n^2), where the time allowed
for locate-min can be at most O(n).  Thus, locate-min can be implemented easily
by scanning a one-dimensional array of P(v) values, spending O(n) time.

Although locating the minimum P(v) in this way seems inefficient, the O(n) time
complexity is still contained within the time complexity for pushing and
pulling flows, so will not impact upon the overall time complexity.  Holding
P(v) values in an unordered one dimensional array has the advantage that there
is no time overhead associated with decreasing P(v) values.  In comparison, a
sophisticated priority queue data structure, such as a Fibonacci heap, would
lower the time for locating the minimum P(v), but at least have have some
constant time overhead associated with the O(1) time complexity needed for
decreasing P(v) values.

With O(n^2) worst case time being required for computing the maximal flow, the
maximum flow can be computed in O(n^3) time since the number of phases (same
as for Dinic algorithm) is bounded by n.


The Karzanov Algorithm
----------------------

The Karzanov algorithm has the same O(n^3) time complexity as the MPM
algorithm, but is more difficult to describe.  It takes a different approach
when computing the maximal flow in the layered network.  The basic idea is to
force as much flow as possible, from the source vertex s, through the edges of
the layered network, to the sink vertex t.  This leaves some vertices v with
excess flow.  Excess flow occurs where the flow arriving through incoming edges
of v exceeds the capacity available for flow on outgoing edges of v.  To
correct this, the algorithm must rebalance any excess flow by pushing it back
one layer, and attempting to force it forward again.  Incoming edges of
saturated vertices are blocked during rebalancing, so that the excess flow will
not be pushed through them again.  The rebalance/push process is repeatedly
applied at the forward-most layer with excess flow.  Any remaining excess flow
that cannot be pushed forward will slowly make its way backward through the
layered network and eventually arrive back at the source vertex s.  This leaves
a maximal flow in the layered network.